Overview
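An introduction to the basics of regular expressions. This post summarizes commonly used syntax based on ERE (Extended Regular Expressions). The shorthand classes (\d, \w, \s) and the lazy quantifiers (*?, +?, ??) are Perl-style extensions that POSIX ERE itself does not define, although most modern engines support them.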
Character Classes
[Characters]
- Matches any single character within the brackets.
- Ex:
- Pattern: [きつね]
- きつねたぬきねこ → き, つ, ね, き, ね
[^Characters]
- Matches any single character not within the brackets.
- Ex:
- Pattern: [^きつね]
- きつねたぬきねこ → た, ぬ, こ
[Character-Character]
- Matches any single character within the specified range.
- Ex:
- Pattern: [あ-ん]
- きつねたぬきcat → き, つ, ね, た, ぬ, き
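The three bracket forms above can be checked with a short script. This is a minimal sketch that assumes Python's standard `re` module as the engine (the post itself does not name one); the variable names are only illustrative.

```python
import re

text = "きつねたぬきねこ"

# Any single character listed inside the brackets.
in_set = re.findall(r"[きつね]", text)

# Any single character NOT listed inside the brackets.
not_in_set = re.findall(r"[^きつね]", text)

# Any single character in the range あ..ん; the Latin letters fall outside it.
in_range = re.findall(r"[あ-ん]", "きつねたぬきcat")

print(in_set)      # ['き', 'つ', 'ね', 'き', 'ね']
print(not_in_set)  # ['た', 'ぬ', 'こ']
print(in_range)    # ['き', 'つ', 'ね', 'た', 'ぬ', 'き']
```

`re.findall` returns every non-overlapping match from left to right, which is why each result is a list of single characters.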
\d
- Matches any decimal digit.
- Ex:
- Pattern: \d
- りんごが10個 → 1, 0
\D
- Matches any character that is not a decimal digit.
- Ex:
- Pattern: \D
- りんごが10個 → り, ん, ご, が, 個
\w
- Matches any alphanumeric character or underscore.
- Ex:
- Pattern: \w
- abc_* → a, b, c, _
\W
- Matches any character that is not alphanumeric or an underscore.
- Ex:
- Pattern: \W
- abc_* → *
\s
- Matches any whitespace character.
- Ex:
- Pattern: \s
- a b c → two matches: the space between a and b, and the space between b and c
\S
- Matches any non-whitespace character.
- Ex:
- Pattern: \S
- a b c → a, b, c
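The shorthand classes can be tried out the same way. The sketch below again assumes Python's `re` module. One caveat specific to that assumption: for `str` patterns, Python's `\d`, `\w`, and `\s` are Unicode-aware by default, so `\w` also matches characters such as き; passing `re.ASCII` restricts them to the ASCII definitions given above.

```python
import re

digits = re.findall(r"\d", "りんごが10個")
non_digits = re.findall(r"\D", "りんごが10個")

word_chars = re.findall(r"\w", "abc_*")
non_word_chars = re.findall(r"\W", "abc_*")

spaces = re.findall(r"\s", "a b c")
non_spaces = re.findall(r"\S", "a b c")

print(digits)          # ['1', '0']
print(non_digits)      # ['り', 'ん', 'ご', 'が', '個']
print(word_chars)      # ['a', 'b', 'c', '_']
print(non_word_chars)  # ['*']
print(spaces)          # [' ', ' ']
print(non_spaces)      # ['a', 'b', 'c']
```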
Anchors
^
- Matches the beginning of a line.
- Ex:
- Pattern: ^ありがとう
- ありがとう友よ ○
- 昨日はありがとう ✗
$
- Matches the end of a line.
- Ex:
- Pattern: ありがとう$
- ありがとう友よ ✗
- 昨日はありがとう ○
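Both anchors can be tested against the two example lines. This sketch assumes Python's `re` module. Under that assumption `^` and `$` anchor to the whole string by default, and the `re.MULTILINE` flag is what makes them apply to each line of multi-line text; the `two_lines` string is a made-up input to show the difference.

```python
import re

lines = ["ありがとう友よ", "昨日はありがとう"]

# True where the pattern is found, one entry per line.
at_start = [bool(re.search(r"^ありがとう", line)) for line in lines]
at_end = [bool(re.search(r"ありがとう$", line)) for line in lines]

print(at_start)  # [True, False]
print(at_end)    # [False, True]

# Multi-line input: without re.MULTILINE, ^ only matches at the very start.
two_lines = "昨日は\nありがとう"
default_hits = re.findall(r"^ありがとう", two_lines)
multiline_hits = re.findall(r"^ありがとう", two_lines, re.MULTILINE)

print(default_hits)    # []
print(multiline_hits)  # ['ありがとう']
```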
Grouping Constructs
(Subexpression)
- Groups the subexpression so that a following quantifier applies to the group as a whole, and captures the substring the group matches.
- Ex:
- Pattern: (りり){2}
- ありりりりがとう ○
- ありりがとう ✗
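To see both the grouping and the capturing, here is a small sketch that assumes Python's `re` module. Note that when a group is repeated, Python keeps only the text from the group's last repetition in `group(1)`.

```python
import re

pattern = re.compile(r"(りり){2}")

hit = pattern.search("ありりりりがとう")
miss = pattern.search("ありりがとう")

whole_match = hit.group(0)  # everything the pattern matched
captured = hit.group(1)     # text from the last repetition of the group

print(whole_match)  # りりりり
print(captured)     # りり
print(miss)         # None
```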
Quantifiers
*
- Matches the preceding element zero or more times (greedy).
- Ex:
- Pattern: ab*
- ab → ab
- abab → ab, ab
- aabb → a, abb
- abbb → abbb
- a → a
- ba → a
+
- Matches the preceding element one or more times (greedy).
- Ex:
- Pattern: ab+
- ab → ab
- abab → ab, ab
- aabb → abb
- abbb → abbb
- a → no match
- ba → no match
?
- Matches the preceding element zero or one time (greedy).
- Ex:
- Pattern: ab?
- ab → ab
- abab → ab, ab
- aabb → a, ab
- abbb → ab
- a → a
- ba → a
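The greedy quantifiers are easiest to compare side by side on the same inputs. This is a minimal sketch assuming Python's `re` module; `all_matches` is a hypothetical helper written only for this comparison.

```python
import re

inputs = ["ab", "abab", "aabb", "abbb", "a", "ba"]


def all_matches(pattern, texts):
    """Map each input text to the list of non-overlapping matches."""
    return {text: re.findall(pattern, text) for text in texts}


star = all_matches(r"ab*", inputs)      # zero or more b
plus = all_matches(r"ab+", inputs)      # one or more b
optional = all_matches(r"ab?", inputs)  # zero or one b

for text in inputs:
    print(text, star[text], plus[text], optional[text])
```

On `aabb`, for instance, `ab*` gives `a` and `abb`, `ab+` gives only `abb`, and `ab?` gives `a` and `ab`: each quantifier takes as many `b` characters as it is allowed to.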
*?
- Matches the preceding element zero or more times (lazy).
- Ex:
- Pattern: ab*?
- ab → a
- abab → a, a
- aabb → a, a
- abbb → a
- a → a
- ba → a
+?
- Matches the preceding element one or more times (lazy).
- Ex:
- Pattern: ab+?
- ab → ab
- abab → ab, ab
- aabb → ab
- abbb → ab
- a → no match
- ba → no match
??
- Matches the preceding element zero or one time (lazy).
- Ex:
- Pattern: ab??
- ab → a
- abab → a, a
- aabb → a, a
- abbb → a
- a → a
- ba → a
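The lazy variants can be checked against the same inputs. This sketch assumes Python's `re` module, which supports the `*?`, `+?`, and `??` forms; `all_matches` is again a hypothetical helper.

```python
import re

inputs = ["ab", "abab", "aabb", "abbb", "a", "ba"]


def all_matches(pattern, texts):
    """Map each input text to the list of non-overlapping matches."""
    return {text: re.findall(pattern, text) for text in texts}


lazy_star = all_matches(r"ab*?", inputs)      # as few b as possible, minimum 0
lazy_plus = all_matches(r"ab+?", inputs)      # as few b as possible, minimum 1
lazy_optional = all_matches(r"ab??", inputs)  # prefers zero b over one

for text in inputs:
    print(text, lazy_star[text], lazy_plus[text], lazy_optional[text])
```

Because nothing follows the quantified `b` in these patterns, a lazy quantifier always settles for its minimum: `ab*?` and `ab??` never consume a `b`, and `ab+?` consumes exactly one.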
{n}
- Matches the preceding element exactly n times.
- Ex:
- Pattern: b{2}
- abba → bb
{n,}
- Matches the preceding element n or more times.
- Ex:
- Pattern: b{2,}
- abbba → bbb
{n,m}
- Matches the preceding element at least n and at most m times.
- Ex:
- Pattern: b{2,4}
- abbba → bbb
- abbbba → bbbb
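The three counted forms can be verified together. This is a minimal sketch assuming Python's `re` module; the last input, with six `b` characters, is an extra case that is not in the examples above, added to show what happens when the run is longer than the upper bound.

```python
import re

exactly_two = re.findall(r"b{2}", "abba")
two_or_more = re.findall(r"b{2,}", "abbba")
two_to_four_short = re.findall(r"b{2,4}", "abbba")
two_to_four_full = re.findall(r"b{2,4}", "abbbba")

# Six b's: the first match takes the maximum of four, the rest forms a second match.
two_to_four_long = re.findall(r"b{2,4}", "abbbbbba")

print(exactly_two)        # ['bb']
print(two_or_more)        # ['bbb']
print(two_to_four_short)  # ['bbb']
print(two_to_four_full)   # ['bbbb']
print(two_to_four_long)   # ['bbbb', 'bb']
```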
Alternation Constructs
|
- Matches any one of the separated alternatives.
- Ex:
- Pattern: ab|cd
- abcd → ab, cd
- aaccd → cd
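A short sketch of alternation, assuming Python's `re` module. The second half shows that `|` separates whole alternatives, so `ab|cd` means "ab or cd"; limiting the choice to a single position requires a group. The input `"abd acd abcd"` is made up for that illustration.

```python
import re

both = re.findall(r"ab|cd", "abcd")
only_cd = re.findall(r"ab|cd", "aaccd")

print(both)     # ['ab', 'cd']
print(only_cd)  # ['cd']

# With a group, the alternation covers only the middle character: a, then b or c, then d.
# finditer + group(0) is used so the whole match is returned rather than the group's text.
grouped = [m.group(0) for m in re.finditer(r"a(b|c)d", "abd acd abcd")]

print(grouped)  # ['abd', 'acd']
```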