Regex Syntax

Anchors

^

Start of string (or line in multiline mode).

/^hello/ → matches hello at the start

$

End of string (or line in multiline mode).

/world$/ → matches world at the end

\b

Word boundary — position between a word character and a non-word character, or between a word character and the start or end of the string.

/\bcat\b/ → matches cat as a whole word, not the cat inside concatenate

\B

Non-word boundary — the inverse of \b.

/\Bcat\B/ → matches the cat inside concatenate, not the standalone word cat
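A minimal sketch of the four anchors, assuming a JavaScript engine (the flavor implied by the flags and lastIndex described later on this page). The variable names are illustrative only.

```javascript
// Anchors match positions, not characters.
const startsWithHello = /^hello/.test("hello world"); // true
const endsWithWorld = /world$/.test("hello world");   // true
const midHello = /^hello/.test("say hello");          // false

// \b needs a word/non-word transition on each side of "cat".
const wholeWord = /\bcat\b/.test("the cat sat");      // true
const insideWord = /\bcat\b/.test("concatenate");     // false

// \B needs the absence of a boundary on each side.
const embedded = /\Bcat\B/.test("concatenate");       // true
const standalone = /\Bcat\B/.test("cat");             // false
```

Because anchors consume nothing, they can be combined freely with the patterns they surround.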

Character Classes

.

Any character except line terminators such as \n and \r (unless the s/dotAll flag is set).

/c.t/ → matches cat, cut, c3t

[abc]

Character set — matches any one of the listed characters.

/[aeiou]/ → matches any lowercase vowel

[^abc]

Negated character set — matches any character NOT in the set.

/[^aeiou]/ → matches any character that is not a lowercase vowel, including digits and spaces

[a-z]

Character range — matches any character between a and z (inclusive).

/[a-z]/ → matches any lowercase letter

\d

Digit — equivalent to [0-9].

/\d+/ → matches 42, 007

\D

Non-digit — equivalent to [^0-9].

/\D+/ → matches abc, hello

\w

Word character — equivalent to [a-zA-Z0-9_].

/\w+/ → matches hello_world

\W

Non-word character — equivalent to [^a-zA-Z0-9_].

/\W+/ → matches spaces and punctuation

\s

Whitespace — space, tab, vertical tab, newline, carriage return, form feed, and other Unicode whitespace.

/\s+/ → matches runs of spaces, tabs, and line breaks

\S

Non-whitespace — any character that isn't whitespace.

/\S+/ → matches non-space tokens
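The classes above can be tried out with String.prototype.match and split. This is a sketch assuming JavaScript semantics; the sample strings and variable names are made up for illustration.

```javascript
const text = "Order 42 shipped to 7 Elm St.";

// \d+ collects each run of digits; the g flag returns every match.
const numbers = text.match(/\d+/g);                  // ["42", "7"]

// \w+ collects runs of letters, digits, and underscores.
const words = "hello_world, foo-bar".match(/\w+/g);  // ["hello_world", "foo", "bar"]

// A negated set matches anything not listed, including digits and spaces.
const nonVowels = "a1 e".match(/[^aeiou]/g);         // ["1", " "]

// \s+ used as a delimiter splits on any run of whitespace.
const fields = "a \t b\n c".split(/\s+/);            // ["a", "b", "c"]

// . does not match "\n" unless the s flag is set.
const dotNoNewline = /c.t/.test("c\nt");             // false
```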

Quantifiers

*

Zero or more — greedy, matches as many as possible.

/ab*/ → matches a, ab, abb, abbb

+

One or more — greedy.

/ab+/ → matches ab, abb (not a)

?

Zero or one — makes the preceding token optional.

/colou?r/ → matches color and colour

{n}

Exactly n repetitions.

/\d{4}/ → matches exactly 4 digits

{n,}

n or more repetitions.

/\d{2,}/ → matches 2 or more digits

{n,m}

Between n and m repetitions (inclusive).

/\d{2,4}/ → matches 2, 3, or 4 digits

*?

Lazy zero or more — matches as few as possible.

/<.*?>/s → matches the shortest tag

+?

Lazy one or more.

/\d+?/ → matches a single digit at a time (1 in 123)

??

Lazy zero or one.

/colou??r/ → matches color and colour like /colou?r/, but tries the version without u first
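The difference between greedy and lazy quantifiers is easiest to see side by side. A small sketch, assuming JavaScript; the sample input is invented for illustration.

```javascript
const html = "<b>bold</b>";

// Greedy: .* runs to the last ">" it can reach.
const greedy = html.match(/<.*>/)[0];                 // "<b>bold</b>"

// Lazy: .*? stops at the first ">" that completes the match.
const lazy = html.match(/<.*?>/)[0];                  // "<b>"

// {n,m} bounds the repetition; within the bounds it is still greedy.
const bounded = "123456".match(/\d{2,4}/)[0];         // "1234"

// +? needs at least one repetition and takes no more than necessary.
const minimal = "123".match(/\d+?/)[0];               // "1"

// ?? tries to skip the optional token first, but still uses it when required.
const lazyOptional = "colour".match(/colou??r/)[0];   // "colour"
```

Laziness changes which match is found when several are possible; it does not change whether a match exists.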

Groups & Capturing

(abc)

Capturing group — captures matched text and assigns a numbered backreference.

/(\w+)@(\w+)/ → captures username and domain separately

(?:abc)

Non-capturing group — groups without capturing. Useful for applying quantifiers without saving the match.

/(?:ab)+/ → matches ababab without capturing

(?<name>abc)

Named capturing group — like a capturing group but accessed by name.

/(?<year>\d{4})-(?<month>\d{2})/ → groups.year, groups.month

\1

Backreference — matches the same text captured by group 1.

/(\w+) \1/ → matches repeated words like the the

\k<name>

Named backreference — matches the text captured by the named group.

/(?<q>['"]).*?\k<q>/ → matches text wrapped in a matching pair of quotes

a|b

Alternation — matches either a or b.

/cat|dog/ → matches cat or dog
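To show how captured text is read back, here is a short sketch assuming JavaScript's match array and its groups property. The inputs and variable names are hypothetical.

```javascript
// Numbered groups appear at indices 1, 2, ... of the match array.
const email = "alice@example".match(/(\w+)@(\w+)/);
const user = email[1];                                // "alice"
const domain = email[2];                              // "example"

// A non-capturing group adds no entry: only the full match is stored.
const noCaptureLength = "ababab".match(/(?:ab)+/).length;   // 1

// Named groups are exposed on match.groups.
const date = "2024-03".match(/(?<year>\d{4})-(?<month>\d{2})/);
const { year, month } = date.groups;                  // "2024", "03"

// \1 must repeat exactly what group 1 captured.
const repeated = /(\w+) \1/.test("the the");          // true
const different = /(\w+) \1/.test("the cat");         // false

// \k<q> requires the closing quote to equal the opening one.
const quoted = `say "it's" now`.match(/(?<q>['"]).*?\k<q>/)[0];  // "\"it's\""
```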

Lookahead & Lookbehind

(?=abc)

Positive lookahead — asserts that what follows matches abc, without consuming it.

/\d+(?= dollars)/ → matches 100 in '100 dollars'

(?!abc)

Negative lookahead — asserts that what follows does NOT match abc.

/\d+(?! dollars)/ → matches digits not followed by ' dollars'; in '100 dollars' it still matches 10, because the digit run can backtrack

(?<=abc)

Positive lookbehind — asserts that what precedes matches abc.

/(?<=\$)\d+/ → matches digits after $

(?<!abc)

Negative lookbehind — asserts that what precedes does NOT match abc.

/(?<!\$)\d+/ → matches digits not directly preceded by $; in '$100' it still matches 00
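Lookarounds check context without consuming it, and the negative forms interact with backtracking in ways that are easy to miss. A sketch assuming a JavaScript engine with lookbehind support; the \b variant is one possible workaround, not the only one.

```javascript
// Lookahead: the digits match; " dollars" is checked but not consumed.
const amount = "100 dollars".match(/\d+(?= dollars)/)[0];    // "100"

// Negative lookahead can backtrack: "10" is followed by "0", not " dollars".
const partial = "100 dollars".match(/\d+(?! dollars)/)[0];   // "10"

// Adding \b keeps the digit run from being split.
const strict = "100 dollars".match(/\d+\b(?! dollars)/);     // null

// Lookbehind: "$" must precede the digits but is not part of the match.
const price = "cost: $25".match(/(?<=\$)\d+/)[0];            // "25"

// Negative lookbehind has the same pitfall from the other side.
const tail = "$100".match(/(?<!\$)\d+/)[0];                  // "00"
```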

Flags

g

Global — find all matches, not just the first.

/abc/g → returns all matches

i

Case-insensitive — makes the pattern case-insensitive.

/hello/i → matches Hello, HELLO, hello

m

Multiline — ^ and $ match start/end of each line, not just the string.

/^\w+/gm → matches the first word of every line (g is needed to return more than one match)

s

DotAll — makes . match newline characters too.

/foo.bar/s → matches across newlines

u

Unicode — enables full Unicode matching and disallows some ambiguous patterns.

/\u{1F600}/u → correctly matches emoji code points

y

Sticky — anchors the match to lastIndex; only matches at that exact position.

/\d+/y → matches only at current position
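Each flag can be observed in isolation. The following is a sketch assuming JavaScript; the test strings are arbitrary and the variable names are illustrative.

```javascript
const lines = "first line\nsecond line";

// g: match() returns every match instead of only the first.
const allAbc = "abc abc".match(/abc/g).length;          // 2

// i: case is ignored.
const anyCase = /hello/i.test("HeLLo");                 // true

// m: ^ also matches right after each line break.
const firstWords = lines.match(/^\w+/gm);               // ["first", "second"]

// s: . also matches "\n".
const withoutS = /foo.bar/.test("foo\nbar");            // false
const withS = /foo.bar/s.test("foo\nbar");              // true

// u: . consumes a whole code point instead of one UTF-16 code unit.
const unitLength = "\u{1F600}".match(/./)[0].length;    // 1
const pointLength = "\u{1F600}".match(/./u)[0].length;  // 2

// y: the match must start exactly at lastIndex.
const sticky = /\d+/y;
sticky.lastIndex = 3;
const atThree = sticky.exec("abc123");                  // ["123"]
sticky.lastIndex = 0;
const atZero = sticky.exec("abc123");                   // null
```

Flags can be combined, as in /^\w+/gm above.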

Escape Sequences

\n

Newline (line feed, LF).

/line1\nline2/ → matches across a newline

\r

Carriage return (CR). Windows line endings are \r\n.

/\r\n/ → matches Windows line ending

\t

Horizontal tab.

/\t/ → matches tab character

\0

Null character (NUL, U+0000).

/\0/ → matches null character

\xhh

Hexadecimal escape — matches the character with hex code hh.

/\x41/ → matches A (0x41)

\uhhhh

Unicode escape — matches the UTF-16 code unit given by exactly four hex digits (U+0000 to U+FFFF).

/\u00E9/ → matches é

\u{hhhh}

Extended Unicode escape (requires u flag) — supports code points above U+FFFF.

/\u{1F600}/u → matches 😀
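A brief sketch of the escapes in use, assuming JavaScript. The optional \r in the first pattern is an illustrative way to accept both line-ending styles and is not taken from the table above.

```javascript
// \r?\n accepts both Unix (LF) and Windows (CRLF) line endings.
const rows = "a\r\nb\nc".split(/\r?\n/);        // ["a", "b", "c"]

// \x41 and \u0041 both denote "A".
const hexA = /\x41/.test("A");                  // true
const unicodeA = /\u0041/.test("A");            // true

// \u00E9 matches the precomposed character é.
const eAcute = /\u00E9/.test("caf\u00E9");      // true

// \u{...} with the u flag reaches code points above U+FFFF.
const emoji = /\u{1F600}/u.test("\u{1F600}");   // true

// \t and \0 match the tab and NUL characters.
const hasTab = /\t/.test("a\tb");               // true
const hasNul = /\0/.test("a\0b");               // true
```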