Lookahead and Lookbehind: Matching by Context
Sometimes you want to match something only because of what surrounds it, without making that context part of the match. A price, but only the digits after the currency sign. A word, but only if a certain word does not follow. Lookarounds do exactly this: they test the text next to the current position and then step back, matching nothing themselves. Try each example in the Regex Tester to see what is and is not included in the match.
Try the Regex Tester toolTest a regular expression against your text in real time. See every match highlighted, with capture groups and flags. Runs entirely in your browser.Zero-width: assertions that match nothing
A lookaround is a zero-width assertion. It looks at the characters ahead of or behind the current position, succeeds or fails based on what it sees, and then leaves the position unchanged. Because it consumes no characters, the text it inspected does not become part of the result. That is the whole trick, and it is what makes lookarounds so useful for matching by context.
The four lookarounds
| Form | Name | Succeeds when |
|---|---|---|
| (?=...) | Positive lookahead | the text ahead matches |
| (?!...) | Negative lookahead | the text ahead does not match |
| (?<=...) | Positive lookbehind | the text behind matches |
| (?<!...) | Negative lookbehind | the text behind does not match |
Lookahead: assert what follows
Positive lookahead shines for checking several independent conditions at one position, which is how password rules are usually expressed. Each lookahead scans ahead for one requirement, then returns to the start so the next can do the same.
^(?=.*[A-Z])(?=.*\d).{8,}$
"asserts: somewhere ahead there is an uppercase letter,
somewhere ahead there is a digit, and the whole thing
is at least 8 characters"Negative lookahead is the opposite: match here only if something does not come next. The pattern \bcat\b(?! food) matches the word cat unless it is followed by the word food, so cat matches but cat food does not.
Lookbehind: assert what precedes
Lookbehind checks the text just before the current position. It is the clean way to grab a value that follows a marker without capturing the marker itself.
(?<=\$)\d+(?:\.\d{2})? matches 19.99 in "$19.99", but not the $
(?<!\$)\b\d+ matches a number that is NOT preceded by $Compare the first with a capturing group. You could write \$(\d+) and read group 1, but then the dollar sign is part of the overall match. With lookbehind, the match is just the number, which is often exactly what a find-and-replace or an extraction needs.
A classic: grouping digits with commas
One well-known recipe uses lookahead to insert thousands separators. The pattern finds every position that has a multiple of three digits remaining to its right, and a replacement inserts a comma there.
pattern: \B(?=(\d{3})+(?!\d))
replace: ,
1234567 -> 1,234,567See the match boundaries
Lookarounds are far easier to understand when you can see precisely which characters end up highlighted. The Regex Tester shows the exact match in your browser as you type, so the zero-width nature of an assertion becomes obvious. MDN documents the four forms in its reference on assertions.
Test a lookaround nowTest a regular expression against your text in real time. See every match highlighted, with capture groups and flags. Runs entirely in your browser.Related articles
Capturing Groups, Named Groups, and Backreferences
How parentheses capture parts of a match, how to name them, how backreferences match repeated text, and how to use all three when replacing.
Regex Basics: Characters, Classes, Anchors, and Quantifiers
The core building blocks of regular expressions: literal characters, character classes, anchors, quantifiers, and the flags that change everything.
Greedy, Lazy, and Catastrophic Backtracking
Why quantifiers grab as much as they can, how lazy matching reins them in, and how nested quantifiers cause the ReDoS slowdown to avoid.