Lookahead vs lookbehind, explained
Published 2026-05-15 · 9-min read
Lookarounds are the regex feature that confuses people for the longest. They’re also the feature that, once it clicks, makes a lot of previously-painful patterns trivial. This guide shows you what each variety does, when to reach for them, and which engines you can’t rely on them in.
The four kinds, in one table
| Syntax | Name | Asserts | Consumes characters? |
|---|---|---|---|
| (?=...) | Positive lookahead | What follows matches the inner pattern | No |
| (?!...) | Negative lookahead | What follows does NOT match the inner pattern | No |
| (?<=...) | Positive lookbehind | What precedes matches the inner pattern | No |
| (?<!...) | Negative lookbehind | What precedes does NOT match the inner pattern | No |
Every lookaround is a zero-width assertion: it checks a condition about what comes next or what came before, but doesn’t advance the cursor. The match doesn’t include any of the “looked-at” characters.
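To make "zero-width" concrete, here is a minimal JavaScript sketch. The sample string and variable names are invented for illustration. The lookarounds decide whether a match is allowed, but the matched text holds only the digits.

```javascript
// Sample input is invented for illustration.
const text = "42USD and 17EUR";

// Lookahead: digits that are followed by "USD". "USD" is checked, not consumed.
const ahead = text.match(/\d+(?=USD)/);

// Lookbehind: digits that are preceded by "and ". Needs an ES2018+ engine.
const behind = text.match(/(?<=and )\d+/);

// For contrast: without the lookahead, "USD" becomes part of the match.
const consumed = text.match(/\d+USD/);

console.log(ahead[0], ahead.index);   // 42 0
console.log(behind[0], behind.index); // 17 10
console.log(consumed[0]);             // 42USD
```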
When you actually need lookaround
Most of the time you don’t. If you can express your pattern with normal capture groups, do it — it’s more portable. Lookarounds shine in three situations:
1. Multi-condition validation
A password must contain a lowercase letter AND an uppercase letter AND a digit AND a symbol, and be at least 12 characters long. Without lookahead, you’d need to express every ordering. With lookahead, each condition is one assertion:
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[\W_])[A-Za-z\d\W_]{12,}$
Each (?=.*X) asserts “somewhere ahead in the string, X exists.” Because lookaheads are zero-width, all four checks anchor at the same position (the start). The final character class then does the actual matching and enforces the minimum length. This pattern is a common way to implement the “complex password” rule.
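A quick way to convince yourself that each lookahead is an independent check is to feed the pattern inputs that break one rule at a time. The sketch below does that in JavaScript; the sample passwords are hypothetical and chosen only to trip a single condition each.

```javascript
// The pattern from the text: four lookaheads at the start, then the length check.
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[\W_])[A-Za-z\d\W_]{12,}$/;

// Hypothetical samples; each failing one breaks exactly one rule.
const samples = [
  "Correct-Horse-7battery", // satisfies every rule
  "correct-horse-7battery", // no uppercase letter
  "Correct-Horse-battery",  // no digit
  "CorrectHorse7battery",   // no symbol
  "Short-1a",               // fewer than 12 characters
];

const results = samples.map((s) => strongPassword.test(s));
samples.forEach((s, i) => console.log(s, results[i]));
// Only the first sample prints true.
```

If you need to tell the user which rule failed, one option is to test each condition with its own small regex instead of a single combined pattern.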
2. Splitting on a delimiter without consuming it
Say you want to split a string at every ; that’s followed by a digit:
"a;1b;c;2d".split(/;(?=\d)/) // → ["a", "1b;c", "2d"]
The lookahead (?=\d) requires a digit after the ; but doesn’t eat it, so the 1 and 2 stay in the resulting substrings. Without the lookahead you’d capture-and-rejoin, which is more code.
3. Inserting text at a position
Insert a comma every three digits, right-to-left, in a number:
"1234567".replace(/\B(?=(\d{3})+(?!\d))/g, ",") === "1,234,567"
Read it: at every position that’s not a word boundary, look ahead for one or more groups of three digits that are not followed by another digit. The \B + lookahead identifies a position; the replacement inserts the comma there without consuming surrounding digits.
Negative lookarounds are how you say “except”
Negative variants are most useful for “match X, but not when followed/preceded by Y.”
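Find “cat” at the start of a word, but not when the word continues as “catalog”:
/\bcat(?!alog)/
There is no \b after cat here: a trailing word boundary would already exclude “catalog” and make the lookahead redundant.
Find function declarations but not function inside strings (rough heuristic; for real code use a parser):
/(?<!["'\\`])function\s+\w+/g
Both are read aloud as “match the main thing, AS LONG AS the lookaround condition is satisfied.”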
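A short sketch of both "except" patterns in action. The sample strings are invented, and the cat pattern is written with the lookahead directly after cat, so the lookahead is what rejects "catalog".

```javascript
// "cat" at the start of a word, unless the word continues with "alog".
const catNotCatalog = /\bcat(?!alog)/g;
const words = "cat catalog category concat";
const hits = [...words.matchAll(catNotCatalog)].map((m) => m.index);
console.log(hits); // [0, 12] -> "cat" and the start of "category"

// "function <name>" unless a quote, backslash, or backtick sits right before it.
const declaration = /(?<!["'\\`])function\s+\w+/g;
const source = 'function real() {} const s = "function fake() {}";';
const declarations = source.match(declaration);
console.log(declarations); // ["function real"]
```

The lookbehind only inspects the single character before function, which is why the text calls it a rough heuristic: a string such as " function fake" (with a space after the quote) would still match.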
What about engine support?
This is the part that ruins patterns at deployment time.
- JavaScript: supports all four since ES2018. Variable-length lookbehind works.
- PCRE2 / PHP / nginx: all four are supported. Lookbehind has traditionally required each alternative to be fixed-length; recent PCRE2 releases (10.43 onward) also accept variable-length lookbehind up to a bounded maximum length.
- Python re: fixed-length lookbehind only; for variable length, install the third-party regex module.
- Java: historically fixed-length lookbehind; bounded variable-length added in Java 9+, fully variable in Java 13+.
- .NET: full support.
- Go regexp (RE2): no lookaround at all. DFA engines can’t do them without breaking linear-time.
- Rust regex crate: same as Go, no lookaround. Add the fancy-regex crate if you need them.
- POSIX ERE (grep -E, awk): no lookaround.
If you’re writing patterns that need to run in Go or Rust services, you can’t use lookaround. Often you can rewrite with a plain capture group that extracts just the part you want, or use a two-pass approach: match broadly, then filter in code. Sometimes you can’t and have to switch engines.
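The two-pass approach can be sketched as follows. The example stays in JavaScript for consistency with the rest of this guide, but the regex uses only a word boundary and a capture group, which RE2-style engines also accept. The function name is hypothetical.

```javascript
// Goal: find "cat" at the start of a word unless the word continues with "alog",
// without any lookaround in the pattern.
function findCatWithoutLookaround(text) {
  const positions = [];
  // Pass 1: match broadly and capture whatever follows "cat" in the same word.
  for (const m of text.matchAll(/\bcat(\w*)/g)) {
    // Pass 2: apply the "except" condition in ordinary code.
    if (!m[1].startsWith("alog")) {
      positions.push(m.index);
    }
  }
  return positions;
}

console.log(findCatWithoutLookaround("cat catalog category concat")); // [0, 12]
```

The trade-off is that the condition now lives outside the pattern, so this works when you control the calling code but not when the regex has to be a single string in a config file.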
The performance gotcha
Lookarounds aren’t free on NFA engines. A poorly-placed lookahead can cause catastrophic backtracking — the engine retries the inner assertion at every position.
Anti-pattern:
/^(.+)+(?=\d)/
The nested (.+)+ alone is catastrophic; the lookahead adds another retry per failed split. On an input that ends in a long run of non-digits, this runs effectively forever. The fix is to anchor properly and avoid the nested quantifier; see ReDoS prevention.
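As a rough illustration of the fix, the sketch below compares the anti-pattern with one possible rewrite that drops the nested quantifier. The inputs are deliberately short and invented; the rewrite is an assumption about what "avoid the nested quantifier" could look like here, not the only option.

```javascript
// Anti-pattern from the text: nested quantifier in front of the lookahead.
const risky = /^(.+)+(?=\d)/;
// One possible rewrite: a single quantifier, so the prefix can only be split one way.
const safer = /^.+(?=\d)/;

// Short inputs only. Do not run `risky` on long strings that end in non-digits.
const inputs = ["abc1", "x1y22z", "abc"];
const comparison = inputs.map((s) => {
  const a = s.match(risky);
  const b = s.match(safer);
  return [a && a[0], b && b[0]];
});

console.log(comparison);
// [["abc", "abc"], ["x1y2", "x1y2"], [null, null]]
```

On these inputs both patterns return the same text, but the rewrite backtracks one character at a time instead of retrying every way of splitting the prefix.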
Common idioms worth memorizing
These come up enough that it’s worth knowing the shape:
- “Must contain X”: (?=.*X) as the first assertion, right after ^.
- “Word but not in this context”: \bword\b(?!Y) or (?<!Y)\bword\b.
- “Split without consuming”: split(/X(?=Y)/) or split(/(?<=X)Y/).
- “Match X up to but not including Y”: X.*?(?=Y) with a lazy quantifier.
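Each of these shapes fits in a line or two of JavaScript. The sketch below fills in X and Y with made-up values so you can see what each idiom keeps and what it leaves behind.

```javascript
// "Must contain X": the lookahead sits right after ^ and scans ahead for a digit.
const hasDigit = /^(?=.*\d)\w+$/;
console.log(hasDigit.test("abc1"), hasDigit.test("abcd")); // true false

// "Word but not in this context": "port" unless a hyphen comes right before it.
const barePort = /(?<!-)\bport\b/g;
const portHits = [..."use port or --port".matchAll(barePort)].map((m) => m.index);
console.log(portHits); // [4]

// "Split without consuming": the space is removed, the comma before it is kept.
const pieces = "a, b, c".split(/(?<=,) /);
console.log(pieces); // ["a,", "b,", "c"]

// "Match X up to but not including Y": lazy .*? stops at the first semicolon.
const upTo = "key=value;a;b".match(/key=.*?(?=;)/)[0];
console.log(upTo); // key=value
```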
FAQ
Why are lookahead and lookbehind called 'zero-width'?
Because they assert a condition about what comes next or behind, but they don't consume any characters — the regex engine's cursor doesn't move past them. The match doesn't include the looked-at characters.
Which engines support lookbehind?
Most modern NFA engines: JavaScript (since ES2018), PCRE, Python re, Java, .NET, Ruby Onigmo. Go's RE2 and Rust's regex crate do NOT — they're DFA-based and lookbehind would break their linear-time guarantee.
What's the difference between variable-length and fixed-length lookbehind?
Some engines historically required fixed-length lookbehind because the engine had to scan backward by a known offset. JavaScript and .NET support variable-length lookbehind; Python's re still requires fixed-length, and Java and older PCRE releases historically did too. Always check your target engine before relying on (?<=\d+).
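A small sketch of the difference, run in JavaScript where both forms are accepted. The sample strings are invented. The first pattern has a quantifier inside the lookbehind, which is the part a fixed-length engine would reject.

```javascript
// Variable-length lookbehind: any number of spaces may follow "USD".
const variableLength = /(?<=USD\s+)\d+/;
const wide = "USD   42".match(variableLength)[0];
const narrow = "USD 7".match(variableLength)[0];
console.log(wide, narrow); // 42 7

// Fixed-length form: exactly "USD" plus one space, always four characters.
const fixedLength = /(?<=USD )\d+/;
const fixedNarrow = fixedLength.test("USD 7");
const fixedWide = fixedLength.test("USD   42");
console.log(fixedNarrow, fixedWide); // true false
```

When you have to target a fixed-length engine, normalizing the input first (for example, collapsing runs of whitespace) often lets the fixed form do the job.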
Are lookahead and lookbehind slow?
On NFA engines, a well-anchored lookahead is fast and avoids creating a capture group. Unanchored lookarounds inside greedy contexts can cause backtracking explosions — see our ReDoS guide. On RE2 / Rust, they're unsupported so the question doesn't apply.
Try them
Lookarounds are easiest to internalize by playing with them. The AI Regex Toolkit shows live matches, lets you toggle flags, and gives you a token-by-token explanation when you paste a pattern. The strong-password entry in the library shows the four-lookahead pattern in context.
Up next: ReDoS prevention — what happens when a lookaround (or any quantifier) is positioned wrong.