Lookahead vs lookbehind, explained

Published 2026-05-15 · 9-min read

Lookarounds are the regex feature that confuses people for the longest. They’re also the feature that, once it clicks, makes a lot of previously-painful patterns trivial. This guide shows you what each variety does, when to reach for them, and which engines you can’t rely on them in.

The four kinds, in one table

Syntax   | Name                | Asserts                                        | Consumes characters?
(?=...)  | Positive lookahead  | What follows matches the inner pattern         | No
(?!...)  | Negative lookahead  | What follows does NOT match the inner pattern  | No
(?<=...) | Positive lookbehind | What precedes matches the inner pattern        | No
(?<!...) | Negative lookbehind | What precedes does NOT match the inner pattern | No

Every lookaround is a zero-width assertion: it checks a condition about what comes next or before, but doesn’t advance the cursor. The match doesn’t include any of the “looked-at” characters.
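To make “zero-width” concrete, here is a small sketch. The input string and variable names are made up for illustration. It shows that the looked-at text is checked but left out of the match and still unconsumed afterwards.

```javascript
// Zero-width check: the lookahead inspects "px" but the match stops before it.
const input = "width: 120px";
const m = /\d+(?=px)/.exec(input);

const matchedText = m[0];                     // only the digits
const endOfMatch = m.index + m[0].length;     // where the cursor ends up
const restOfInput = input.slice(endOfMatch);  // "px" has not been consumed

console.log(matchedText, restOfInput);
```

If the pattern were /\d+px/ instead, the match itself would include the unit and nothing would be left after it.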

When you actually need lookaround

Most of the time you don’t. If you can express your pattern with normal capture groups, do it — it’s more portable. Lookarounds shine in three situations:

1. Multi-condition validation

A password must contain a lowercase letter AND an uppercase letter AND a digit AND a symbol. Without lookahead, you’d need to express every ordering. With lookahead, each condition is one assertion:

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[\W_])[A-Za-z\d\W_]{12,}$

Each (?=.*X) asserts “somewhere ahead in the string, X exists.” Because lookaheads are zero-width, all four checks anchor at the same position (the start). The final character class then does the actual matching and, through {12,}, enforces a minimum length of 12. This is a common way to implement the “complex password” rule.
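One way to use this in practice is sketched below. The helper name missingRules and the rule labels are hypothetical, not part of any library. It runs the combined pattern for a yes/no answer and runs each assertion separately to report which rule failed.

```javascript
// The combined pattern from above: four lookaheads plus a length check.
const STRONG = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[\W_])[A-Za-z\d\W_]{12,}$/;

// Hypothetical helper: returns the labels of the rules a candidate fails.
function missingRules(candidate) {
  const checks = {
    lowercase: /^(?=.*[a-z])/,
    uppercase: /^(?=.*[A-Z])/,
    digit: /^(?=.*\d)/,
    symbol: /^(?=.*[\W_])/,
    length: /^[A-Za-z\d\W_]{12,}$/,
  };
  return Object.keys(checks).filter((name) => !checks[name].test(candidate));
}

console.log(STRONG.test("Tr0ub4dor&3xyz"));   // all four classes, 14 characters
console.log(missingRules("alllowercase1!"));  // reports the missing class
```

Splitting the checks costs a few extra regex runs, but it lets a form say which requirement is unmet instead of returning a bare “invalid.”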

2. Splitting on a delimiter without consuming it

Say you want to split a string at every ; that’s followed by a digit:

"a;1b;c;2d".split(/;(?=\d)/)  // → ["a", "1b;c", "2d"]

The lookahead (?=\d) requires a digit after the ; but doesn’t eat it, so the 1 and 2 stay in the resulting substrings. Without the lookahead you’d have to capture the digit and rejoin it, which is more code.
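The difference is easier to see side by side. This sketch uses illustrative variable names and compares a delimiter that consumes the digit, the lookahead version, and a lookbehind variant that splits after each ; so the delimiter stays attached to the left-hand piece.

```javascript
const input = "a;1b;c;2d";

// The delimiter pattern consumes the digit, so it vanishes from the output.
const consuming = input.split(/;\d/);

// The lookahead only checks for the digit; the split removes just the ";".
const lookahead = input.split(/;(?=\d)/);

// Lookbehind variant: split at the empty position right after every ";".
const keepDelimiter = input.split(/(?<=;)/);

console.log(consuming, lookahead, keepDelimiter);
```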

3. Inserting text at a position

Insert a comma every three digits, right-to-left, in a number:

"1234567".replace(/\B(?=(\d{3})+(?!\d))/g, ",") === "1,234,567"

Read it: at every position that’s not a word boundary, look ahead for one or more groups of three digits that are not followed by another digit. The \B plus lookahead identifies a position; the replacement inserts the comma there without consuming surrounding digits.
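One caveat is that the bare pattern also groups digits after a decimal point. A minimal sketch of a wrapper follows; groupThousands is a hypothetical name, and it assumes the input is a plain decimal string with at most one “.”. It applies the pattern to the integer part only.

```javascript
const GROUPING = /\B(?=(\d{3})+(?!\d))/g;

// Hypothetical helper: group the integer part, leave any fraction untouched.
function groupThousands(numberText) {
  const [integerPart, fraction] = numberText.split(".");
  const grouped = integerPart.replace(GROUPING, ",");
  return fraction === undefined ? grouped : `${grouped}.${fraction}`;
}

console.log("1234.5678".replace(GROUPING, ","));  // the fraction gets a comma too
console.log(groupThousands("1234.5678"));         // only the integer part is grouped
```

For user-facing output, a locale-aware formatter such as Intl.NumberFormat is usually a better fit; the regex is shown here because it illustrates position matching.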

Negative lookarounds are how you say “except”

Negative variants are most useful for “match X, but not when followed/preceded by Y.”

Find “cat” at the start of a word, but not when it’s the start of “catalog”:

/\bcat(?!alog)/

Find function declarations, but skip the word function when a quote character or a backslash comes immediately before it (rough heuristic; for real code use a parser):

/(?<!["'\\`])function\s+\w+/g

Both are read aloud as “match the main thing, AS LONG AS the lookaround condition is satisfied.”
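A short sketch of both directions, on made-up sample text. Note that the first pattern has no \b after cat: a trailing word boundary would already exclude “catalog” by itself and leave the lookahead nothing to do.

```javascript
const text = "The cat sat on the catalog next to a wildcat.";

// Negative lookahead: "cat" at a word start, unless "alog" follows.
const notCatalog = text.match(/\bcat(?!alog)/g);

// Negative lookbehind: numbers not preceded by "$" or by another digit.
// The \d in the class stops the engine from matching the tail of "$40".
const prices = "$40 for 3 items, $5 each";
const plainCounts = prices.match(/(?<![$\d])\d+/g);

console.log(notCatalog, plainCounts);
```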

What about engine support?

This is the part that ruins patterns at deployment time.

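  • JavaScript: all four kinds are available since ES2018, which added lookbehind (lookahead is older). Variable-length lookbehind works.
  • PCRE2 / PHP / nginx: all four kinds. Lookbehind has traditionally had to be fixed-length, although each alternative inside it may have its own fixed length; recent PCRE2 releases (10.43 and later) also accept variable-length lookbehind with a bounded maximum.
  • Python re: fixed-length lookbehind only; for variable length, install the third-party regex module.
  • Java: lookbehind needs a bounded maximum length in older versions, so (?<=\d{1,3}) compiles where (?<=\d+) is rejected; fully variable lookbehind is accepted in Java 13+.
  • .NET: full support, including variable-length lookbehind.
  • Go regexp (RE2): no lookaround at all. Engines that guarantee linear-time matching can’t offer it without giving up that guarantee.
  • Rust regex crate: same as Go, no lookaround. Add the fancy-regex crate if you need them.
  • POSIX ERE (grep -E, awk): no lookaround.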

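If you’re writing patterns that need to run in Go or Rust services, you can’t use lookaround, and atomic groups and possessive quantifiers aren’t available there either. Often you can rewrite with a capture group (match the surrounding context, then read only the group) or with a two-pass approach (match broadly, then filter in code). Sometimes you can’t and have to switch engines.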
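As a rough illustration of lookaround-free rewrites, the sketch below is written in JavaScript for consistency with the other examples; the same pattern shapes use only syntax that RE2-style engines accept. The sample strings are made up. The first rewrite replaces a positive lookahead with a capture group, and the second replaces a negative lookahead with a filter in code.

```javascript
const log = "id=42px id=7em id=13px";

// With lookahead: digits followed by "px".
const withLookahead = [...log.matchAll(/\d+(?=px)/g)].map((m) => m[0]);

// Capture-group rewrite: consume "px" as well, but read only group 1.
const withCapture = [...log.matchAll(/(\d+)px/g)].map((m) => m[1]);

// Two-pass rewrite of /\bcat(?!alog)/: match every word starting with
// "cat", then drop the unwanted ones in code.
const words = "cat catalog cats".match(/\bcat\w*/g);
const kept = words.filter((w) => !w.startsWith("catalog"));

console.log(withLookahead, withCapture, kept);
```

The rewrites are not perfectly equivalent: the capture version consumes the context, so matches that would overlap it are lost, and the two-pass version returns whole words rather than just “cat.”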

The performance gotcha

Lookarounds aren’t free on NFA engines. A poorly-placed lookahead can cause catastrophic backtracking — the engine retries the inner assertion at every position.

Anti-pattern:

/^(.+)+(?=\d)/

The nested (.+)+ alone is catastrophic; the lookahead adds another retry per failed split. On a long input without a trailing digit, this runs effectively forever. The fix is to anchor properly and avoid the nested quantifier — see ReDoS prevention.
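One possible repair, assuming only the overall match matters and capture group 1 is not needed, is to drop the outer quantifier. A single greedy .+ backs off one character at a time, so a failing input costs roughly linear work rather than exponential.

```javascript
// Same overall match as /^(.+)+(?=\d)/, without the nested quantifier.
const safe = /^.+(?=\d)/;

// Greedy .+ takes the whole line, then gives back one character so the
// lookahead can see the final digit.
const withDigit = safe.exec("order 66")[0];

// 50,000 characters and no digit: the match fails, but quickly.
const noDigit = safe.test("x".repeat(50000));

console.log(withDigit, noDigit);
```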

Common idioms worth memorizing

These come up enough that it’s worth knowing the shape:

  • “Must contain X”: (?=.*X) at the start of the pattern.
  • “Word but not in this context”: \bword\b(?!Y) or (?<!Y)\bword\b.
  • “Split without consuming”: split(/X(?=Y)/) or split(/(?<=X)Y/).
  • “Match X up to but not including Y”: X.*?(?=Y) with a lazy quantifier.
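A few of these shapes, filled in with made-up inputs as a quick sketch:

```javascript
// "Match X up to but not including Y": lazy quantifier plus lookahead.
const line = "key=value; other=thing";
const upToSemicolon = line.match(/key=.*?(?=;)/)[0];

// "Split without consuming", lookbehind form: split on the space that
// follows a comma, keeping the comma on the left-hand piece.
const parts = "a, b, c".split(/(?<=,) /);

// "Word but not in this context": "happy" unless "not " comes right before.
const happy = "happy and not happy".match(/(?<!not )\bhappy\b/g);

console.log(upToSemicolon, parts, happy);
```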

FAQ

Why are lookahead and lookbehind called 'zero-width'?

Because they assert a condition about what comes next or behind, but they don't consume any characters — the regex engine's cursor doesn't move past them. The match doesn't include the looked-at characters.

Which engines support lookbehind?

Most modern NFA engines: JavaScript (since ES2018), PCRE, Python re, Java, .NET, Ruby Onigmo. Go's RE2 and Rust's regex crate do NOT — they're DFA-based and lookbehind would break their linear-time guarantee.

What's the difference between variable-length and fixed-length lookbehind?

Some engines historically required fixed-length lookbehind because the engine had to scan backward by a known offset. JavaScript and .NET support fully variable-length lookbehind. Python's re requires fixed length, PCRE2 has traditionally required it too, and Java needs at least a bounded maximum length in older versions. Always check your target engine before relying on (?<=\d+).
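For a feel of the difference, here is a sketch on made-up input. The first pattern uses a lookbehind whose length varies (an optional space after the $), which JavaScript accepts. The second expresses the same condition as two separate fixed-length lookbehinds, a shape that fixed-length-only engines can generally accept.

```javascript
const text = "$5 and $ 12 and 7";

// Variable-length lookbehind: "$" optionally followed by one space.
const variable = text.match(/(?<=\$ ?)\d+/g);

// Fixed-length rewrite: one lookbehind per possible length, joined with |.
const fixed = text.match(/(?:(?<=\$)|(?<=\$ ))\d+/g);

console.log(variable, fixed);
```

This only works when the set of possible lengths is small; an unbounded prefix such as \w+ has no fixed-length equivalent and calls for a capture group instead.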

Are lookahead and lookbehind slow?

On NFA engines, a well-anchored lookahead is fast and avoids creating a capture group. Unanchored lookarounds inside greedy contexts can cause backtracking explosions — see our ReDoS guide. On RE2 / Rust, they're unsupported so the question doesn't apply.

Try them

Lookarounds are easiest to internalize by playing with them. The AI Regex Toolkit shows live matches, lets you toggle flags, and gives you a token-by-token explanation when you paste a pattern. The strong-password entry in the library shows the four-lookahead pattern in context.

Up next: ReDoS prevention — what happens when a lookaround (or any quantifier) is positioned wrong.