ReDoS and catastrophic backtracking — how to spot and stop it

Published 2026-05-15 · 11-min read

On July 2, 2019, Cloudflare went down for 27 minutes worldwide. The root cause was a single regex in a Web Application Firewall rule. Against attacker-shaped input, the regex consumed enough CPU on every server in the global fleet to wedge HTTP processing. That class of bug — Regular Expression Denial of Service, or ReDoS — is the topic of this guide.

How regex engines actually run

Most regex engines (JavaScript, PCRE, Python re, Java, .NET, Ruby) are backtracking NFA implementations. The engine tries to match the pattern by walking through the input and the pattern simultaneously, and when a match attempt fails it backtracks — rewinds the cursor and tries a different way to match.

Most of the time, backtracking is invisible and fast. But for certain pattern shapes, the number of ways to backtrack grows exponentially or polynomially with the length of the input. The engine has to enumerate them all before concluding the match has failed. That’s catastrophic backtracking.

The classic shape: nested quantifiers

The textbook example is ^(a+)+$ against the input aaaaaaaaaaaaaaaa!:

/^(a+)+$/.test("aaaaaaaaaaaaaaaa!")

The trailing ! can never satisfy the pattern, so the match must fail. (The anchors matter: without them, /(a+)+/ would simply match the leading as and return immediately.) But before it fails, the engine tries every possible way to split the 16 as across the inner and outer quantifier. With 16 characters, that’s 2¹⁶ = 65,536 ways. With 30 characters, it’s a billion. With 40, a trillion. Your CPU is now the attacker’s.
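You can watch the blowup directly. A small timing harness (using Node's process.hrtime; the variable names are mine) against the anchored pattern:

```javascript
// Time the textbook pattern against inputs of increasing length. The anchors
// force a failed match, and each extra 'a' roughly doubles the work.
const vulnerable = /^(a+)+$/;

for (const n of [16, 18, 20, 22]) {
  const attack = "a".repeat(n) + "!";
  const t0 = process.hrtime.bigint();
  vulnerable.test(attack); // false every time; the question is how long it takes
  const ms = Number(process.hrtime.bigint() - t0) / 1e6;
  console.log(`n=${n}: ${ms.toFixed(2)}ms`);
}
```

On a typical machine the per-iteration time climbs visibly as n grows, which is exactly the exponential curve described above.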

The shapes to recognize

Three patterns reliably produce catastrophic backtracking:

1. Nested quantifiers:

(a+)+        (a*)*        (a+)*        ([a-z]+)+

2. Alternation with overlap, under a quantifier:

(a|a)+       (a|ab)+      (\w|\d)+    (.|\S)*

When the two alternatives can match the same character, the engine has multiple ways to consume the same input — and re-checks each one on failure.

3. Adjacent greedy quantifiers:

.*.*         a*a*         \w+\w+

When two greedy quantifiers can both consume the same characters, the engine tries every possible split between them.
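The overlap problem is easy to demonstrate with the second shape. In this sketch (names mine), every digit in the payload can be attributed to either branch, and the engine revisits each attribution when the trailing ! forces a failure:

```javascript
// Every digit matches both \w and \d, so each character roughly doubles the
// number of branch assignments the engine must unwind on failure.
const overlapping = /^(\w|\d)+$/; // ambiguous: two branches match the same chars
const singleClass = /^\w+$/;      // one class, one way to match: no ambiguity

const digits = "1".repeat(20) + "!";
console.log(overlapping.test(digits)); // false, after roughly 2^20 backtracks
console.log(singleClass.test(digits)); // false, in a single linear pass
```

Both patterns accept the same strings; only one of them fails cheaply.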

Polynomial vs exponential — both bad

Not all backtracking is exponential. Polynomial backtracking grows as O(n²) or O(n³) — slower than exponential but still catastrophic in production. The difference matters when you’re estimating impact:

  • O(n²) — a 10KB malicious payload takes ~100ms; a 100KB payload takes ~10 seconds. Bad, but might survive request timeouts.
  • O(n³) — at the same per-operation rate, a 10KB payload takes ~16 minutes. Will definitely take down the worker.
  • O(2ⁿ) exponential — 30 characters takes a billion operations; 40 takes a trillion.

For context: the Cloudflare 2019 incident was a polynomial pattern hit with an unusually long input; you don’t need exponential blowup to cause an outage.
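The quadratic case is easy to reproduce with a trailing-whitespace pattern, the same shape that caused Stack Overflow's 2016 outage. In this sketch (names mine), the engine retries from every space, and each attempt scans to the final non-space before failing:

```javascript
// O(n^2) in action: a trailing-whitespace match against a long run of spaces
// that ends in a non-space character.
const trailingWs = /\s+$/;

for (const n of [1000, 2000, 4000]) {
  const run = " ".repeat(n) + "x";
  const t0 = process.hrtime.bigint();
  trailingWs.test(run); // false: the string has no trailing whitespace
  const ms = Number(process.hrtime.bigint() - t0) / 1e6;
  console.log(`n=${n}: ${ms.toFixed(2)}ms`); // doubling n roughly quadruples time
}
```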

How to recognize vulnerable patterns at a glance

After a while you develop a kind of allergy. Three signals to look for:

  • Quantifier on a group whose body also has a quantifier. (a+)+ — the nested pattern is the textbook foot-gun.
  • Alternation under a quantifier where the alternatives overlap. (\w|\d)+ — every digit matches both alternatives.
  • Two greedy quantifiers separated by nothing or a thin literal. .*X.* in a context where X is optional.

Our Diagnose tab in the toolkit statically scans for all three.
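A toy version of the first check fits in a few lines. This is an illustrative sketch in the spirit of tools like safe-regex, not the actual Diagnose implementation, and like all star-height heuristics it over-flags some safe patterns:

```javascript
// Flag a quantified group whose body itself contains a quantifier (star
// height > 1). Heuristic only: it misses overlapping alternation and adjacent
// greedy quantifiers, and it over-flags unambiguous shapes like \w+(-\w+)*.
const nestedQuantifier = /\((?:[^()\\]|\\.)*[+*](?:[^()\\]|\\.)*\)[+*]/;

function looksDangerous(patternSource) {
  return nestedQuantifier.test(patternSource);
}

console.log(looksDangerous("(a+)+"));        // true: the textbook foot-gun
console.log(looksDangerous("^a+$"));         // false: no nesting
console.log(looksDangerous("\\w+(-\\w+)*")); // true: a false positive, as warned
```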

The four fixes

Most vulnerable patterns can be rewritten safely.

1. Switch to a linear-time engine

If your service runs in Go or Rust, you already get RE2-style matching or the regex crate — immune to ReDoS by design. The cost is no lookaround and no backreferences. For 90% of patterns that’s acceptable. If you’re in JavaScript, a library like safe-regex lets you reject risky patterns statically.

2. Anchor your patterns

Many catastrophic shapes only blow up when the engine retries from every position. Pinning the start with ^ often eliminates the retry loop:

(a+)+       <-- vulnerable
^(a+)+$     <-- still backtracks within the line
^a+$        <-- linear time, equivalent on simple patterns

Anchoring isn’t a complete fix — it can hide the problem in single-line input while leaving it exposed in multiline. But it’s often the cheapest improvement.
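The effect is easiest to see with an unanchored pattern that fails everywhere; the shape a+! here is a hypothetical example, and the names are mine:

```javascript
// Without '^', a failed search is retried from every 'a' in the input: O(n^2)
// total work. With '^', the engine makes exactly one attempt: O(n).
const unanchored = /a+!/;
const anchored = /^a+!/;

const runOfAs = "a".repeat(5000); // no '!', so both fail, at very different cost
console.log(unanchored.test(runOfAs)); // false, after retrying from every position
console.log(anchored.test(runOfAs));   // false, after a single linear attempt
```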

3. Atomic groups and possessive quantifiers

PCRE, Java, and .NET let you mark a section as “don’t backtrack into this” — atomic groups (?>...) and possessive quantifiers a++, a*+, a?+. Once the engine matches an atomic section, it commits — failed matches won’t cause it to re-split that section.

(a+)+        <-- catastrophic
(?>a+)+      <-- atomic — safe
(a++)+       <-- possessive — same idea, different syntax

JavaScript doesn’t have either. The closest substitute is rewriting to remove the nested quantifier.

4. Rewrite to remove the nested quantifier

Usually the cleanest fix. If you want “one or more groups of characters separated by hyphens,” don’t write (\w+-?)+; write \w+(-\w+)*. The result is the same shape but with no nesting.
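Side by side, anchored for illustration (the two accept almost the same strings; the flattened form also stops accepting a stray trailing hyphen the nested one allowed):

```javascript
const nested = /^(\w+-?)+$/;      // quantifier inside a quantified group
const flattened = /^\w+(-\w+)*$/; // same idea, nothing to re-split on failure

console.log(nested.test("one-two-three"));    // true
console.log(flattened.test("one-two-three")); // true
// Only the flattened form can safely face a pathological payload:
console.log(flattened.test("a-".repeat(20) + "!")); // false, immediately
```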

Real-world examples and rewrites

Email validator (insidiously vulnerable)

/^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$/

The (\.[\w-]+)* against an input like aaa@a.aaaaaaa.aaaaaaa.aaaaa with no final TLD takes seconds. The fix is either to anchor more tightly or to switch to a simpler, more permissive pattern with no nested quantifier on the domain part.
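One defensive shape, as an illustrative sketch (the character classes, the function name, and the length cap are my choices here, not a library's actual pattern): cap the input length first, then validate with classes that exclude the dot so every split is forced:

```javascript
// Plausibility check, not an RFC 5322 parser. The classes exclude '.', '@',
// and whitespace, so the engine never has two ways to carve up the same chars.
function isPlausibleEmail(s) {
  if (s.length > 254) return false; // RFC 5321 length limit doubles as a ReDoS cap
  return /^[^\s@]+@[^\s@.]+(\.[^\s@.]+)+$/.test(s);
}

console.log(isPlausibleEmail("user@example.com")); // true
console.log(isPlausibleEmail("aaa@a.aaaaaaa."));   // false, without the stall
```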

Cloudflare’s 2019 pattern

(?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|\-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*)))

Notice the .* followed by (?:.*=.*) at the end — adjacent greedy quantifiers, classic polynomial backtracking. On a long input without an = sign, every position is re-tried as a potential start of the equals section. The remediation went beyond the one pattern: the rule was pulled, and the post-mortem committed to moving the WAF toward a non-backtracking engine.
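The kernel of the problem can be isolated as a small reduction of the rule (my simplification, not the rule itself):

```javascript
// Two greedy dot-stars with a required '=' that never arrives. For every start
// position the engine tries every split between the two .* runs, so failing
// takes superlinear time in the input length.
const kernel = /.*.*=.*/;

for (const n of [100, 200, 400]) {
  const payload = "x".repeat(n); // no '=' anywhere
  const t0 = process.hrtime.bigint();
  kernel.test(payload); // false
  const ms = Number(process.hrtime.bigint() - t0) / 1e6;
  console.log(`n=${n}: ${ms.toFixed(2)}ms`);
}
```

Note how small the inputs are kept here: the growth rate is steep enough that a few hundred characters already shows the trend.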

Static analysis tools

If you ship regexes into production, run static analysis:

  • vuln-regex-detector — static analyzer that emits crafted inputs designed to trigger backtracking.
  • recheck — fuzzes patterns against the live engine, available as a GitHub Action.
  • RegexStaticAnalysis — academic tool that distinguishes exponential from polynomial backtracking, useful for understanding what kind of vulnerability you have.
  • tooled.dev Diagnose tab — fast heuristic check covering the most common shapes. Not a full proof of safety but catches the textbook cases.

Defense in depth — what to do besides fixing the regex

Even with good patterns, defense in depth matters:

  • Wall-clock timeout on every regex execution. In Node.js, run regex on user input inside a worker thread with a 100ms cap. Python re doesn’t support per-call timeout natively; use the regex module which does.
  • Input length cap. If your regex runs on input that should never be more than 1KB, reject inputs > 1KB before they hit the regex. Cheap and effective.
  • Move the regex out of the hot path. For non-real-time tasks (log processing, scheduled scans), run the regex in a background worker with a watchdog that kills it if it exceeds budget.
  • Switch engines for high-risk patterns. If a particular regex runs on attacker-controlled input and you can’t prove it’s safe, compile it with a linear-time engine (RE2 via Node FFI, or move that endpoint to a Go/Rust service).

FAQ

What is ReDoS?

Regular Expression Denial of Service — a class of attack where a maliciously-crafted input causes a vulnerable regex to consume exponential or polynomial CPU time. The classic shape is `(a+)+` against `aaaaaa...!`, where the engine tries every possible way to split the `a`s before giving up.

Which engines are vulnerable?

Any NFA-based backtracking engine: JavaScript, PCRE, Python `re`, Java, .NET, Ruby Onigmo. Linear-time engines (Go's RE2-derived `regexp`, Rust `regex`, Hyperscan) are immune by construction — they don't backtrack.

Has ReDoS caused real outages?

Yes. Cloudflare's global 27-minute outage in July 2019 was triggered by a vulnerable regex in a WAF rule. Stack Overflow had a 34-minute outage in 2016 caused by a quadratic trailing-whitespace regex. ReDoS shows up in CVE databases regularly — `npm audit` warnings often relate to it.

How do I test for ReDoS?

Static analyzers (`vuln-regex-detector`, `recheck`, `RegexStaticAnalysis`) emit crafted inputs designed to expose vulnerable shapes. Dynamic testing means running your regex against pathological inputs (like 30 `a`s followed by `!`) with a wall-clock timeout — if it exceeds 100ms on a tiny input, you have a bug.
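A minimal version of the dynamic test can be sketched like this (probe is an illustrative name; note it runs the regex on the main thread, so a production harness would pair it with the worker-thread cap described earlier):

```javascript
// Time one regex execution against a pathological input; flag the pattern if a
// tiny payload blows through the wall-clock budget.
function probe(re, input, budgetMs = 100) {
  const t0 = process.hrtime.bigint();
  re.test(input);
  const ms = Number(process.hrtime.bigint() - t0) / 1e6;
  return { ms, suspicious: ms > budgetMs };
}

const attack = "a".repeat(28) + "!"; // 28 chars: trivial for a sane pattern
console.log(probe(/^a+$/, attack).suspicious);    // false: microseconds
console.log(probe(/^(a+)+$/, attack).suspicious); // true: exponential backtracking
```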

Why does my regex run fine locally but timeout in production?

Two common reasons: (1) production sees attacker-crafted inputs, while local testing used normal ones; (2) production load amplifies the cost — a 50ms request becomes 5000ms when contended. ReDoS in a request handler scales worse than linearly because each blocked request holds a worker thread.

Audit your patterns

Three places to look first if you suspect a ReDoS issue:

  1. Anywhere a regex runs against user-submitted text — form validators, search bars, comment filters.
  2. WAF / firewall rules — especially patterns that match HTTP request bodies.
  3. Anywhere you imported a regex from Stack Overflow or a tutorial without testing against pathological input.

Run each candidate through the Diagnose tab. It surfaces the same shapes we covered here with severity ratings and fix hints. For the highest-stakes patterns, pair the heuristic check with a fuzz run via vuln-regex-detector before shipping.

Up next, if you haven’t already: the working developer’s introduction or lookarounds explained.