The Regex Engine

rg ships with two regex engines:

Engine	Flag	Supports	Limitation
Rust `regex`	(default)	Perl-style syntax + Unicode	No lookaround or backreferences
PCRE2	`-P` / `--pcre2`	Full PCRE2 + lookarounds	Requires build feature

The Default Rust `regex` Engine

The default engine owes its speed to two properties:

Linear time: It is built on finite automata rather than backtracking, so search time grows linearly with the size of the input and no pattern can trigger "ReDoS" (Regular Expression Denial of Service).
Literal extraction: It detects literal substrings in your pattern and uses SIMD instructions to skip quickly over sections of a file that cannot match.
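
The linear-time property can be checked with a pattern that is a classic worst case for backtracking engines. This is a minimal sketch, assuming `rg` is on the PATH; the file name and the input size of 5000 characters are arbitrary choices for illustration.

```shell
# Build one line that defeats naive backtracking: many "a"s followed by a "!"
# that forces the overall match to fail.
tmpdir=$(mktemp -d)
printf '%s!\n' "$(head -c 5000 /dev/zero | tr '\0' 'a')" > "$tmpdir/evil.txt"

# Nested quantifiers such as (a+)+ are the textbook ReDoS trigger.
# The default engine should still return promptly and report no match.
if rg -q '^(a+)+$' "$tmpdir/evil.txt"; then
  redos_result="match"
else
  redos_result="no match"
fi
echo "$redos_result"
```

A backtracking engine can take time exponential in the line length on this kind of input, which is why the same pattern is risky elsewhere.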

Supported Features (Default Engine)

# Character classes
rg "[0-9]+"          # digits
rg "[[:alpha:]]+"    # POSIX alpha
rg "\w+"             # word chars (Rust: letters, digits, _)

# Anchors
rg "^ERROR"          # line starts with ERROR
rg "\.log$"          # line ends with .log

# Quantifiers
rg "fo{2,4}bar"      # "f", then 2 to 4 "o"s, then "bar"

# Unicode
rg "\p{L}+"          # any Unicode letter
rg "\p{Cyrillic}"    # Cyrillic characters
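
To see where the default engine's feature set ends, the sketch below runs one supported pattern and one lookahead against a throwaway file. The file name and its contents are made up for illustration, and `rg` is assumed to be on the PATH with no custom configuration file that changes the engine.

```shell
tmpdir=$(mktemp -d)
printf 'fobar\nfoobar\nfoooobar\nfooooobar\n' > "$tmpdir/words.txt"

# Bounded repetition is supported: only lines with 2 to 4 "o"s match.
quantifier_hits=$(rg -c --no-filename 'fo{2,4}bar' "$tmpdir/words.txt" || true)
echo "lines matched: $quantifier_hits"

# A lookahead is rejected when the pattern is compiled by the default engine:
# rg prints an error and exits with a non-zero status instead of searching.
if rg 'foo(?=bar)' "$tmpdir/words.txt" 2>/dev/null; then
  lookahead_supported="yes"
else
  lookahead_supported="no"
fi
echo "lookahead in default engine: $lookahead_supported"
```

Dropping the `2>/dev/null` shows the parser's error message, which points at the unsupported construct.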

PCRE2 Mode (`-P`)

Enable PCRE2 for lookaheads, lookbehinds, backreferences, and atomic groups:

# Find lines where "error" is NOT preceded by "no " (negative lookbehind).
# Single quotes keep interactive bash from treating "!" as history expansion.
rg -P '(?<!no )error' app.log

# Find IPv4-like addresses (\b and \d also work in the default engine, so -P is optional here)
rg -P "\b(?:\d{1,3}\.){3}\d{1,3}\b" access.log

# Extract only the value after "user_id=", using lookbehind
rg -P -o "(?<=user_id=)\d+" events.log
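
Because PCRE2 support is an optional build feature, a script may want to check for it before relying on `-P`. The following is a sketch rather than a definitive recipe; the log file and its contents are invented for the example.

```shell
tmpdir=$(mktemp -d)
printf 'login user_id=42 ok\nno error here\nfatal error: disk full\n' > "$tmpdir/events.log"

# --pcre2-version succeeds only when this rg binary was built with PCRE2.
if rg --pcre2-version >/dev/null 2>&1; then
  # Lookbehind plus -o prints just the digits that follow "user_id=".
  user_ids=$(rg -P -o '(?<=user_id=)\d+' "$tmpdir/events.log" || true)
  # Negative lookbehind skips "no error" but keeps "fatal error".
  real_errors=$(rg -P '(?<!no )error' "$tmpdir/events.log" || true)
  printf '%s\n%s\n' "$user_ids" "$real_errors"
else
  echo "this rg build has no PCRE2 support" >&2
fi
```

Guarding on the version check lets the script fall back or fail with a clear message instead of an engine error in the middle of a pipeline.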

Unicode Awareness

By default rg is fully Unicode-aware. \w matches Unicode word characters, and . matches any single codepoint, including multi-byte ones, but does not match newlines.

# Disable Unicode mode so classes like \w are ASCII-only (can be faster on ASCII-only logs)
rg --no-unicode "pattern" large_ascii.log
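
The effect of Unicode mode is easiest to see on a word containing a non-ASCII letter. This sketch assumes `rg` is on the PATH; the sample file is hypothetical, and the octal escapes `\303\251` are the UTF-8 bytes for "é".

```shell
tmpdir=$(mktemp -d)
printf 'caf\303\251\n' > "$tmpdir/menu.txt"

# Unicode mode (the default): "é" counts as a word character.
unicode_match=$(rg -o '\w+' "$tmpdir/menu.txt" || true)

# With --no-unicode, \w is ASCII-only, so the match stops before "é".
ascii_match=$(rg -o --no-unicode '\w+' "$tmpdir/menu.txt" || true)

echo "unicode: $unicode_match"
echo "ascii:   $ascii_match"
```

The difference matters for correctness as well as speed: on text that is not purely ASCII, `--no-unicode` can silently shorten or drop matches.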
