The Regex Engine
rg supports two regex engines:
| Engine | Flag | Supports | Limitation |
|---|---|---|---|
| Rust regex | (default) | Most ERE + Unicode | No lookaheads/lookbehinds |
| PCRE2 | -P / --pcre2 | Full PCRE2 + lookarounds | Requires build feature |
The Default Rust regex Engine
The default engine owes its speed to two properties:
- Linear time: Search time grows linearly with the size of the input. The engine never backtracks exponentially, so a pathological pattern cannot trigger "ReDoS" (Regular Expression Denial of Service).
- Literal extraction: It detects literal substrings that every match must contain and uses SIMD instructions to skip sections of a file that cannot match, often at close to memory-bandwidth speed.
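The literal-extraction idea can be sketched in a few lines of Python. This is only an illustration of the principle, not ripgrep's implementation: the function name `search_with_literal_prefilter` and the caller-supplied `required_literal` are hypothetical, and a plain substring test stands in for the SIMD-accelerated search.

```python
import re


def search_with_literal_prefilter(lines, pattern, required_literal):
    """Return (line_number, line) pairs for lines that match `pattern`.

    `required_literal` is assumed to be a substring that every possible
    match of `pattern` contains. Here the caller supplies it by hand; a
    real engine would derive it from the pattern automatically.
    """
    regex = re.compile(pattern)
    results = []
    for number, line in enumerate(lines, start=1):
        # Cheap substring check first: lines without the literal cannot match.
        if required_literal not in line:
            continue
        # Only candidate lines pay for the full regex search.
        if regex.search(line):
            results.append((number, line))
    return results


log_lines = ["INFO start", "ERROR code=42", "ERROR code=none", "DEBUG code=7"]
hits = search_with_literal_prefilter(log_lines, r"ERROR code=\d+", "ERROR code=")
print(hits)
```

In this sample the regex runs only on the two lines that contain the literal; the other two are rejected by the substring check alone.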
Supported Features (Default Engine)
# Character classes
rg "[0-9]+" # digits
rg "[[:alpha:]]+" # POSIX alpha
rg "\w+" # word chars (Rust: letters, digits, _)
# Anchors
rg "^ERROR" # line starts with ERROR
rg "\.log$" # line ends with .log
# Quantifiers
rg "fo{2,4}bar" # "f", then 2–4 "o"s, then "bar"
# Unicode
rg "\p{L}+" # any Unicode letter
rg "\p{Cyrillic}" # Cyrillic characters
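The anchors and the bounded quantifier are easy to misread, so here is a quick check of what they accept. Python's `re` module is used purely as a stand-in because it treats these particular constructs the same way; it is not the engine rg uses, and the sample strings are made up for illustration.

```python
import re

# Bounded quantifier: "f", then two to four "o"s, then "bar".
bounded = re.compile(r"fo{2,4}bar")
candidates = ["fobar", "foobar", "foooobar", "fooooobar"]
quantifier_hits = [s for s in candidates if bounded.search(s)]

# "^" anchors the match to the start of the line.
starts_with_error = [
    s for s in ["ERROR: disk full", "an ERROR occurred"]
    if re.search(r"^ERROR", s)
]

# "\." is a literal dot and "$" anchors the match to the end of the line.
ends_with_log = [
    s for s in ["app.log", "app.log.1", "applog"]
    if re.search(r"\.log$", s)
]

print(quantifier_hits)
print(starts_with_error)
print(ends_with_log)
```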
PCRE2 Mode (-P)
Pass -P (or --pcre2) to switch to PCRE2 when a pattern needs lookaheads, lookbehinds, backreferences, or atomic groups:
# Find lines where "error" is NOT preceded by "no " (negative lookbehind)
# Single quotes keep an interactive shell from treating "!" as history expansion
rg -P '(?<!no )error' app.log
# Find IPv4-shaped strings (octet ranges are not validated); \b and \d also work in the default engine, so -P is optional here
rg -P "\b(?:\d{1,3}\.){3}\d{1,3}\b" access.log
# Extract only the value after "user_id=", using lookbehind
rg -P -o "(?<=user_id=)\d+" events.log
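To see what the two lookbehind patterns select, the sketch below runs them through Python's `re` module. That module is not PCRE2, but it accepts the same syntax for these fixed-width lookbehinds. The sample log lines are invented for illustration.

```python
import re

lines = [
    "error: disk full",
    "no error found",
    "fatal error in module",
]

# Negative lookbehind: keep lines where "error" is not preceded by "no ".
not_negated = [line for line in lines if re.search(r"(?<!no )error", line)]

# Positive lookbehind: "user_id=" must precede the digits but is not part
# of the match, which is why -o prints only the number.
event = "ts=1 user_id=42 action=login user_id=7"
user_ids = re.findall(r"(?<=user_id=)\d+", event)

print(not_negated)
print(user_ids)
```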
Unicode Awareness
By default, rg is fully Unicode-aware: \w matches Unicode word characters, and . matches any single Unicode codepoint (including multi-byte ones) but never a newline.
# Disable Unicode mode so classes such as \w, \d, and \s match ASCII only (can be faster on ASCII-only logs)
rg --no-unicode "pattern" large_ascii.log
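The difference between Unicode-aware and ASCII-only matching can be illustrated with Python's `re` module, whose `re.ASCII` flag is loosely analogous to `--no-unicode`. This is an analogy under that assumption, not a reproduction of rg's behavior in every detail.

```python
import re

text = "na\u00efve caf\u00e9"  # "naïve café" with precomposed accented letters

# Unicode-aware: accented letters count as word characters.
unicode_words = re.findall(r"\w+", text)

# ASCII-only: accented letters split the words apart.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

# "." matches one codepoint, even when it takes several bytes in UTF-8.
e_acute = "\u00e9"
dot_matches_codepoint = re.fullmatch(r".", e_acute) is not None
utf8_length = len(e_acute.encode("utf-8"))

print(unicode_words, ascii_words, dot_matches_codepoint, utf8_length)
```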