Parallel Search

Much of rg's speed comes from its architecture: directory traversal and searching both run across multiple threads.

How It Works Internally

When you run rg "pattern" /var/www:

  1. A directory traversal thread walks the directory tree and pushes files onto a shared work queue.
  2. A pool of N search threads pulls files from the queue and searches them concurrently.
  3. Results are collected and output in a streaming fashion.
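The three steps above can be imitated with ordinary tools. The sketch below is only an analogy built from find, xargs, and grep, not rg's actual implementation: find plays the traversal role, and xargs -P plays the pool of search workers. The helper name parallel_grep and the demo files are hypothetical, and -print0, -0, and -P are GNU/BSD extensions rather than POSIX.

```shell
# Rough analogy only: find, xargs, and grep arranged as one producer feeding
# several parallel consumers. This is NOT rg's implementation.
set -eu

# Hypothetical helper: search DIR for PATTERN with up to WORKERS grep processes.
parallel_grep() {
  pg_pattern=$1
  pg_dir=$2
  pg_workers=${3:-4}
  # Producer: find walks the tree and emits NUL-delimited file paths.
  # Consumers: xargs -P keeps up to $pg_workers grep processes running at once,
  # handing each one a batch of up to 16 files.
  # "|| true" because xargs exits non-zero when any batch has no match.
  find "$pg_dir" -type f -print0 |
    xargs -0 -P "$pg_workers" -n 16 grep -H -- "$pg_pattern" || true
}

# Demo on a throwaway directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
printf 'alpha\nneedle here\n' > "$demo_dir/a.txt"
printf 'nothing\n' > "$demo_dir/b.txt"
printf 'another needle\n' > "$demo_dir/sub/c.txt"

# Output order depends on which worker finishes first, hence the sort.
parallel_grep "needle" "$demo_dir" 2 | sort
rm -rf "$demo_dir"
```

The demo prints one line for each of the two files that contain "needle". As with rg, the unsorted order is not guaranteed once more than one worker is running.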

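By default (--threads 0), rg picks the thread count itself using heuristics based on the number of logical CPU cores available (the figure nproc reports), so the exact count may not equal the core count on every machine.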

Checking Thread Behavior

# Set the thread count explicitly (rg does not print the count it uses)
rg --threads 1 "pattern" . # single-threaded (baseline)
rg --threads 4 "pattern" . # 4 threads
rg "pattern" . # default (thread count chosen automatically)

When Fewer Threads Is Better

On spinning HDDs: Multiple threads reading different parts of the disk at the same time force the head to seek back and forth, which can cause thrashing. A single thread doing mostly sequential reads is often faster.

# Better for HDDs
rg --threads 1 --mmap "pattern" /mnt/hdd-mount/

On production servers: If rg is part of a monitoring script that runs every 60 seconds, limiting threads helps keep it from starving web server worker processes of CPU.

# Throttled search for cron jobs
rg --threads 2 -c "ERROR" /var/log/ >> /var/log/error-summary.log

Sorting Output

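Because rg is multi-threaded, the order in which files appear in the output is non-deterministic. If you need reproducible output (e.g. for diffing or CI), sort it. Note that --sort currently forces rg to run in a single thread, so a sorted search gives up the parallel speedup.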

# Sort results by filename
rg "TODO" --sort path

# Sort by last modified time
rg "TODO" --sort modified

# Available sort keys: none (default), path, modified, accessed, created
# Use --sortr with the same keys for descending order

Measuring Speed

Benchmark your own searches to see if threading helps:

# Baseline: a single thread
time rg --threads 1 "pattern" /large/codebase/

# Full parallelism
time rg "pattern" /large/codebase/
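To compare more than two settings, the two commands above can be wrapped in a loop. The following is a minimal sketch, assuming bash (it relies on the time keyword and TIMEFORMAT). The helper name time_search and the PATTERN and TARGET variables are placeholders introduced here, not part of rg. A warm-up run comes first because the first search may be dominated by reading files from disk rather than by searching.

```shell
#!/usr/bin/env bash
# Sketch: time one search at several thread counts (bash-specific).
# PATTERN and TARGET are placeholders; substitute your own values.
PATTERN=${PATTERN:-pattern}
TARGET=${TARGET:-.}

# Hypothetical helper: print the wall-clock seconds for one run at N threads.
time_search() {
  local threads=$1
  local TIMEFORMAT='%R'   # report elapsed (real) time only
  # Discard rg's own output; keep only the timing that `time` writes to stderr.
  { time rg --threads "$threads" -- "$PATTERN" "$TARGET" >/dev/null 2>&1; } 2>&1
}

if command -v rg >/dev/null 2>&1; then
  time_search 1 >/dev/null    # warm-up run so later runs hit the OS file cache
  for n in 1 2 4 8; do
    printf 'threads=%s  %ss\n' "$n" "$(time_search "$n")"
  done
else
  echo "rg not found on PATH; nothing to benchmark" >&2
fi
```

Run it as, for example, PATTERN=TODO TARGET=/large/codebase/ bash bench.sh (the script name is arbitrary). If the times stop improving beyond some thread count, the search is probably limited by disk or memory bandwidth rather than by CPU.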