Log Analysis

While rg is optimized for searching codebases, it is equally effective for analyzing structured and semi-structured logs.

Counting Error Frequency

# How many errors per file?
rg -c "ERROR" /var/log/

# Total errors across all logs
rg -c "ERROR" /var/log/ | awk -F: '{sum += $2} END {print sum}'
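The awk stage sums the `path:count` pairs that `rg -c` emits. A standalone check of that arithmetic, using hypothetical counts in place of real rg output:

```shell
# Simulated rg -c output (path:count); awk splits on ":" and sums field 2
printf 'app.log:3\nworker.log:7\n' | awk -F: '{sum += $2} END {print sum}'
# prints 10
```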

Filtering by Severity

# Only FATAL and ERROR lines
rg "FATAL|ERROR" app.log

# Everything EXCEPT debug
rg -v "DEBUG" app.log

# Just WARNING and above in the last 1000 lines
tail -n 1000 app.log | rg "WARNING|ERROR|FATAL"
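One caveat: a bare pattern like ERROR also matches longer tokens such as ERRORED. rg's -w flag restricts matches to whole words. A sketch on hypothetical log lines:

```shell
# -w requires word boundaries, so ERRORED no longer matches
printf 'ERRORED job retried\nERROR disk full\n' | rg -w "ERROR"
# prints: ERROR disk full
```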

Extracting IP Addresses (-o)

# All unique IPs in Nginx access log
rg -o "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" access.log | sort -u

# Top 10 IPs by request count
rg -o "^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}" access.log | sort | uniq -c | sort -nr | head -10
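The ranking tail of that pipeline is plain sort/uniq. Here is what it does to a hypothetical list of already-extracted IPs:

```shell
# uniq -c prefixes each distinct line with its count; sort -nr ranks by count
printf '10.0.0.1\n10.0.0.2\n10.0.0.1\n' | sort | uniq -c | sort -nr
# 10.0.0.1 ranks first with count 2
```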

Extracting HTTP Status Codes

# Count each status code (anchored after the quoted request field;
# a bare " [0-9]{3} " would also match sizes and other 3-digit numbers)
rg -or '$1' '" ([0-9]{3}) ' access.log | sort | uniq -c | sort -nr

Parsing JSON Logs

Many applications output newline-delimited JSON (NDJSON). Use rg to pre-filter lines, then jq to parse:

# Filter to error events, then extract the message field
rg '"level":"error"' app.log | jq -r '.message'

# Find all unique 4xx/5xx status codes
rg '"status":[45][0-9]{2}' events.log | jq '.status' | sort -u
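Putting the two stages together on a hypothetical NDJSON snippet (the field names here are assumptions; adjust the literal filter to your log schema):

```shell
# rg narrows to error lines cheaply; jq then parses only those lines
printf '%s\n' '{"level":"error","message":"disk full"}' \
              '{"level":"info","message":"started"}' \
  | rg '"level":"error"' | jq -r '.message'
# prints: disk full
```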

Time-Range Filtering

Most logs include timestamps. Use rg to scope searches to a specific hour:

# Only errors between 14:00 and 15:00
rg "2024-01-15 14:[0-9]{2}:[0-9]{2}.*ERROR" app.log

# Only requests on a specific date
rg "15/Jan/2024" access.log | rg "404"
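Character classes extend this to multi-hour windows; a sketch on hypothetical log lines, keeping only events from 14:00 through 16:59:

```shell
# 1[4-6]: matches hours 14, 15, and 16
printf '2024-01-15 13:59:59 ERROR a\n2024-01-15 15:30:00 ERROR b\n' \
  | rg "2024-01-15 1[4-6]:"
# prints only the 15:30 line
```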

Live Log Monitoring

rg reads from stdin, so it can follow a live stream. Like grep, it block-buffers its output when writing to a pipe, so pass --line-buffered when chaining further commands:

# Live tailing with rg
tail -f /var/log/app.log | rg --line-buffered "ERROR"

# The grep equivalent
tail -f /var/log/app.log | grep --line-buffered "ERROR"

# rg for batch analysis of a completed log
rg "ERROR" /var/log/app.log.1