Skip to main content

Encoding and Binary Issues

How rg Detects Binary Files

rg reads a small portion of each file. If it encounters a null byte (\0), it classifies the file as binary and skips it by default.

This causes false negatives in two cases:

  1. UTF-16 encoded files — Windows-origin files insert null bytes between ASCII characters
  2. Corrupted log files — a single accidental null byte marks the whole file binary

Forcing Text Mode (-a)

# Treat ALL files as text, regardless of byte content
rg -a "pattern" .

# Safe usage: scope with a glob first
rg -a -g "*.log" "pattern" /var/log/

Handling UTF-16 Files

# Convert first, then pipe to rg
iconv -f UTF-16 -t UTF-8 windows_log.txt | rg "error"

Specifying Encoding (--encoding)

# For legacy Latin-1 encoded logs
rg --encoding latin-1 "pattern" legacy_logs/

Searching Compressed Logs

rg does not natively decompress files. Use --pre to pipe through a decompressor:

# Search gzip logs directly
zcat /var/log/syslog.*.gz | rg "ERROR"

# Or use rg's --pre preprocessor
rg --pre zcat "ERROR" /var/log/syslog.1.gz