ripgrep: High CPU usage when piping into less?
This is a weird behaviour, and may not be ripgrep’s fault, but I thought I’d mention it in case.
I piped rg into less with rg cmd | less, looked at the first page of results, left it there and continued working. After a while I noticed that my CPU usage was high: rg was using 200% of the 400% CPU available. If I run the command without less and time it, it takes around 0.5 seconds, but with less, the CPU sits at 200% continually.
If I type G then less goes to the end of the window and the rg process exits.
About this issue
- State: closed
- Created 8 years ago
- Comments: 19 (9 by maintainers)
Thanks so much for reporting this and digging into it! It was a tricky one!
It turns out that “noodling” on this problem was exactly what was needed.
I ended up figuring out how to produce this. Basically, it happens whenever you:
[…] stdout to block).
The issue here is that when one thread blocks on writing to stdout and there is no more work to be done, the rest of the worker threads spin on the multi-producer multi-consumer queue and therefore burn the CPU. (We don’t use a blocking queue because we don’t really know when directory traversal is finished.) I fixed this by inserting a very short sleep whenever a worker can’t get work from the queue. This still pegs the CPU a bit, but much, much less than before.
This also caused me to realize another shortcoming of the existing iterator. It was running searches for all direct children of a directory in the same thread. So if you ran ripgrep on a single directory with lots of files, you wouldn’t get any parallelism! Ouch.
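For readers who want to see the shape of that fix, here is a minimal sketch of workers that sleep briefly instead of spinning when the queue is empty. This is not ripgrep’s actual code: run_workers, idle_sleep and the done flag are hypothetical names, and a Mutex-guarded VecDeque stands in for the real multi-producer multi-consumer queue.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical worker pool (illustration only, not ripgrep's implementation).
/// Drains `items` units of work using `num_workers` threads and returns how
/// many units were processed.
pub fn run_workers(num_workers: usize, items: usize, idle_sleep: Duration) -> usize {
    // Stand-in for the MPMC queue: a mutex-guarded deque shared by all threads.
    let queue: Arc<Mutex<VecDeque<usize>>> = Arc::new(Mutex::new(VecDeque::new()));
    // Set by the producer once no more work will ever be pushed.
    let done = Arc::new(AtomicBool::new(false));
    let processed = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..num_workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let done = Arc::clone(&done);
            let processed = Arc::clone(&processed);
            thread::spawn(move || loop {
                // Read the flag *before* popping: if it was already set and the
                // queue is empty, no further work can arrive.
                let finished = done.load(Ordering::SeqCst);
                // The lock guard is dropped at the end of this statement.
                let item = queue.lock().unwrap().pop_front();
                match item {
                    Some(_work) => {
                        processed.fetch_add(1, Ordering::SeqCst);
                    }
                    None if finished => break,
                    // No work right now: back off briefly instead of
                    // re-polling in a tight loop and burning a core.
                    None => thread::sleep(idle_sleep),
                }
            })
        })
        .collect();

    // Producer: push all work, then signal that nothing more is coming.
    for i in 0..items {
        queue.lock().unwrap().push_back(i);
    }
    done.store(true, Ordering::SeqCst);

    for handle in handles {
        handle.join().unwrap();
    }
    processed.load(Ordering::SeqCst)
}
```

Calling run_workers(4, 1000, Duration::from_millis(1)) processes all 1000 items. Without the sleep, an idle worker would re-poll the queue in a tight loop, which is the CPU burn described above. The sleep trades a little wake-up latency for far less CPU while a worker has nothing to do.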
Here’s a slightly more complete trace from start to finish:
You can ignore the dtrace: error on enabled probe ID 2134 lines, as that is just from macOS System Integrity Protection.
grep -r cmd ./ | less doesn’t exhibit that behaviour.