ripgrep: ripgrep much slower than (rip)grep
What version of ripgrep are you using?
ripgrep 11.0.1 -SIMD -AVX (compiled) +SIMD -AVX (runtime)
How did you install ripgrep?
pacman -S ripgrep
What operating system are you using ripgrep on?
Arch Linux
Describe your question, feature request, or bug.
ripgrep is 3-4-5 times slower than GNU grep on simple string pattern.
If this is a bug, what is the actual behavior?
grep file\.php -r . 0.19s user 0.10s system 98% cpu 0.298 total
rg (--no-ignore) file\.php 3.68s user 0.45s system 365% cpu 1.129 total
rg --debug --hidden --no-ignore file\.php 6.12s user 0.42s system 383% cpu 1.706 total
DEBUG|grep_regex::literal|grep-regex/src/literal.rs:115: required literal found: "file"
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:435: built glob set; 0 literals, 0 basenames, 11 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
As I tried to find one-by-one what top-level file (or directory) causes the slowdown, I came to the conclusion: none. Ripgrep should be faster… So I did this table: (Sorry, for hiding the names, I’m not sure if I’m allowed to show them.)
#!/bin/sh
$ echo GREPWHAT=$GREPWHAT RGFLAGS=$RGFLAGS GREPFLAGS=$GREPFLAGS; find -maxdepth 1 | xargs -I{} sh -c 'echo $(echo {} | tr a-z x):$(du -sh {} | cut -f1):$(find {} | wc -l):$( ( time rg "$GREPWHAT" $RGFLAGS {} &>/dev/null ) 2>&1 | tr \\t\\n \ ):$( ( time find -mindepth 1 -maxdepth 1 -not -path {} | xargs rg "$GREPWHAT" $RGFLAGS &>/dev/null ) 2>&1 | tr \\t\\n \ ):$( ( time grep "$GREPWHAT" $GREPFLAGS -r {} &>/dev/null ) 2>&1 | tr \\t\\n \ ):$( ( time find -mindepth 1 -maxdepth 1 -not -path {} | xargs grep "$GREPWHAT" $GREPFLAGS -r &>/dev/null ) 2>&1 | tr \\t\\n \ )' | column -ts: -Rsize,files,rgtime,rgwothistime,greptime,grepwothistime -Nname,size,files,rgtime,rgwothistime,greptime,grepwothistime
GREPWHAT=xxxxxxxxxxxxxxxxxxx.xxx RGFLAGS=-j1 GREPFLAGS=
name size files rgtime rgwothistime greptime grepwothistime
. 127M 12961 real 0m0.964s user 0m3.295s sys 0m0.348s real 0m0.971s user 0m3.285s sys 0m0.388s real 0m0.146s user 0m0.039s sys 0m0.100s real 0m0.148s user 0m0.069s sys 0m0.078s
./xx 4.0K 1 real 0m0.009s user 0m0.006s sys 0m0.003s real 0m0.919s user 0m3.097s sys 0m0.406s real 0m0.003s user 0m0.000s sys 0m0.003s real 0m0.146s user 0m0.055s sys 0m0.089s
./_xxx_xxxxxx.xxx 428K 1 real 0m0.011s user 0m0.008s sys 0m0.004s real 0m0.936s user 0m3.123s sys 0m0.388s real 0m0.004s user 0m0.004s sys 0m0.000s real 0m0.144s user 0m0.043s sys 0m0.100s
./xxxxxx 89M 11741 real 0m0.217s user 0m0.576s sys 0m0.211s real 0m0.221s user 0m0.688s sys 0m0.109s real 0m0.106s user 0m0.036s sys 0m0.064s real 0m0.051s user 0m0.023s sys 0m0.029s
./xxxxx 16K 7 real 0m0.010s user 0m0.009s sys 0m0.004s real 0m0.942s user 0m3.101s sys 0m0.418s real 0m0.003s user 0m0.003s sys 0m0.000s real 0m0.147s user 0m0.049s sys 0m0.098s
./xxxxxxx 52K 28 real 0m0.014s user 0m0.015s sys 0m0.004s real 0m0.946s user 0m3.148s sys 0m0.418s real 0m0.003s user 0m0.003s sys 0m0.000s real 0m0.152s user 0m0.048s sys 0m0.099s
./xxxxxx.xxx 4.0K 1 real 0m0.008s user 0m0.006s sys 0m0.003s real 0m0.929s user 0m3.113s sys 0m0.400s real 0m0.003s user 0m0.000s sys 0m0.003s real 0m0.143s user 0m0.051s sys 0m0.091s
./xxxxxx 20K 5 real 0m0.012s user 0m0.015s sys 0m0.000s real 0m0.926s user 0m3.114s sys 0m0.421s real 0m0.003s user 0m0.000s sys 0m0.003s real 0m0.144s user 0m0.038s sys 0m0.105s
./xxxxxxxxx 248K 60 real 0m0.013s user 0m0.015s sys 0m0.005s real 0m0.939s user 0m3.139s sys 0m0.384s real 0m0.004s user 0m0.000s sys 0m0.004s real 0m0.151s user 0m0.056s sys 0m0.095s
./xxxxxx.xx 4.0K 1 real 0m0.006s user 0m0.000s sys 0m0.006s real 0m0.937s user 0m3.176s sys 0m0.413s real 0m0.004s user 0m0.004s sys 0m0.000s real 0m0.145s user 0m0.055s sys 0m0.089s
./xxxxxx 35M 522 real 0m0.068s user 0m0.138s sys 0m0.077s real 0m0.227s user 0m0.639s sys 0m0.211s real 0m0.036s user 0m0.025s sys 0m0.011s real 0m0.114s user 0m0.031s sys 0m0.082s
./xxxx.xxx 4.0K 1 real 0m0.006s user 0m0.006s sys 0m0.001s real 0m0.933s user 0m3.123s sys 0m0.408s real 0m0.003s user 0m0.000s sys 0m0.003s real 0m0.150s user 0m0.045s sys 0m0.104s
./xxxxxxx.xxx 4.0K 1 real 0m0.007s user 0m0.001s sys 0m0.006s real 0m0.941s user 0m3.139s sys 0m0.409s real 0m0.003s user 0m0.003s sys 0m0.000s real 0m0.145s user 0m0.072s sys 0m0.072s
./xxxxx.xxx 4.0K 1 real 0m0.008s user 0m0.004s sys 0m0.004s real 0m0.957s user 0m3.127s sys 0m0.419s real 0m0.004s user 0m0.004s sys 0m0.000s real 0m0.154s user 0m0.048s sys 0m0.101s
./xxxxxxxx 276K 52 real 0m0.013s user 0m0.016s sys 0m0.000s real 0m1.037s user 0m2.956s sys 0m0.385s real 0m0.004s user 0m0.004s sys 0m0.000s real 0m0.147s user 0m0.036s sys 0m0.109s
./xxxxxx 128K 34 real 0m0.012s user 0m0.010s sys 0m0.005s real 0m0.934s user 0m3.142s sys 0m0.398s real 0m0.003s user 0m0.000s sys 0m0.003s real 0m0.146s user 0m0.062s sys 0m0.083s
./xxxxxxxx.xxxx 212K 1 real 0m0.009s user 0m0.005s sys 0m0.004s real 0m0.924s user 0m3.123s sys 0m0.362s real 0m0.004s user 0m0.000s sys 0m0.003s real 0m0.147s user 0m0.063s sys 0m0.083s
./xxxxxxxx.xxxx 4.0K 1 real 0m0.008s user 0m0.003s sys 0m0.005s real 0m0.928s user 0m3.067s sys 0m0.422s real 0m0.004s user 0m0.004s sys 0m0.000s real 0m0.151s user 0m0.057s sys 0m0.088s
./xxxxxx 1.2M 294 real 0m0.018s user 0m0.027s sys 0m0.008s real 0m0.852s user 0m2.761s sys 0m0.476s real 0m0.007s user 0m0.000s sys 0m0.007s real 0m0.141s user 0m0.033s sys 0m0.108s
./xxxxxxxxx 12K 5 real 0m0.012s user 0m0.016s sys 0m0.000s real 0m0.987s user 0m2.797s sys 0m0.365s real 0m0.003s user 0m0.000s sys 0m0.003s real 0m0.148s user 0m0.038s sys 0m0.109s
./xxxxxxx 4.0K 1 real 0m0.005s user 0m0.000s sys 0m0.005s real 0m1.074s user 0m2.989s sys 0m0.386s real 0m0.005s user 0m0.004s sys 0m0.000s real 0m0.143s user 0m0.057s sys 0m0.085s
./xxx 764K 198 real 0m0.014s user 0m0.011s sys 0m0.012s real 0m0.970s user 0m2.792s sys 0m0.481s real 0m0.005s user 0m0.005s sys 0m0.000s real 0m0.201s user 0m0.065s sys 0m0.133s
./.xxxxxxxx.xxxx.xxx 44K 1 real 0m0.007s user 0m0.003s sys 0m0.003s real 0m1.393s user 0m2.965s sys 0m0.569s real 0m0.004s user 0m0.000s sys 0m0.003s real 0m0.205s user 0m0.079s sys 0m0.119s
./.xxxxxxxxx 4.0K 1 real 0m0.007s user 0m0.001s sys 0m0.006s real 0m1.140s user 0m2.975s sys 0m0.556s real 0m0.004s user 0m0.000s sys 0m0.004s real 0m0.156s user 0m0.051s sys 0m0.099s
./.xxxxxxxxxxxxx 4.0K 1 real 0m0.006s user 0m0.003s sys 0m0.003s real 0m0.954s user 0m3.055s sys 0m0.484s real 0m0.003s user 0m0.003s sys 0m0.000s real 0m0.149s user 0m0.070s sys 0m0.078s
./.xxx.xxxxxxx 4.0K 1 real 0m0.007s user 0m0.004s sys 0m0.003s real 0m0.956s user 0m3.130s sys 0m0.402s real 0m0.003s user 0m0.003s sys 0m0.000s real 0m0.149s user 0m0.068s sys 0m0.076s
$ # Note that the sum of ripgrep times doesn't match with . time.
$ # wothis means: grepping in the directory with that file excluded.
$ time tokei
-------------------------------------------------------------------------------
Language Files Lines Code Comments Blanks
-------------------------------------------------------------------------------
...
-------------------------------------------------------------------------------
Total 9256 1221713 827992 262436 131285
-------------------------------------------------------------------------------
real 0m1.061s
user 0m3.384s
sys 0m0.295s
$ # Note that tokei has similar running time.
$ stat -f . | head -3
File: "."
ID: 0 Namelen: 255 Type: tmpfs
Block size: 4096 Fundamental block size: 4096
$ # Fast as hell tmpfs. Does it count?
Tomorrow I will make a line length statistics if it counts, but I think it should be an issue around directory listing. Anyway, I would like to help, so tell me what to do/share. --debug didn’t help too much.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 38 (36 by maintainers)
Wonderful, thank you. I can finally repro it. This is a very interesting bug. To exaggerate it a bit, I copied your long-line file so that there was 120MB on a single line:
To establish a baseline and get my bearings, I just tried poking at that single file to see how ripgrep behaves compared to grep:
The first command is a bit deceptive, because ripgrep will open the file using memory maps, which avoids the whole buffering strategy completely. So ripgrep zips through it without needing to care about line terminators and such. My guess is that this probably added to some of your confusion when trying to get a smaller reproduction, but I forgot about it. Once we disable memory maps, we can see that ripgrep slows down a bit, but it’s still 2x faster than grep.
Another curiosity here is that the maximum memory used by grep is ~2x more than ripgrep. My guess here is that this is because grep is using a larger buffer size, but I’m not 100% on this.
However, things start to change once we try to search multiple files at once. Notably, this does not do directory traversal or use parallelism, so we rule that out as the cause:
The first two commands confirm that using memory maps “fixes” this problem, so it’s really starting to look like very long lines are the issue. Despite the fact that grep was 2x slower searching just
longline, it winds up being noticeably faster in this particular search. For completeness, we verify that2*isn’t causing any issues on its own:So that seems fine. Digging into this a bit more, running
perf recordwith the--call-graphoption enabled lets us track more precisely why things are being slow. Runninglooks fairly uninteresting and about what I’d expect. Most time is being spent in resizing the buffer allocation to accommodate the large line, zeroing the memory and running
memchron it for the actual search.However, changing the command to
reveals that 90% of the time is being spent in
__memmove_avx_unaligned_erms, and moreover, all of that time is coming from within ripgrep’sLineBuffer::rollmethod. Now that’s strange, but I think it’s also consistent with everything you’ve observed so far. In other words, we have two pieces that, on their own, are fine, but when combined, something perverse is happening.My instinct tells me that something is going wrong with reusing the allocation. In particular, if a file like
longlinegrows the allocation for the line buffer to be very large, and that same buffer is reused for much smaller files, then perhaps whenever those files need to roll the buffer, they wind up paying a much larger cost for shifting things around.Hmmm… yes… this is starting to make sense. In the
rollimplementation, we have this call:which effectively moves everything at
self.posand after it back to the beginning of the buffer. If this line buffer had grown very large—due to a file with very long lines—then this is going to do a ton of copying.But copying everything after
self.posis not necessary. All we need to do is copy anything that has been written afterself.posback to the beginning. This is exactly the rangeself.pos..self.end, where any bytes afterself.endare just treated as free space that the buffer can use on the next read. So, let’s try changing the code toAnd re-running the command
Voilà! To pop back and make sure everything works across the entire directory:
So I think that fixes it. 😃 I think this explains everything. Even your changes to directory traversal, and the otherwise inconsistent results you were seeing. Namely, if the file with a very long line is searched after most other files, then the performance impact will be small, since the large buffer will be used for a comparatively small number of files. So the only way to observe this is to be in a situation where searching the file with a long line occurs earlier in the search.
This was an excellent bug. Thank you so much for sticking with it and providing a reproduction that I could dig into. This is now fixed on master!
For grins, can you try searching for
qqqqqqqqqq? (I am trying to rule out pathological behavior in the literal searcher. ripgrep uses a different one from grep. Searching for a pattern with the same rare letter repeated should hopefully put these on a level playing field. Again, just a shot in the dark.)Right yeah. Those shouldn’t be happening though when running with
-uuu. If they are, then that’s a bug.Thanks for really digging into this! I’ll be quite curious to see what you find!
Yes, of course, but results have to taken with care, because with “complex” regexp queries ripgrep can easily beat grep even if it’s slowed down by this issue. But generally speaking yes, if I grep for regexp-less or queries with moderated amount of regexps I get similar times.
In the mean time, I created some simple projects to test
ignoreandwalkdircrates. Probably You do, but I didn’t think of it before, thatopenatandstaterrors come from nonexistent.ignoreand.git… etc files.UPDATE: And futex errors come from
ignorecrate parallel walking.UPDATE2: Walking the directory: 0.150s; parallel: highly varies between 0.125-0.170s. I will try making some changes in the parallel walk code of
ignoreto see if some changes.Oops, sorry. What was in my mind is that they don’t have to be printed, so they don’t have to be stored. Silly thing.