the_silver_searcher: ERR: expected to read 4043309056 bytes but read 4294967295

It seems that ag fails in malloc with a really weird error when it encounters big files. The issue should be easy to replicate: just make sure you have a DVD ISO image in the search path.

ag version 1.0.2

Features:
  +jit +lzma +zlib

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 32
  • Comments: 19

Most upvoted comments

Reading the source, this problem appears to show up only when --no-mmap is set, which is the default on OSX. Running with --mmap ‘fixes’ the issue.

Also the error is a bit misleading - the code looks like:

off_t f_len = 0;
size_t bytes_read = read(fd, buf, f_len);
if ((off_t)bytes_read != f_len) {
    die("expected to read %u bytes but read %u", f_len, bytes_read);
}

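%u expects an unsigned int, which is 4 bytes wide, but off_t and size_t (and ssize_t) are all 8 bytes wide on modern OSXen, so both printed numbers are truncated to their low 32 bits. The 4294967295 is most likely read() returning -1 (an error), printed as an unsigned 32-bit value.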
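To make the truncation concrete, here is a minimal sketch. It is not ag's code: `low32` and `format_read_error` are hypothetical helpers invented for illustration, and the 8338276352-byte size mentioned below is just an example value of 2^32 + 4043309056, not the reporter's actual file size. It shows what keeping only the low 32 bits does to a 64-bit value, and one way to format the message with conversions wide enough for off_t and ssize_t.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* What a 32-bit "%u" conversion effectively shows for a 64-bit value on
 * common 64-bit ABIs: only the low 32 bits. (Strictly speaking, passing a
 * 64-bit argument to "%u" is undefined behavior in C.) */
uint32_t low32(uint64_t v) {
    return (uint32_t)v;
}

/* Hypothetical helper, not part of ag: formats the same message, casting
 * both values to long long so that "%lld" matches the argument width and
 * an error return of -1 is printed as -1. */
int format_read_error(char *out, size_t out_len, off_t expected, ssize_t got) {
    return snprintf(out, out_len,
                    "expected to read %lld bytes but read %lld",
                    (long long)expected, (long long)got);
}
```

With these, `low32((uint64_t)-1)` is 4294967295, matching the second number in the error, and `format_read_error` called with an expected size of 8338276352 and a result of -1 produces "expected to read 8338276352 bytes but read -1".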

@ssbarnea I disagree with the 1%. Searching through big files (e.g. dumps, logs and so on) is more frequent than you might think.

Plus, the original issue is that ag fails when dealing with big files.

Having to use grep instead shows something can be done to make ag work.

Still getting this, --mmap cured it but really wish I didn’t have to…

~/Projects/data-refinery $ a RSEM
ERR: expected to read 3403130363 bytes but read 4294967295
~/Projects/data-refinery $ ag -V
ag version 2.1.0

Features:
  +jit +lzma +zlib
~/Projects/data-refinery $ ag RSEM --mmap
workers/data_refinery_workers/processors/transcriptome_index.py
155:    # RSEM takes a prefix path and then all files generated by it will
247:      * Next the tool RSEM's prepare-reference is run.

I don’t care that ag couldn’t scan the huge file, but I do mind that because of that one file, I didn’t get any results at all. Handle the exception better and scan the rest!
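For what it's worth, "scan the rest" could look roughly like the sketch below. This is not ag's implementation: `read_fully` and `load_or_skip` are hypothetical names, and the 1 GiB chunk size is an arbitrary assumption chosen to keep each read() request well under any per-call limit. The idea is to read in bounded chunks and, if a file still can't be read, print a warning and move on instead of dying.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not ag's actual code. Reads up to `len` bytes from
 * `fd` into `buf`, asking read() for at most `chunk` bytes per call.
 * Returns the number of bytes read (less than `len` only at EOF), or -1 on
 * error with errno set by read(). */
ssize_t read_fully(int fd, char *buf, size_t len, size_t chunk) {
    size_t total = 0;
    while (total < len) {
        size_t want = len - total;
        if (want > chunk) {
            want = chunk;
        }
        ssize_t n = read(fd, buf + total, want);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal: retry */
            }
            return -1;
        }
        if (n == 0) {
            break; /* EOF before `len` bytes arrived */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}

/* Hypothetical caller: returns 1 if the whole file was loaded, or 0 after
 * printing a warning so the search can continue with the next file. */
int load_or_skip(const char *path, int fd, char *buf, size_t len) {
    ssize_t got = read_fully(fd, buf, len, (size_t)1 << 30);
    if (got < 0 || (size_t)got != len) {
        fprintf(stderr, "WARN: skipping %s: read %lld of %llu bytes\n",
                path, (long long)got, (unsigned long long)len);
        return 0;
    }
    return 1;
}
```

A caller that loops over files would check the return value of `load_or_skip` and simply continue to the next path on 0, which also gives the "tell me what was skipped" message people are asking for in this thread.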

I’m seeing similar behavior on version 1.0.2 installed via Brew. The issue went away when I moved a 4.4GB file out of the search folder.

Running with the --mmap flag fixed it for me, but that’s kind of a workaround?

Incidentally, I also disagree with the 1% assertion above. I want to know I’m searching ALL the files in the directory tree I’m searching. At the very least there should be a message telling me which files are being skipped. Completely frustrating to think you’ve checked exhaustively only to discover later that a big file had what you were looking for…

I’m seeing the same thing, and can confirm that this does not happen when I move large files out of the search folder.

ag version 1.0.2

Features:
  +jit +lzma +zlib