toybox: find: infinite loop when performing large amount of inplace file operations on btrfs

Background

I was building the Linux kernel in a shell environment (in fact the AOSP build system) where most utilities are provided by toybox rather than GNU. The build hung indefinitely at a postprocessing step: https://github.com/torvalds/linux/blob/a7904a538933c525096ca2ccde1e60d0ee62c08e/kernel/gen_kheaders.sh#L80 After some investigation, I found that it is related to toybox’s find utility.

MWE

Reproducible on a btrfs partition, but worked fine on ext4 in my experiments.

mkdir testdir && cd testdir
# only happens when the directory contains a large number of files
# 500 was sufficient to repro on my Arch PC, but I needed to increase it to 3000 on another Debian server
for i in {1..500}; do
    touch $(openssl rand -hex 12)  # generate empty test files with random names
done
cd ..
./toybox-x86_64 find testdir -type f -print0 | xargs -0 -n1 sed -i s/a/b/  # it hangs forever after this

Replacing ./toybox-x86_64 find with GNU findutils fixed the issue.
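
Because the failing pipeline never returns, it can help to wrap the reproduction in a time limit when comparing find implementations. The following is only a sketch and not part of the original report: the run_repro function name, the FIND variable, and the 60-second default are assumptions chosen for illustration, and it relies on a timeout command being available.

```shell
#!/bin/bash
# Hypothetical helper: run the find | xargs sed -i pipeline under a time limit.
# FIND selects the find implementation to test (assumed default: "find").
# Usage: run_repro DIR [SECONDS]
run_repro() {
    local dir=$1 limit=${2:-60}
    # $FIND_CMD is left unquoted on purpose so "./toybox-x86_64 find"
    # splits into a command and its first argument.
    FIND_CMD=${FIND:-find} DIR=$dir timeout "$limit" sh -c \
        '$FIND_CMD "$DIR" -type f -print0 | xargs -0 -n1 sed -i s/a/b/'
    local status=$?
    if [ "$status" -eq 124 ]; then
        # timeout exits with 124 when it had to stop the command
        echo "timed out after ${limit}s (possible endless traversal)"
    else
        echo "finished with status $status"
    fi
    return "$status"
}
```

For example, FIND="./toybox-x86_64 find" run_repro testdir could be compared against a plain run_repro testdir on the same directory.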

Environment

  • Toybox 0.8.6 downloaded from http://landley.net/toybox/bin
  • Arch Linux with 5.15.10-zen1-1-zen kernel (500 files)
  • Debian 11 with 5.10.0-9-amd64 kernel (3000 files)

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 35 (15 by maintainers)

Most upvoted comments

Filipe Manana submitted a kernel fix https://lore.kernel.org/linux-btrfs/c9ceb0e15d92d0634600603b38965d9b6d986b6d.1691923900.git.fdmanana@suse.com/ and I tested it against 6.4 (it applied with offset and worked fine).

Alas spinics.net is giving me an “ERR_ADDRESS_UNREACHABLE” right now but if it works for you, checking the replies to the above link will probably find it, including my mkroot test procedure.

When a kernel comes out with this in it, remind me to close this.

I’ve been cc’d on the emails about it. 😃

Merged into the btrfs maintainer’s tree, we just missed the -rc6 pull but there might be an -rc7 pull before the release: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git/log/?h=for-next

(No, I don’t know why it’s in the log twice.)

By the way, the easy workaround is probably something like (untested):

find . -print0 > file.idx
# the "_" fills $0 so that each filename arrives as $1
xargs -0 -n1 bash -c 'mv "$1" tmp; mv tmp "$1"' _ < file.idx
rm file.idx

I could do buffering inside find to work around btrfs, but you can just do the buffering yourself.
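
The "do the buffering yourself" idea can also be done in memory instead of through an index file. The following is a sketch under assumptions: the for_each_buffered name is hypothetical, and it needs bash 4.4 or later for mapfile -d ''. It reads the whole find output before running any command, so the directory is no longer being traversed while files are modified.

```shell
#!/bin/bash
# Hypothetical helper: collect the full file list first, then act on each file.
# Usage: for_each_buffered DIR COMMAND [ARGS...]
for_each_buffered() {
    local dir=$1; shift
    local -a files
    # Read every NUL-terminated name into the array; mapfile only returns
    # at end of input, so find is done before the loop below starts.
    mapfile -d '' -t files < <(find "$dir" -type f -print0)
    local f
    for f in "${files[@]}"; do
        # Run the given command with the filename appended; stop on failure.
        "$@" "$f" || return
    done
}
```

A call such as for_each_buffered testdir sed -i s/a/b/ would then stand in for the find | xargs pipeline.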

(Still might be worth sending the test case to the btrfs devs to get them to explicitly say “we don’t care what this breaks, it’s on you to work around our behavior” on record and all. I’ve done some head scratching about “maybe -newer would… no, rename doesn’t change the timestamp” and so on, but a directory traversal in btrfs is never guaranteed to end, and that’s kinda creepy. I wonder what kind of denial of service attacks you could build around that in other programs…)

Not reproducible, but this appears to be a kernel bug rather than a toybox issue.

Weird… Reproduced on 3 different machines w/ distinct OS:

  • config for Arch w/ 5.15 kernel: http://fars.ee/2aMv
  • config for Ubuntu 20.04 w/ 5.11 kernel: http://fars.ee/-XeR
  • config for Debian 11 w/ 5.10 kernel: https://termbin.com/k89ov

Thanks for the response!

Make sure find is chained with inplace file operations… Your example is reproducible on my Arch PC (5.15.10 kernel) with the following modifications:

  1. increase 500 to 2000
  2. add a simple inplace operation such as sed (the one-line perl script from the Linux build system can be used as well): ~/toybox-x86_64 find sub -type f -print0 | xargs -0 -n1 sed -i s/a/b/

P.S. The btrfs .img was stored on an ext4 partition in my test.

The issue is unrelated to the random file names; openssl was just used to generate unique filenames.

I actually considered this a btrfs bug, but I’m not sure, and I don’t know how to describe it when reporting upstream since I’m not familiar with the internal implementation of find.