bitcoin: fuzz: how to deal with fuzz input invalidation?
Existing fuzz seeds are invalidated (i.e. may yield less code coverage) whenever the fuzz input “format” changes. Generally, this can happen whenever the behaviour of any executed code changes.
It would be nonsensical to hold back on changing Bitcoin Core code (e.g. validation code) to not invalidate fuzz inputs. Thus, we need to run a fuzz engine 24/7 to adapt the fuzz inputs to the latest code.
While it is possible to avoid some seed invalidations when fuzz code (or other test code) is changed, I think that, given we already have to anticipate input invalidation, we don’t need to spend resources on avoiding fuzz input invalidation when fuzz code (or other test code) changes.
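To make the invalidation mechanism concrete, here is a hedged sketch using LLVM’s `FuzzedDataProvider` (the `flag`/`amount` variables are hypothetical and not taken from any actual Bitcoin Core target): inserting a single new consumption call changes how every existing seed is decoded.

```cpp
#include <fuzzer/FuzzedDataProvider.h>

#include <cstddef>
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    FuzzedDataProvider provider{data, size};
    // Version 1 of this target started with:
    //   const auto amount = provider.ConsumeIntegral<uint32_t>();
    // Version 2 inserts a new call in front of it:
    const bool flag = provider.ConsumeBool();
    const auto amount = provider.ConsumeIntegral<uint32_t>();
    // The new call claims bytes that version 1 handed to ConsumeIntegral,
    // so seeds collected for version 1 now decode to different values and
    // may no longer reach the same code paths.
    (void)flag;
    (void)amount;
    return 0;
}
```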
About this issue
- State: closed
- Created 3 years ago
- Comments: 15 (15 by maintainers)
I looked into something like this, but concluded it was not worth it. While it is cheap to implement, adding new cases is rarer than removing existing cases. And removing existing cases would still break the inputs with this method, just like before.
For reference, with targets having increased in complexity (`rpc`, `banman`, …) this is no longer true.

As suggested by @MarcoFalke in #20828, the following comment I wrote in https://github.com/bitcoin/bitcoin/pull/20828#issuecomment-759634453 about avoiding unnecessary invalidation might be of interest in this thread:
When using the `switch (fuzzed_data_provider.ConsumeIntegralInRange<int>(0, N))` idiom it is worth noting that every time `N` is increased by one to accommodate a new `case`, all existing seeds are invalidated.

Example (all using the same fixed seed input in the form of a finite scream, `AAAAAAAAAAAAAAAAAAAAAAAAA`):

One trick that can be used to tackle this is to choose from a larger range such as `[0, 32]` even if we only have, say, 12 `case` labels (`[0, 11]`). The non-matching numbers `[12, 32]` will simply be “ignored” by the coverage-guided fuzzer.

That way we’ll only have to invalidate existing seeds when we’ve exhausted the entire range `[0, 32]` with matching `case` labels. Then we can bump to `[0, 64]`, and so on.

In other words: instead of invalidating every time we add a new `case`, we’ll only be invalidating every some-power-of-2:th time we add a new `case`. :)
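A minimal sketch of this over-provisioned-range idiom, assuming LLVM’s `FuzzedDataProvider` and a libFuzzer-style entry point (the three operations are hypothetical placeholders, not Bitcoin Core code):

```cpp
#include <fuzzer/FuzzedDataProvider.h>

#include <cstddef>
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    FuzzedDataProvider fuzzed_data_provider{data, size};
    // Draw from [0, 32] even though only cases 0..2 exist today. Adding
    // case 3 later does not change how the input is consumed, so existing
    // seeds keep their meaning.
    switch (fuzzed_data_provider.ConsumeIntegralInRange<int>(0, 32)) {
    case 0:
        // ... exercise operation A ...
        break;
    case 1:
        // ... exercise operation B ...
        break;
    case 2:
        // ... exercise operation C ...
        break;
    default:
        // Values in [3, 32] are "ignored"; the coverage-guided fuzzer
        // learns that they reach no new code.
        break;
    }
    return 0;
}
```

As noted earlier in the thread, this only helps for additions: removing a `case` still renumbers the remaining labels and breaks existing inputs.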
Fair enough. Though, those trivial cases should be rare enough not to yield any benefit, given the overhead of having to remember when and how to invalidate them at a later point in time. Except for `script_flags` and `process_messages`, all remaining 150 fuzz targets can reach the current coverage with just minutes of CPU time. (Citation/benchmark needed.)

Sounds like a good idea. I was thinking about passing `-max_total_time` to the fuzz engine even when run in CI mode, but all fuzz targets take a different time, so it would be nicer if there were a `-additional_fuzz_time_after_init` option. The generated corpus would be lost as soon as the CI box cycles, but that doesn’t hurt, because the coverage will quickly be re-generated by our fuzz farms once the pull is merged.
Maybe it is good enough to just set `-max_total_time=30` and thus skip generation if the existing seeds take longer than that to read.
and thus skip generation if the existing seeds take longer than that to read.Running the fuzzers should be straightforward: https://github.com/bitcoin/bitcoin/blob/master/doc/fuzzing.md . Also, the concept of fuzzing isn’t really that hard. However, the Bitcoin Core code base isn’t “optimized” for fuzzing (e.g. globals, background threads, non-mockable disk access, net processing dealing also with deserialization …), so working on fuzzers for Bitcoin Core sometimes becomes tricky. For general guidelines on writing fuzzers, you can take a look at https://github.com/google/fuzzing/blob/master/docs/intro-to-fuzzing.md
Correct. I think we should aim to write fuzzers that can explore the search space from scratch in a reasonable amount of time. As mentioned before, this isn’t the case for all Bitcoin Core fuzz targets: there are plenty of hashes, checksums, signatures, PoW checks, serialization specifics, etc. that make it harder to find good fuzz inputs. Code coverage reports help in finding weak spots, and specifically crafted seeds (either written manually or extracted from real-world data such as the block files or socket buffers) can then help to increase coverage.
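As an illustration of the checksum problem, here is a hedged sketch (both `ToyChecksum` and `ProcessMessage` are invented for this example and are not Bitcoin Core APIs): rather than making the fuzzer guess a valid checksum, the harness computes it over the mutated payload so the parsing code behind the check stays reachable.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy checksum, for illustration only.
static uint32_t ToyChecksum(const std::vector<uint8_t>& payload)
{
    uint32_t sum = 0;
    for (const uint8_t b : payload) sum = sum * 31 + b;
    return sum;
}

// Hypothetical checksum-guarded handler standing in for real message
// processing; a fuzzer that has to guess `checksum` byte-for-byte rarely
// gets past the early return.
static bool ProcessMessage(const std::vector<uint8_t>& payload, uint32_t checksum)
{
    if (checksum != ToyChecksum(payload)) return false;
    // ... parsing logic we actually want to exercise ...
    return !payload.empty() && payload[0] == 0x01;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    const std::vector<uint8_t> payload(data, data + size);
    // Fix up the checksum in the harness instead of consuming it from the
    // fuzz input, so every mutation reaches the code behind the check.
    ProcessMessage(payload, ToyChecksum(payload));
    return 0;
}
```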
It is a pity if those seeds get lost due to invalidation, and I think our best bet is to rely on the fuzz engine to translate them for us where possible. For example, the breakage caused by commit fa0f415 should be fixable with some trivial bit flips or similar.