bitcoin: fuzz: how to deal with fuzz input invalidation?
Existing fuzz seeds are invalidated (i.e. may yield less code coverage) whenever the fuzz input “format” changes. Generally, this can happen whenever the behaviour of any executed code changes.
It would be nonsensical to hold back on changing Bitcoin Core code (e.g. validation code) to not invalidate fuzz inputs. Thus, we need to run a fuzz engine 24/7 to adapt the fuzz inputs to the latest code.
While it is possible to avoid some seed invalidations when fuzz code (or other test code) is changed, I think that, given we already have to anticipate input invalidation, we don’t need to spend resources on avoiding fuzz input invalidation when fuzz code (or other test code) changes.
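To make the invalidation mechanism concrete, here is a hedged sketch using LLVM’s `FuzzedDataProvider` (the `flag`/`amount` variables are hypothetical and not taken from any actual Bitcoin Core target): inserting a single new consumption call changes how every existing seed is decoded.

```cpp
#include <fuzzer/FuzzedDataProvider.h>

#include <cstddef>
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    FuzzedDataProvider provider{data, size};
    // Version 1 of this target started with:
    //   const auto amount = provider.ConsumeIntegral<uint32_t>();
    // Version 2 inserts a new call in front of it:
    const bool flag = provider.ConsumeBool();
    const auto amount = provider.ConsumeIntegral<uint32_t>();
    // The new call claims bytes that version 1 handed to ConsumeIntegral,
    // so seeds collected for version 1 now decode to different values and
    // may no longer reach the same code paths.
    (void)flag;
    (void)amount;
    return 0;
}
```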
About this issue
- State: closed
- Created 3 years ago
- Comments: 15 (15 by maintainers)
I looked into something like this, but concluded it was not worth it. While it is cheap to implement, adding new cases is rarer than removing existing cases. And removing existing cases would still break the inputs with this method, just like before.
For reference, with targets having increased in complexity (`rpc`, `banman`, …) this is no longer true.

As suggested by @MarcoFalke in #20828, the following comment I wrote in https://github.com/bitcoin/bitcoin/pull/20828#issuecomment-759634453 about avoiding unnecessary invalidation might be of interest in this thread:
When using the `switch (fuzzed_data_provider.ConsumeIntegralInRange<int>(0, N))` idiom it is worth noting that every time `N` is increased by one to accommodate a new `case`, all existing seeds are invalidated.

Example (all using the same fixed seed input in the form of a finite scream, `AAAAAAAAAAAAAAAAAAAAAAAAA`):

One trick that can be used to tackle this is to choose from a larger range such as `[0, 32]` even if we only have, say, 12 `case` labels (`[0, 11]`). The non-matching numbers `[12, 32]` will simply be “ignored” by the coverage-guided fuzzer.

That way we’ll only have to invalidate existing seeds when we’ve exhausted the entire range `[0, 32]` with matching `case` labels. Then we can bump to `[0, 64]`, and so on.

In other words: instead of invalidating every time we add a new `case`, we’ll only be invalidating every some-power-of-2:th time we add a new `case`. :)
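A minimal sketch of this over-provisioned-range idiom, assuming LLVM’s `FuzzedDataProvider` and a libFuzzer-style entry point (the three operations are hypothetical placeholders, not Bitcoin Core code):

```cpp
#include <fuzzer/FuzzedDataProvider.h>

#include <cstddef>
#include <cstdint>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    FuzzedDataProvider fuzzed_data_provider{data, size};
    // Draw from [0, 32] even though only cases 0..2 exist today. Adding
    // case 3 later does not change how the input is consumed, so existing
    // seeds keep their meaning.
    switch (fuzzed_data_provider.ConsumeIntegralInRange<int>(0, 32)) {
    case 0:
        // ... exercise operation A ...
        break;
    case 1:
        // ... exercise operation B ...
        break;
    case 2:
        // ... exercise operation C ...
        break;
    default:
        // Values in [3, 32] are "ignored"; the coverage-guided fuzzer
        // learns that they reach no new code.
        break;
    }
    return 0;
}
```

As noted earlier in the thread, this only helps for additions: removing a `case` still renumbers the remaining labels and breaks existing inputs.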
Fair enough. Though, those trivial cases should be rare enough not to yield any benefit, given the overhead of having to remember when and how to invalidate them at a later point in time. Except for `script_flags` and `process_messages`, all remaining 150 fuzz targets can reach the current coverage with just minutes of CPU time. (Citation/benchmark needed.)

Sounds like a good idea. I was thinking about passing `-max_total_time` to the fuzz engine even when run in CI mode, but all fuzz targets take a different time, so it would be nicer if there were a `-additional_fuzz_time_after_init` option. The generated corpus would be lost as soon as the CI box cycles, but that doesn’t hurt, because the coverage will quickly be re-generated by our fuzz farms once the pull is merged.
Maybe it is good enough to just set `-max_total_time=30` and thus skip generation if the existing seeds take longer than that to read.
and thus skip generation if the existing seeds take longer than that to read.Running the fuzzers should be straightforward: https://github.com/bitcoin/bitcoin/blob/master/doc/fuzzing.md . Also, the concept of fuzzing isn’t really that hard. However, the Bitcoin Core code base isn’t “optimized” for fuzzing (e.g. globals, background threads, non-mockable disk access, net processing dealing also with deserialization …), so working on fuzzers for Bitcoin Core sometimes becomes tricky. For general guidelines on writing fuzzers, you can take a look at https://github.com/google/fuzzing/blob/master/docs/intro-to-fuzzing.md
Correct. I think we should aim to write fuzzers that can explore the search space from scratch in a reasonable amount of time. As mentioned before, this isn’t the case for all Bitcoin Core fuzz targets: there are plenty of hashes, checksums, signatures, PoW checks, serialization specifics, etc. that make it harder to find good fuzz inputs. Code coverage reports help in finding weak spots, and specifically crafted seeds (either written manually or extracted from real-world data such as the block files or socket buffers) can then help to increase coverage.
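As an illustration of the checksum problem, here is a hedged sketch (both `ToyChecksum` and `ProcessMessage` are invented for this example and are not Bitcoin Core APIs): rather than making the fuzzer guess a valid checksum, the harness computes it over the mutated payload so the parsing code behind the check stays reachable.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy checksum, for illustration only.
static uint32_t ToyChecksum(const std::vector<uint8_t>& payload)
{
    uint32_t sum = 0;
    for (const uint8_t b : payload) sum = sum * 31 + b;
    return sum;
}

// Hypothetical checksum-guarded handler standing in for real message
// processing; a fuzzer that has to guess `checksum` byte-for-byte rarely
// gets past the early return.
static bool ProcessMessage(const std::vector<uint8_t>& payload, uint32_t checksum)
{
    if (checksum != ToyChecksum(payload)) return false;
    // ... parsing logic we actually want to exercise ...
    return !payload.empty() && payload[0] == 0x01;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size)
{
    const std::vector<uint8_t> payload(data, data + size);
    // Fix up the checksum in the harness instead of consuming it from the
    // fuzz input, so every mutation reaches the code behind the check.
    ProcessMessage(payload, ToyChecksum(payload));
    return 0;
}
```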
It is a pity if those seeds get lost due to invalidation, and I think our best bet is to rely on the fuzz engine to translate them for us where possible. For example, the breakage caused by commit fa0f415 should be fixable with some trivial bit flips or similar.