bitcoin: memcmp with constants that contain zero bytes are broken in GCC

It appears that there is a bug in certain GCC releases (in the version 9 and 10 series) where an optimization step breaks correctness of memcmp when at least one of the arguments is a compile-time constant array that contains at least one zero byte.

GCC bug is here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95189. It was stumbled upon by @roconnor-blockstream in https://github.com/bitcoin-core/secp256k1/pull/822. It is being tracked for libsecp256k1 in https://github.com/bitcoin-core/secp256k1/issues/823.

I have verified that in some instances it also affects C++, and may even affect std::lexicographical_compare.

This may be relevant in some of our code (in particular, the CNetAddr IP range checking does comparisons with constants that contain zeroes, but perhaps more).

Solutions:

Build with -fno-builtin-memcmp, but we should measure performance impact.
Very carefully inspect the codebase for potential cases, and use a custom memcmp for those.

TODO:

Verify if compiler-generated memcmp calls may be affected as well

About this issue

Original URL
State: closed
Created 4 years ago
Reactions: 5
Comments: 19 (19 by maintainers)

Commits related to this issue

Disable GCC builtin memcmp for C contracts https://github.com/bitcoin/bitcoin/issues/20005 — committed to nervosnetwork/capsule by jjyr 4 years ago
Merge #21036: gitian: Bump descriptors to Focal for 22.0 2ecaf214331b506ebfac4f4922241744357d652b gitian: remove execstack workaround for ricv64 & powerpc64le (fanquake) 5baff2b31840bdbc465f55b875aa6... — committed to bitcoin/bitcoin by deleted user 3 years ago
Merge #21036: gitian: Bump descriptors to Focal for 22.0 2ecaf214331b506ebfac4f4922241744357d652b gitian: remove execstack workaround for ricv64 & powerpc64le (fanquake) 5baff2b31840bdbc465f55b875aa6... — committed to syscoin/syscoin by deleted user 3 years ago
Partial merge #21036: gitian: Bump descriptors to Focal for 22.0 2ecaf214331b506ebfac4f4922241744357d652b gitian: remove execstack workaround for ricv64 & powerpc64le (fanquake) 5baff2b31840bdbc465f5... — committed to UdjinM6/dash by deleted user 3 years ago
partial merge #21036: gitian: Bump descriptors to Focal for 22.0 2ecaf214331b506ebfac4f4922241744357d652b gitian: remove execstack workaround for ricv64 & powerpc64le (fanquake) 5baff2b31840bdbc465f5... — committed to UdjinM6/dash by deleted user 3 years ago
partial merge #21036: gitian: Bump descriptors to Focal for 22.0 2ecaf214331b506ebfac4f4922241744357d652b gitian: remove execstack workaround for ricv64 & powerpc64le (fanquake) 5baff2b31840bdbc465f5... — committed to UdjinM6/dash by deleted user 3 years ago
partial merge #21036: gitian: Bump descriptors to Focal for 22.0 2ecaf214331b506ebfac4f4922241744357d652b gitian: remove execstack workaround for ricv64 & powerpc64le (fanquake) 5baff2b31840bdbc465f5... — committed to UdjinM6/dash by deleted user 3 years ago
Merge bitcoin/bitcoin#23778: release: Guix 1.4.0 & GCC 10.3 84f9931cb44932751415f2ca48501ba01eed39a6 guix: use upstream python-requests (2.26.0) (fanquake) 187dc1ec0c867ffcf44f607bbb928909d86a81ca bu... — committed to bitcoin/bitcoin by fanquake 2 years ago
Merge bitcoin/bitcoin#23778: release: Guix 1.4.0 & GCC 10.3 84f9931cb44932751415f2ca48501ba01eed39a6 guix: use upstream python-requests (2.26.0) (fanquake) 187dc1ec0c867ffcf44f607bbb928909d86a81ca bu... — committed to syscoin/syscoin by fanquake 2 years ago
test: add test case for GCC bug 95189 This has been fixed in GCC 10.3.0 (which we use for releases) and later, however according to dependencies.md we still support building with GCC 8.1+. Add a tes... — committed to fanquake/bitcoin by fanquake 2 years ago
test: add test case for GCC bug 95189 This has been fixed in GCC 10.3.0 (which we use for releases) and later, however according to dependencies.md we still support building with GCC 8.1+. Add a tes... — committed to fanquake/bitcoin by fanquake 2 years ago
test: add test case for GCC bug 95189 This has been fixed in GCC 10.3.0 (which we use for releases) and later, however according to dependencies.md we still support building with GCC 8.1+. Add a tes... — committed to fanquake/bitcoin by fanquake 2 years ago
Merge bitcoin/bitcoin#23778: release: Guix 1.4.0 & GCC 10.3 84f9931cb44932751415f2ca48501ba01eed39a6 guix: use upstream python-requests (2.26.0) (fanquake) 187dc1ec0c867ffcf44f607bbb928909d86a81ca bu... — committed to PastaPastaPasta/dash by fanquake 2 years ago

Most upvoted comments

According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95189#c33, this has been fixed in GCC 10.3.0, which we currently use to build releases. The relevant backported commit, on the releases/gcc-10 branch, is https://github.com/gcc-mirror/gcc/commit/e4c9aac98611f63847ef6c57916808d9a2d7abcb.

So the fix is in GCC master, 12, 11 and 10 branches. Currently we support GCC 8.1+.

fanquake on Aug 4, 2022

Here is the example of the bug appearing when using the > operator on 2 arrays: https://godbolt.org/z/j5en9d Here it appears with std::lexicographical_compare: https://godbolt.org/z/a6scYK

-fno-builtin-memcmp Doesn’t do anything for either of these (as they didn’t call memcmp)

elichai on Sep 24, 2020

Technically my claim was, from my very limited understanding, the bug can only change non-zero results from memcpy to zero results. Still, even in this case if the result of memcmp(x,y) is changed from a 1 to a 0, the value of memcmp(...) <= 0 will change from false to true.

roconnor-blockstream on Sep 23, 2020

I worked up a clang AST query that can be used to detect the test-case. I’m not sure if it’s perfect though as I don’t know exactly what gcc will screw up (copies of pointers, de-references, etc.). If this is useful, it’d be trivial to hook it up to a clang-tidy check, or even a compiler plugin to make it easier to check depends as well.

memcmp.cpp

(exact test-case taken from gcc bug report):

using size_t = decltype(sizeof(0));
int memcmp(const void *s1, const void *s2, size_t n);

static const float z[1] = {0};
int f(float x)
{
    return memcmp(&x, z, sizeof(x));
}

In action:

$ clang-query ../memcmp.cpp -- -std=c++17
clang-query> m callExpr(hasAnyArgument(ignoringImpCasts(declRefExpr(to(varDecl(hasType(qualType(constantArrayType(),isConstQualified())),hasDescendant(integerLiteral(equals(0)))))))),callee(functionDecl(hasName("memcmp"))))

Match #1:

/home/cory/dev/bitcoin-tidy-experiments/build/../memcmp.cpp:7:12: note: "root" binds here
    return memcmp(&x, z, sizeof(x));
           ^~~~~~~~~~~~~~~~~~~~~~~~
1 match.

The query:

callExpr(
    hasAnyArgument(ignoringImpCasts(
        declRefExpr(to(varDecl(
            hasType(qualType(
                constantArrayType(),
                isConstQualified()
            )),
            hasDescendant(integerLiteral(equals(0)))
        ))))
    ),
    callee(functionDecl(hasName("memcmp")))
)

also indentation

theuni on Aug 31, 2022

I was curious to see how much work it would be to work around this, so I hacked up (and vendored) my libstdc++ headers in order to add detection for the guilty functions, and started a branch of workarounds.

The guilty v9 headers (which call __builtin_memcmp themselves) are:

bits/char_traits.h
bits/stl_algobase.h

v10 adds:

array (I haven’t looked into this one yet)

The workarounds mostly involve creating comparators for containers to avoid the default lexicographical_compare and getting rid of some std::equal usage. Those are pretty straightforward. But then I discovered that many of the boost string parsing functions end up being guilty.

For example: boost::split(vDeploymentParams, strDeployment, boost::is_any_of(":")); ends up internally calling std::__equal<true>::equal().

And while this might be a good excuse to get rid of some of boost usage, I’m afraid that the boost libs themselves are going to be a problem. And I don’t think we want to be patching boost to fix this.

Even worse than boost, glibc and libstdc++ themselves are likely affected as well (for different reasons).

I can push up my partial work if anyone is interested, but it’s not looking like this is a good way forward.

theuni on Sep 30, 2020

This is a patch against the broken GCC versions that emits a warning when the bug could be triggered: https://gist.github.com/real-or-random/1548239059158370416dfdc6a866329a

When I compile the project with this patch, I don’t get any warning. That’s good news. But unfortunately, this is not perfeclty sound. It catches all known examples but it fails to catch the std::lexicographical_compare example in this thread… Maybe the reason is buried somewhere in this function but I don’t know. https://gcc.gnu.org/onlinedocs/gcc-10.2.0/libstdc++/api/a00596_source.html#l01592

real-or-random on Sep 25, 2020