openssl: Bug (crash) when stopping provider

The detailed story is in this issue, though I’d recommend starting here.

TL;DR: OpenSSL crashes after running the liboqs tests, hitting a NULL pointer while cleaning up and deallocating things at exit. Great analysis is here. The crash occurs when several “additional” providers are defined in openssl.cnf. By “additional” I mean pkcs11-provider and oqs-provider, though the legacy provider also seems to contribute to the problem when it is defined (I stopped enabling it precisely because of that).

Quoting from the referring issue:

With both pkcs11-provider and oqs-provider enabled, the liboqs tests pass (reporting == 464 passed, 220 skipped in 40.80s ==) but crash at the end:

Segmentation fault
FAILED: tests/CMakeFiles/run_tests /Users/ur20980/src/liboqs/build/tests/CMakeFiles/run_tests 
cd /Users/ur20980/src/liboqs && /opt/local/bin/cmake -E env OQS_BUILD_DIR=/Users/ur20980/src/liboqs/build python3 -m pytest --verbose --numprocesses=auto --ignore=scripts/copy_from_upstream/repos
ninja: build stopped: subcommand failed.

and the crash report:

Termination Reason:    Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process:   exc handler [18302]

VM Region Info: 0 is not in any region.  Bytes before following region: 4310319104
      REGION TYPE                    START - END         [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      UNUSED SPACE AT START
--->  
      __TEXT                      100ea4000-100ea8000    [   16K] r-x/r-x SM=COW  .../MacOS/Python

Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_pthread.dylib       	       0x1a1c948c4 pthread_rwlock_wrlock + 0
1   libcrypto.3.dylib             	       0x10564add4 CRYPTO_THREAD_write_lock + 12 (threads_pthread.c:110)
2   libcrypto.3.dylib             	       0x1055f2c14 ERR_unload_strings + 92 (err.c:314)
3   libcrypto.3.dylib             	       0x105647720 ossl_provider_free + 92 (provider_core.c:688)
4   libcrypto.3.dylib             	       0x1056ad138 OPENSSL_sk_pop_free + 76 (stack.c:439)
5   libcrypto.3.dylib             	       0x105647104 sk_OSSL_PROVIDER_pop_free + 12 (provider_core.c:199) [inlined]
6   libcrypto.3.dylib             	       0x105647104 ossl_provider_store_free + 76 (provider_core.c:295)
7   libcrypto.3.dylib             	       0x105637d6c context_deinit_objs + 124 (context.c:250)
8   libcrypto.3.dylib             	       0x105637564 context_deinit + 16 (context.c:334) [inlined]
9   libcrypto.3.dylib             	       0x105637564 OSSL_LIB_CTX_free + 132 (context.c:465)
10  oqsprovider.0.5.0-dev.dylib   	       0x10436f560 oqsx_freeprovctx + 24 (oqsprov_keys.c:178)
11  oqsprovider.0.5.0-dev.dylib   	       0x10436e9a0 oqsprovider_teardown + 12 (oqsprov.c:553)
12  libcrypto.3.dylib             	       0x104547400 ossl_provider_free + 76
13  libcrypto.3.dylib             	       0x10459b5a0 OPENSSL_sk_pop_free + 60
14  libcrypto.3.dylib             	       0x104546ec0 ossl_provider_store_free + 72
15  libcrypto.3.dylib             	       0x10453b5c4 context_deinit_objs + 124
16  libcrypto.3.dylib             	       0x10453ae5c context_deinit + 32
17  libcrypto.3.dylib             	       0x10453ae2c ossl_lib_ctx_default_deinit + 20
18  libcrypto.3.dylib             	       0x10453dd80 OPENSSL_cleanup + 204
19  libsystem_c.dylib             	       0x1a1b55ed4 __cxa_finalize_ranges + 492
20  libsystem_c.dylib             	       0x1a1b55c4c exit + 44
21  libdyld.dylib                 	       0x1a1cb0554 dyld4::LibSystemHelpers::exit(int) const + 20
22  dyld                          	       0x1a193ff7c start + 2320
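
One detail of the backtrace above is easy to miss by eye: the libcrypto.3.dylib frames 1-9 carry source locations and sit around 0x1056..., while frames 12-18 have no source locations and sit around 0x1045.... The following is a minimal sketch, not part of the original report, that parses frames in this crash-report layout and flags any image name that appears both with and without source info. The function names and the regular expression are my own, and the heuristic is only a hint that two different copies of a library may be mapped into the process, not proof.

```python
import re
from collections import defaultdict

# One frame of a macOS crash report: index, image name, address, "symbol + offset",
# then an optional "(file.c:123)" source location and an optional "[inlined]" tag.
FRAME_RE = re.compile(
    r"^\s*\d+\s+(?P<image>\S+)\s+(?P<addr>0x[0-9a-fA-F]+)\s+"
    r".+? \+ \d+(?: \((?P<src>[^()]+:\d+)\))?(?: \[inlined\])?\s*$"
)

# A few frames copied from the backtrace above (whitespace normalised).
SAMPLE = """\
0   libsystem_pthread.dylib       0x1a1c948c4 pthread_rwlock_wrlock + 0
1   libcrypto.3.dylib             0x10564add4 CRYPTO_THREAD_write_lock + 12 (threads_pthread.c:110)
3   libcrypto.3.dylib             0x105647720 ossl_provider_free + 92 (provider_core.c:688)
10  oqsprovider.0.5.0-dev.dylib   0x10436f560 oqsx_freeprovctx + 24 (oqsprov_keys.c:178)
12  libcrypto.3.dylib             0x104547400 ossl_provider_free + 76
18  libcrypto.3.dylib             0x10453dd80 OPENSSL_cleanup + 204
"""


def frames_by_image(report):
    """Group frame addresses per image, split by presence of source info."""
    groups = defaultdict(lambda: {"with_src": [], "without_src": []})
    for line in report.splitlines():
        m = FRAME_RE.match(line)
        if m is None:
            continue  # not a stack frame line
        key = "with_src" if m.group("src") else "without_src"
        groups[m.group("image")][key].append(int(m.group("addr"), 16))
    return groups


def mixed_images(report):
    """Image names that have frames both with and without source info."""
    return sorted(
        image
        for image, g in frames_by_image(report).items()
        if g["with_src"] and g["without_src"]
    )


if __name__ == "__main__":
    groups = frames_by_image(SAMPLE)
    for image in mixed_images(SAMPLE):
        print(image)
        print("  with source info:   ", [hex(a) for a in groups[image]["with_src"]])
        print("  without source info:", [hex(a) for a in groups[image]["without_src"]])
```

Run against the sample frames, the only image flagged is libcrypto.3.dylib, and its two groups of addresses do not overlap.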

About this issue

  • State: open
  • Created a year ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

I find this improbable: How should oqsprovider “pick up” something that it does not load? (At least on Unix)

@baentsch - isn’t oqsprovider linked against libcrypto? The output shown in this comment suggests that it is:

https://github.com/openssl/openssl/issues/21023#issuecomment-1559479354

And it is making calls to libcrypto API functions. So the dynamic linker will “pick up” libcrypto when it loads the oqsprovider.

So my assumption is that Python is picking up the 3.1.0 libcrypto (no debug symbols) and then loading oqsprovider, which in turn picks up the 3.2.0-dev libcrypto. Chaos ensues.
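
One way to test the first half of that assumption is to ask the Python interpreter which OpenSSL its own `ssl` module is bound to. The sketch below uses only the standard library (`ssl.OPENSSL_VERSION` and `ssl.OPENSSL_VERSION_INFO`); the helper name `describe_python_openssl` is hypothetical. It only describes the OpenSSL behind Python's `ssl` module, so if the test process loads libcrypto through some other path, the answer may differ.

```python
import ssl
import sys


def describe_python_openssl():
    """Report which OpenSSL build this interpreter's ssl module is using."""
    info = {
        "python": sys.executable,
        "openssl_version": ssl.OPENSSL_VERSION,            # human-readable string
        "openssl_version_info": ssl.OPENSSL_VERSION_INFO,  # tuple of integers
    }
    try:
        import _ssl  # C extension behind the ssl module

        # __file__ is absent when the extension is compiled into the interpreter.
        info["ssl_extension"] = getattr(_ssl, "__file__", "(built in)")
    except ImportError:
        info["ssl_extension"] = "(unavailable)"
    return info


if __name__ == "__main__":
    for key, value in describe_python_openssl().items():
        print(f"{key}: {value}")
```

If the version printed here differs from the libcrypto that oqsprovider is linked against (on macOS, `otool -L` on the provider dylib lists its linked libraries), that would be consistent with two libcrypto copies ending up in one process.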