openssl: Bug (crash) when stopping provider
Detailed story is in this issue, though I’d recommend starting here.
TL;DR
OpenSSL crashes after running the liboqs tests, hitting a NULL pointer while cleaning up and deallocating resources. Great analysis is here. It occurs when several “additional” providers are defined in openssl.cnf. By “additional” I mean pkcs11-provider and oqs-provider, though the legacy provider also seems to contribute to the problem when it’s defined (I stopped enabling it precisely because of that).
Quoting from the referring issue:
With both pkcs11-provider and oqs-provider enabled, the liboqs tests pass (reporting == 464 passed, 220 skipped in 40.80s ==), but the process crashes at the end:
Segmentation fault
FAILED: tests/CMakeFiles/run_tests /Users/ur20980/src/liboqs/build/tests/CMakeFiles/run_tests
cd /Users/ur20980/src/liboqs && /opt/local/bin/cmake -E env OQS_BUILD_DIR=/Users/ur20980/src/liboqs/build python3 -m pytest --verbose --numprocesses=auto --ignore=scripts/copy_from_upstream/repos
ninja: build stopped: subcommand failed.
and the crash report:
Termination Reason: Namespace SIGNAL, Code 11 Segmentation fault: 11
Terminating Process: exc handler [18302]
VM Region Info: 0 is not in any region. Bytes before following region: 4310319104
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
UNUSED SPACE AT START
--->
__TEXT 100ea4000-100ea8000 [ 16K] r-x/r-x SM=COW .../MacOS/Python
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_pthread.dylib 0x1a1c948c4 pthread_rwlock_wrlock + 0
1 libcrypto.3.dylib 0x10564add4 CRYPTO_THREAD_write_lock + 12 (threads_pthread.c:110)
2 libcrypto.3.dylib 0x1055f2c14 ERR_unload_strings + 92 (err.c:314)
3 libcrypto.3.dylib 0x105647720 ossl_provider_free + 92 (provider_core.c:688)
4 libcrypto.3.dylib 0x1056ad138 OPENSSL_sk_pop_free + 76 (stack.c:439)
5 libcrypto.3.dylib 0x105647104 sk_OSSL_PROVIDER_pop_free + 12 (provider_core.c:199) [inlined]
6 libcrypto.3.dylib 0x105647104 ossl_provider_store_free + 76 (provider_core.c:295)
7 libcrypto.3.dylib 0x105637d6c context_deinit_objs + 124 (context.c:250)
8 libcrypto.3.dylib 0x105637564 context_deinit + 16 (context.c:334) [inlined]
9 libcrypto.3.dylib 0x105637564 OSSL_LIB_CTX_free + 132 (context.c:465)
10 oqsprovider.0.5.0-dev.dylib 0x10436f560 oqsx_freeprovctx + 24 (oqsprov_keys.c:178)
11 oqsprovider.0.5.0-dev.dylib 0x10436e9a0 oqsprovider_teardown + 12 (oqsprov.c:553)
12 libcrypto.3.dylib 0x104547400 ossl_provider_free + 76
13 libcrypto.3.dylib 0x10459b5a0 OPENSSL_sk_pop_free + 60
14 libcrypto.3.dylib 0x104546ec0 ossl_provider_store_free + 72
15 libcrypto.3.dylib 0x10453b5c4 context_deinit_objs + 124
16 libcrypto.3.dylib 0x10453ae5c context_deinit + 32
17 libcrypto.3.dylib 0x10453ae2c ossl_lib_ctx_default_deinit + 20
18 libcrypto.3.dylib 0x10453dd80 OPENSSL_cleanup + 204
19 libsystem_c.dylib 0x1a1b55ed4 __cxa_finalize_ranges + 492
20 libsystem_c.dylib 0x1a1b55c4c exit + 44
21 libdyld.dylib 0x1a1cb0554 dyld4::LibSystemHelpers::exit(int) const + 20
22 dyld 0x1a193ff7c start + 2320
About this issue
- State: open
- Created a year ago
- Comments: 20 (20 by maintainers)
@baentsch - isn’t oqsprovider linked against libcrypto? The output shown in this comment suggests that it is:
https://github.com/openssl/openssl/issues/21023#issuecomment-1559479354
And it is making calls to libcrypto API functions. So the dynamic linker will “pick up” libcrypto when it loads the oqsprovider.
So, my assumption is that Python is picking up the 3.1.0 libcrypto version (no debug symbols) and then loading the oqsprovider, which in turn picks up the 3.2.0-dev libcrypto version. Chaos ensues.
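For what it's worth, the backtrace in the crash report looks consistent with that assumption. The libcrypto frames called from oqsprovider carry source file and line info and sit around 0x1056..., while the libcrypto frames below `oqsprovider_teardown` have no source info and sit around 0x1045.... Here is a rough Python sketch that checks this mechanically. The regex and the helper names (`group_libcrypto_frames`, `address_ranges_overlap`) are mine and purely illustrative, and the frames are a subset copied from the report. Treat the result as a hint, not proof: missing line info and disjoint address ranges only suggest two separate libcrypto images.

```python
import re

# A subset of frames copied verbatim from the crash report (for brevity).
CRASH_FRAMES = """\
1 libcrypto.3.dylib 0x10564add4 CRYPTO_THREAD_write_lock + 12 (threads_pthread.c:110)
3 libcrypto.3.dylib 0x105647720 ossl_provider_free + 92 (provider_core.c:688)
9 libcrypto.3.dylib 0x105637564 OSSL_LIB_CTX_free + 132 (context.c:465)
10 oqsprovider.0.5.0-dev.dylib 0x10436f560 oqsx_freeprovctx + 24 (oqsprov_keys.c:178)
12 libcrypto.3.dylib 0x104547400 ossl_provider_free + 76
18 libcrypto.3.dylib 0x10453dd80 OPENSSL_cleanup + 204
"""

# Hypothetical pattern for one backtrace line:
#   <index> <image> <address> <symbol> + <offset> [(<file>:<line>)]
FRAME_RE = re.compile(
    r"^\s*(?P<idx>\d+)\s+(?P<image>\S+)\s+(?P<addr>0x[0-9a-fA-F]+)\s+"
    r"(?P<symbol>.+?)\s+\+\s+(?P<offset>\d+)(?:\s+\((?P<src>[^)]+)\))?"
)


def group_libcrypto_frames(report):
    """Split libcrypto frame addresses by whether source info is present."""
    groups = {"with_debug_info": [], "without_debug_info": []}
    for line in report.splitlines():
        m = FRAME_RE.match(line)
        if not m or not m.group("image").startswith("libcrypto"):
            continue
        key = "with_debug_info" if m.group("src") else "without_debug_info"
        groups[key].append(int(m.group("addr"), 16))
    return groups


def address_ranges_overlap(a, b):
    """True if the [min, max] spans of two address lists intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


if __name__ == "__main__":
    groups = group_libcrypto_frames(CRASH_FRAMES)
    for name, addrs in groups.items():
        print(name, [hex(x) for x in addrs])
    print(
        "ranges overlap:",
        address_ranges_overlap(
            groups["with_debug_info"], groups["without_debug_info"]
        ),
    )
```

If the two groups do not overlap, that points to two distinct libcrypto images mapped into the same process. Separately, `ssl.OPENSSL_VERSION` in the same Python interpreter reports which OpenSSL its own `ssl` module was linked against, which can be compared with the version oqsprovider was built for.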