openssl: atexit-registered OPENSSL_cleanup races with cleanup in other threads
Using OpenSSL 1.1.0h, the following program will segfault after running in a loop for 5-10 seconds:
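#include <openssl/ssl.h>
#include <pthread.h>

void *bg(void *ptr) {
    SSL_CTX_free(ptr);
    return NULL;
}

int main() {
    pthread_t thread;
    SSL_CTX *ctx = SSL_CTX_new(TLS_method());
    SSL_CTX_set_default_verify_paths(ctx);
    pthread_create(&thread, NULL, bg, ctx);
    /* main returns without joining the thread, so exit() can run the
       atexit-registered OPENSSL_cleanup while bg is still in SSL_CTX_free */
}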
Backtrace:
* thread #1, stop reason = signal SIGSTOP
* frame #0: 0x00007fff7363464a libsystem_malloc.dylib`tiny_free_no_lock + 298
frame #1: 0x00007fff73635256 libsystem_malloc.dylib`free_tiny + 628
frame #2: 0x0000000106e9e71a libcrypto.1.1.dylib`OPENSSL_LH_delete + 67
frame #3: 0x0000000106ea949d libcrypto.1.1.dylib`OBJ_NAME_remove + 104
frame #4: 0x0000000106e9e8d0 libcrypto.1.1.dylib`doall_util_fn + 87
frame #5: 0x0000000106ea96b5 libcrypto.1.1.dylib`OBJ_NAME_cleanup + 72
frame #6: 0x0000000106e95ed9 libcrypto.1.1.dylib`evp_cleanup_int + 14
frame #7: 0x0000000106e9cec5 libcrypto.1.1.dylib`OPENSSL_cleanup + 236
frame #8: 0x00007fff73520ef4 libsystem_c.dylib`__cxa_finalize_ranges + 358
frame #9: 0x00007fff735211fe libsystem_c.dylib`exit + 55
frame #10: 0x00007fff7347401c libdyld.dylib`start + 8
frame #11: 0x00007fff73474015 libdyld.dylib`start + 1
thread #2, stop reason = signal SIGSTOP
frame #0: 0x00007fff7378ace3 libsystem_pthread.dylib`pthread_rwlock_unlock + 7
frame #1: 0x0000000106eee9ec libcrypto.1.1.dylib`CRYPTO_THREAD_unlock + 9
frame #2: 0x0000000106e9a67a libcrypto.1.1.dylib`CRYPTO_free_ex_data + 103
frame #3: 0x0000000106f065fd libcrypto.1.1.dylib`x509_cb + 153
frame #4: 0x0000000106de34ad libcrypto.1.1.dylib`asn1_item_embed_free + 278
frame #5: 0x0000000106de3391 libcrypto.1.1.dylib`ASN1_item_free + 25
frame #6: 0x0000000106efb76b libcrypto.1.1.dylib`X509_OBJECT_free + 35
frame #7: 0x0000000106eee869 libcrypto.1.1.dylib`OPENSSL_sk_pop_free + 46
frame #8: 0x0000000106efb700 libcrypto.1.1.dylib`X509_STORE_free + 174
frame #9: 0x0000000106d72d54 libssl.1.1.dylib`SSL_CTX_free + 200
frame #10: 0x0000000106d57ead a.out`bg + 29
frame #11: 0x00007fff7378c661 libsystem_pthread.dylib`_pthread_body + 340
frame #12: 0x00007fff7378c50d libsystem_pthread.dylib`_pthread_start + 377
frame #13: 0x00007fff7378bbf9 libsystem_pthread.dylib`thread_start + 13
Second backtrace:
* thread #1, stop reason = signal SIGSTOP
* frame #0: 0x000000010b10336a libcrypto.1.1.dylib`OPENSSL_LH_strhash + 62
frame #1: 0x000000010b10e786 libcrypto.1.1.dylib`obj_name_hash + 61
frame #2: 0x000000010b1035fa libcrypto.1.1.dylib`getrn + 35
frame #3: 0x000000010b1036f7 libcrypto.1.1.dylib`OPENSSL_LH_delete + 32
frame #4: 0x000000010b10e49d libcrypto.1.1.dylib`OBJ_NAME_remove + 104
frame #5: 0x000000010b1038d0 libcrypto.1.1.dylib`doall_util_fn + 87
frame #6: 0x000000010b10e6b5 libcrypto.1.1.dylib`OBJ_NAME_cleanup + 72
frame #7: 0x000000010b0faed9 libcrypto.1.1.dylib`evp_cleanup_int + 14
frame #8: 0x000000010b101ec5 libcrypto.1.1.dylib`OPENSSL_cleanup + 236
frame #9: 0x00007fff73520ef4 libsystem_c.dylib`__cxa_finalize_ranges + 358
frame #10: 0x00007fff735211fe libsystem_c.dylib`exit + 55
frame #11: 0x00007fff7347401c libdyld.dylib`start + 8
thread #2, stop reason = signal SIGSTOP
frame #0: 0x00007fff7378ace3 libsystem_pthread.dylib`pthread_rwlock_unlock + 7
frame #1: 0x000000010b1539ec libcrypto.1.1.dylib`CRYPTO_THREAD_unlock + 9
frame #2: 0x000000010b0ff67a libcrypto.1.1.dylib`CRYPTO_free_ex_data + 103
frame #3: 0x000000010b12b794 libcrypto.1.1.dylib`RSA_free + 105
frame #4: 0x000000010b0fc5a5 libcrypto.1.1.dylib`EVP_PKEY_free_it + 36
frame #5: 0x000000010b0fc54b libcrypto.1.1.dylib`EVP_PKEY_free + 58
frame #6: 0x000000010b16b175 libcrypto.1.1.dylib`pubkey_cb + 38
frame #7: 0x000000010b0484ad libcrypto.1.1.dylib`asn1_item_embed_free + 278
frame #8: 0x000000010b048686 libcrypto.1.1.dylib`asn1_template_free + 150
frame #9: 0x000000010b048488 libcrypto.1.1.dylib`asn1_item_embed_free + 241
frame #10: 0x000000010b048686 libcrypto.1.1.dylib`asn1_template_free + 150
frame #11: 0x000000010b048488 libcrypto.1.1.dylib`asn1_item_embed_free + 241
frame #12: 0x000000010b048391 libcrypto.1.1.dylib`ASN1_item_free + 25
frame #13: 0x000000010b16076b libcrypto.1.1.dylib`X509_OBJECT_free + 35
frame #14: 0x000000010b153869 libcrypto.1.1.dylib`OPENSSL_sk_pop_free + 46
frame #15: 0x000000010b160700 libcrypto.1.1.dylib`X509_STORE_free + 174
frame #16: 0x000000010afd2d54 libssl.1.1.dylib`SSL_CTX_free + 200
frame #17: 0x000000010afb7eed a.out`bg + 29
frame #18: 0x00007fff7378c661 libsystem_pthread.dylib`_pthread_body + 340
frame #19: 0x00007fff7378c50d libsystem_pthread.dylib`_pthread_start + 377
frame #20: 0x00007fff7378bbf9 libsystem_pthread.dylib`thread_start + 13
From what I’ve seen, the background thread always seems stopped inside of pthread_rwlock_unlock, but that may just be by chance.
These traces are from macOS with a Homebrew-installed OpenSSL. I ended up there while trying to reproduce what I think is the same crash on Linux with a manually built OpenSSL.
About this issue
- State: closed
- Created 6 years ago
- Comments: 17 (15 by maintainers)
Commits related to this issue
- feat: re-enable offset optimization NOTE: During work on this patch, the application faulted very early in processing. This may have been due to a library issue around libssl setting an `_atexit()` h... — committed to mozilla-services/syncstorage-rs by jrconlin 4 years ago
- workaround openssl shutdown race condition refs: https://github.com/openssl/openssl/issues/6214 refs: https://github.com/sfackler/rust-openssl/pull/1324 — committed to wez/libssh-rs by wez a year ago
- wezterm-ssh: fix occasional segv on libssh shutdown The shutdown is due to an openssl race condition: refs: https://github.com/openssl/openssl/issues/6214 which is worked around by explicitly tellin... — committed to wez/wezterm by wez a year ago
It seems kind of unreasonable to state that all programs linking to OpenSSL have to join all threads before returning from main or calling exit(3).
It’s not all that uncommon for threads to be left dangling. POSIX even allows creating unjoinable threads via pthread_attr_setdetachstate. Having OpenSSL be incompatible with that seems not ideal. This may also happen if the process happens to call exit while other threads are running.
At the very least, this is surprising behavior and ought to be documented as a usage requirement. Even then, shutdown behavior may be a few layers above OpenSSL usage in an application (some library uses another library which uses OpenSSL), so it’s not ideal.
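To make the detached-thread case concrete: a detached thread cannot be joined, so a program that wants its workers finished before exit runs the atexit handlers has to track them itself. The following is only a sketch of one way that could look, using a counter guarded by a mutex and condition variable. The names start_detached_worker, wait_for_detached_workers and active_workers are hypothetical and not part of OpenSSL or POSIX, and the worker body is a placeholder for the real library calls.

```c
#include <pthread.h>

/* Hypothetical bookkeeping for detached workers (illustration only). */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_done = PTHREAD_COND_INITIALIZER;
static int active_workers = 0;

static void *detached_worker(void *arg) {
    (void)arg;
    /* Placeholder: work that uses a library with atexit-time cleanup,
       for example the SSL_CTX_free call from the report, would go here. */

    /* Announce completion only after the last library call has returned. */
    pthread_mutex_lock(&lock);
    active_workers--;
    pthread_cond_signal(&all_done);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts one detached worker. Returns 0 on success, -1 on failure. */
int start_detached_worker(void) {
    pthread_attr_t attr;
    pthread_t thread;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    /* Count the worker before it starts so the waiter cannot miss it. */
    pthread_mutex_lock(&lock);
    active_workers++;
    pthread_mutex_unlock(&lock);

    rc = pthread_create(&thread, &attr, detached_worker, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        pthread_mutex_lock(&lock);
        active_workers--;
        pthread_mutex_unlock(&lock);
        return -1;
    }
    return 0;
}

/* Blocks until every counted worker has announced completion.
   Returns the number of workers still active, which is 0 here. */
int wait_for_detached_workers(void) {
    int remaining;

    pthread_mutex_lock(&lock);
    while (active_workers > 0)
        pthread_cond_wait(&all_done, &lock);
    remaining = active_workers;
    pthread_mutex_unlock(&lock);
    return remaining;
}
```

Calling wait_for_detached_workers() before returning from main or calling exit would keep the workers' library calls ordered before the atexit handlers. The worker may still be executing its final return when the waiter wakes up, but by then it no longer touches the library. This does not help in the layered case described above, where the code that calls exit has no knowledge of the threads at all.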
There should probably be a pthread_join call in your “main” function before it exits. The atexit handler assumes that all threads have stopped. There might be a documentation issue here.