openssl: atexit-registered OPENSSL_cleanup races with cleanup in other threads

Using OpenSSL 1.1.0h, the following program segfaults when run repeatedly in a loop, usually within 5-10 seconds:

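#include <openssl/ssl.h>
#include <pthread.h>

/* Background thread: frees the SSL_CTX created by main. */
void *bg(void *ptr) {
	SSL_CTX_free(ptr);
	return NULL;
}

int main(void) {
	pthread_t thread;
	SSL_CTX *ctx = SSL_CTX_new(TLS_method());
	/* Load the default CA store so SSL_CTX_free has real work to do. */
	SSL_CTX_set_default_verify_paths(ctx);

	pthread_create(&thread, NULL, bg, ctx);
	/* main returns without joining, so exit() runs the atexit handlers
	 * (including OPENSSL_cleanup) while bg may still be running. */
}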

Backtrace:

* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7363464a libsystem_malloc.dylib`tiny_free_no_lock + 298
    frame #1: 0x00007fff73635256 libsystem_malloc.dylib`free_tiny + 628
    frame #2: 0x0000000106e9e71a libcrypto.1.1.dylib`OPENSSL_LH_delete + 67
    frame #3: 0x0000000106ea949d libcrypto.1.1.dylib`OBJ_NAME_remove + 104
    frame #4: 0x0000000106e9e8d0 libcrypto.1.1.dylib`doall_util_fn + 87
    frame #5: 0x0000000106ea96b5 libcrypto.1.1.dylib`OBJ_NAME_cleanup + 72
    frame #6: 0x0000000106e95ed9 libcrypto.1.1.dylib`evp_cleanup_int + 14
    frame #7: 0x0000000106e9cec5 libcrypto.1.1.dylib`OPENSSL_cleanup + 236
    frame #8: 0x00007fff73520ef4 libsystem_c.dylib`__cxa_finalize_ranges + 358
    frame #9: 0x00007fff735211fe libsystem_c.dylib`exit + 55
    frame #10: 0x00007fff7347401c libdyld.dylib`start + 8
    frame #11: 0x00007fff73474015 libdyld.dylib`start + 1
  thread #2, stop reason = signal SIGSTOP
    frame #0: 0x00007fff7378ace3 libsystem_pthread.dylib`pthread_rwlock_unlock + 7
    frame #1: 0x0000000106eee9ec libcrypto.1.1.dylib`CRYPTO_THREAD_unlock + 9
    frame #2: 0x0000000106e9a67a libcrypto.1.1.dylib`CRYPTO_free_ex_data + 103
    frame #3: 0x0000000106f065fd libcrypto.1.1.dylib`x509_cb + 153
    frame #4: 0x0000000106de34ad libcrypto.1.1.dylib`asn1_item_embed_free + 278
    frame #5: 0x0000000106de3391 libcrypto.1.1.dylib`ASN1_item_free + 25
    frame #6: 0x0000000106efb76b libcrypto.1.1.dylib`X509_OBJECT_free + 35
    frame #7: 0x0000000106eee869 libcrypto.1.1.dylib`OPENSSL_sk_pop_free + 46
    frame #8: 0x0000000106efb700 libcrypto.1.1.dylib`X509_STORE_free + 174
    frame #9: 0x0000000106d72d54 libssl.1.1.dylib`SSL_CTX_free + 200
    frame #10: 0x0000000106d57ead a.out`bg + 29
    frame #11: 0x00007fff7378c661 libsystem_pthread.dylib`_pthread_body + 340
    frame #12: 0x00007fff7378c50d libsystem_pthread.dylib`_pthread_start + 377
    frame #13: 0x00007fff7378bbf9 libsystem_pthread.dylib`thread_start + 13

Second backtrace:

* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x000000010b10336a libcrypto.1.1.dylib`OPENSSL_LH_strhash + 62
    frame #1: 0x000000010b10e786 libcrypto.1.1.dylib`obj_name_hash + 61
    frame #2: 0x000000010b1035fa libcrypto.1.1.dylib`getrn + 35
    frame #3: 0x000000010b1036f7 libcrypto.1.1.dylib`OPENSSL_LH_delete + 32
    frame #4: 0x000000010b10e49d libcrypto.1.1.dylib`OBJ_NAME_remove + 104
    frame #5: 0x000000010b1038d0 libcrypto.1.1.dylib`doall_util_fn + 87
    frame #6: 0x000000010b10e6b5 libcrypto.1.1.dylib`OBJ_NAME_cleanup + 72
    frame #7: 0x000000010b0faed9 libcrypto.1.1.dylib`evp_cleanup_int + 14
    frame #8: 0x000000010b101ec5 libcrypto.1.1.dylib`OPENSSL_cleanup + 236
    frame #9: 0x00007fff73520ef4 libsystem_c.dylib`__cxa_finalize_ranges + 358
    frame #10: 0x00007fff735211fe libsystem_c.dylib`exit + 55
    frame #11: 0x00007fff7347401c libdyld.dylib`start + 8
  thread #2, stop reason = signal SIGSTOP
    frame #0: 0x00007fff7378ace3 libsystem_pthread.dylib`pthread_rwlock_unlock + 7
    frame #1: 0x000000010b1539ec libcrypto.1.1.dylib`CRYPTO_THREAD_unlock + 9
    frame #2: 0x000000010b0ff67a libcrypto.1.1.dylib`CRYPTO_free_ex_data + 103
    frame #3: 0x000000010b12b794 libcrypto.1.1.dylib`RSA_free + 105
    frame #4: 0x000000010b0fc5a5 libcrypto.1.1.dylib`EVP_PKEY_free_it + 36
    frame #5: 0x000000010b0fc54b libcrypto.1.1.dylib`EVP_PKEY_free + 58
    frame #6: 0x000000010b16b175 libcrypto.1.1.dylib`pubkey_cb + 38
    frame #7: 0x000000010b0484ad libcrypto.1.1.dylib`asn1_item_embed_free + 278
    frame #8: 0x000000010b048686 libcrypto.1.1.dylib`asn1_template_free + 150
    frame #9: 0x000000010b048488 libcrypto.1.1.dylib`asn1_item_embed_free + 241
    frame #10: 0x000000010b048686 libcrypto.1.1.dylib`asn1_template_free + 150
    frame #11: 0x000000010b048488 libcrypto.1.1.dylib`asn1_item_embed_free + 241
    frame #12: 0x000000010b048391 libcrypto.1.1.dylib`ASN1_item_free + 25
    frame #13: 0x000000010b16076b libcrypto.1.1.dylib`X509_OBJECT_free + 35
    frame #14: 0x000000010b153869 libcrypto.1.1.dylib`OPENSSL_sk_pop_free + 46
    frame #15: 0x000000010b160700 libcrypto.1.1.dylib`X509_STORE_free + 174
    frame #16: 0x000000010afd2d54 libssl.1.1.dylib`SSL_CTX_free + 200
    frame #17: 0x000000010afb7eed a.out`bg + 29
    frame #18: 0x00007fff7378c661 libsystem_pthread.dylib`_pthread_body + 340
    frame #19: 0x00007fff7378c50d libsystem_pthread.dylib`_pthread_start + 377
    frame #20: 0x00007fff7378bbf9 libsystem_pthread.dylib`thread_start + 13

From what I’ve seen, the background thread always seems stopped inside of pthread_rwlock_unlock, but that may just be by chance.

These traces are from macOS with a Homebrew-installed OpenSSL. I captured them while trying to reproduce what I think is the same crash on Linux with a manually built OpenSSL.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (15 by maintainers)

Most upvoted comments

It seems kind of unreasonable to state that all programs linking to OpenSSL have to join all threads before returning from main or calling exit(3).

It’s not all that uncommon for threads to be left dangling. POSIX even allows creating unjoinable threads via pthread_attr_setdetachstate. Having OpenSSL be incompatible with that seems not ideal. This may also happen if the process happens to call exit while other threads are running.

At the very least, this is surprising behavior and ought to be documented as a usage requirement. Even then, shutdown behavior may be a few layers above OpenSSL usage in an application (some library uses another library which uses OpenSSL), so it’s not ideal.

There should probably be a pthread_join call in your “main” function before it exits. The atexit handler assumes that all threads have stopped. There might be a documentation issue here.