go: Panic in crypto/tls/common.go

What version of Go are you using (go version)?

go version go1.9 linux/arm64

Does this issue reproduce with the latest release?

We believe so but cannot currently confirm as it is very difficult to reproduce under controlled conditions (see below).

What operating system and processor architecture are you using (go env)?

linux, arm64

What did you do?

We are running a proxy which performs MITM interception on behalf of the user and typically is servicing hundreds of requests in parallel. We manage the certificate store by inserting certificates in the NameToCertificate map. Maps are not concurrency safe, so if a TLS handshake is occurring while a certificate is being inserted into NameToCertificate, a panic results. We avoid this in our code by protecting NameToCertificate with a mutex, but this is not possible in the native Golang code.

This error occurs randomly and we see it about once a week but have not been able to reproduce yet.

What did you expect to see?

We would like for our program to continue running, even if the handshake crashes. This could potentially be done by adding a recover() to the code in crypto/common.go around line 737:

    if cert, ok := c.NameToCertificate[name]; ok {
            return cert, nil
    }

What did you see instead?

The program crashes. This occurs even with the initial TLS handshake() protected by a recover. Stack trace:

fatal error: concurrent map read and map write

goroutine 11597 [running]: runtime.throw(0xb64eb7, 0x21) /usr/local/go/src/runtime/panic.go:605 +0x70 fp=0x44204919c0 sp=0x44204919a0 pc=0x42a600 runtime.mapaccess2_faststr(0xa6a400, 0x44202c0c00, 0x4429418a80, 0x18, 0x0, 0x4) /usr/local/go/src/runtime/hashmap_fast.go:324 +0x488 fp=0x4420491a20 sp=0x44204919c0 pc=0x40d248 crypto/tls.(*Config).getCertificate(0x442023ec00, 0x4427736160, 0x2, 0x0, 0x0) /usr/local/go/src/crypto/tls/common.go:737 +0xc4 fp=0x4420491aa0 sp=0x4420491a20 pc=0x5ab614 crypto/tls.(*serverHandshakeState).readClientHello(0x4420491c38, 0x4420491c28, 0x1138740, 0x577a84) /usr/local/go/src/crypto/tls/handshake_server.go:216 +0x370 fp=0x4420491bf0 sp=0x4420491aa0 pc=0x5bd2f0 crypto/tls.(*Conn).serverHandshake(0x4428a0ce00, 0xb8ce98, 0x4428a0cf20) /usr/local/go/src/crypto/tls/handshake_server.go:48 +0x68 fp=0x4420491d10 sp=0x4420491bf0 pc=0x5bcc88 crypto/tls.(*Conn).Handshake(0x4428a0ce00, 0x0, 0x0)

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 18 (4 by maintainers)

Most upvoted comments

If you need crypto/tls to use your mutexes, the answer is to define the GetCertificate func:

https://golang.org/pkg/crypto/tls/#Config.GetCertificate

It is used before NameToCertificate and you can do any locking in there.

Are you building with a fork of the Go 1.9 source and a patch applied? Have you tried Go 1.10 or 1.11?

Also note the Config docs: https://golang.org/pkg/crypto/tls/#Config

A Config structure is used to configure a TLS client or server. After one has been passed to a TLS function it must not be modified. A Config may be reused; the tls package will also not modify it.

Good to hear! But with my suggestion, I hope you are making a full copy of the map, instead of what I suggested where in setTlsConfig the line tmp := theTlsConfig only copies a reference to the NameToCertificateMap

I’m trying to say: maybe you can just push the mutex one level out in Go?

This won’t work because the original Golang code in tls/common.go does not use a lock. We are already locking in our code to avoid the panic but it crashes inside tls/common.go while doing the TLS handshake.

From your example, as @adamdecaf noted, you’re locks aren’t locking the whole TestCaConfig object. I think it would work if you modified it so that every time a domain is inserted, a whole new tls.Config is created or copied from the old one.

To be more explicit, suppose we add a global variable theTlsConfig to your example, and an associated mutex theTlsMutex.

line 97 would have something like

TLSClientConfig: getTlsConfig()

where getTlsConfig is a function you define which returns the current TlsConfig with respect to all such configs that have been created (one for each insertion).

Then lines 73-75 would have to be replaced by something like

 setTlsConfig(domain, tlsc)

Then

func getTlsConfig() *tls.Config {
   theTlsMutex.Lock()
   defer theTlsMutex.Unlock()
   return &theTlsConfig
}
fun setTlsConfig(domain, tlsc) {  // sorry I didn't get the types for domain, tlsc but I think you get the point...
    theTlsMutex.Lock()
    defertheTlsMutex.Unlock()
    tmp := theTlsConfig
    tmp.NameToCertificateMap[domain] = tlsc
    theTlsConfig = tmp
}

It is awkward, but I would guess that would work for the example

disclaimer: I didn’t run it or test a variation, just seems like it would work.