envoy: route match with tls client certificate does not work with session resumption

Description: I configured optional mutual TLS (mTLS) per path using RouteMatch.TlsContextMatchOptions with Envoy 1.22.0. However, when a cURL/OpenSSL client resumes a TLS session, Envoy fails to route the request and the client gets a 404 (NR, no route configured). A fresh TLS connection works as expected: Envoy routes the request when a client certificate is presented and returns 404 when it is not.

When the client resumes the TLS session, the Envoy access log shows that %DOWNSTREAM_PEER_SUBJECT% is logged correctly. I suspect that TlsContextMatchOptions does not account for resumed sessions when evaluating the presented and validated conditions, because the certificate was validated only when the original session was first created.

Repro steps: Envoy 1.22.0 with cURL 7.83.0 + OpenSSL 1.1.1n on macOS. Use the test.py in https://gist.github.com/rafan/b5474d575d69346b3abcc3b7d381a6b1 to have cURL send two requests 3 seconds apart. For the second request, cURL re-creates the connection (because CURLOPT_MAXAGE_CONN is set to 1 second) and attempts TLS session resumption. Envoy then returns 404:

* Too old connection (3 seconds idle), disconnect it
* Connection 0 seems to be dead                                                                   
* Closing connection 0                           
* Hostname 127.0.0.1 was found in DNS cache
*   Trying 127.0.0.1:29443...                                                                     
* Connected to 127.0.0.1 (127.0.0.1) port 29443 (#1)
* ALPN: offers h2
* ALPN: offers http/1.1
*  CAfile: /tmp/cacert
*  CApath: none
* SSL re-using session ID
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: C=US; ST=Denial; L=Springfield; O=Dis; CN=mysite.com
*  start date: May 11 09:27:36 2022 GMT
*  expire date: May 11 09:27:36 2023 GMT
*  issuer: C=US; ST=Denial; L=Springfield; O=Dis; CN=mysite.com
*  SSL certificate verify ok.
> GET /test_service HTTP/1.1
Host: 127.0.0.1:29443
User-Agent: PycURL/7.45.1 libcurl/7.83.0 (SecureTransport) OpenSSL/1.1.1n zlib/1.2.11 brotli/1.0.9 zstd/1.5.2 libidn2/2.3.2 libssh2/1.10.0 nghttp2/1.47.0 librtmp/2.3 OpenLDAP/2.6.1
Accept: */*
Content-type: application/json

* old SSL session ID is stale, removing
* old SSL session ID is stale, removing
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found 
< date: Wed, 11 May 2022 09:44:02 GMT
< server: envoy
< content-length: 0
< 
* Connection #1 to host 127.0.0.1 left intact
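For anyone without pycurl at hand, the same sequence can be approximated with Python's standard ssl module. This is only a rough sketch and not the test.py from the gist: the host, port, path, and CA file are taken from the cURL output above, while the client certificate and key paths and the helper names (build_request, parse_status, fetch) are placeholders made up for illustration. Whether the second connection actually resumes may depend on the Python and OpenSSL versions in use.

```python
import socket
import ssl
import time

# Host, port and path are taken from the cURL output above.
HOST, PORT, PATH = "127.0.0.1", 29443, "/test_service"


def build_request(path: str, host: str) -> bytes:
    """Minimal HTTP/1.1 GET that asks the server to close the connection."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def parse_status(response: bytes) -> int:
    """Extract the status code from the first line of a raw HTTP response."""
    return int(response.split(b"\r\n", 1)[0].split(b" ")[1])


def fetch(ctx: ssl.SSLContext, session=None):
    """Send one request on a new TCP connection.

    Returns (status, session_reused, session) so the session can be
    offered again on the next connection.
    """
    with socket.create_connection((HOST, PORT)) as raw:
        with ctx.wrap_socket(raw, session=session) as tls:
            tls.sendall(build_request(PATH, f"{HOST}:{PORT}"))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
            # With TLS 1.3 the session ticket arrives after the handshake,
            # so the session is read only after data has been received.
            return parse_status(b"".join(chunks)), tls.session_reused, tls.session


if __name__ == "__main__":
    ctx = ssl.create_default_context(cafile="/tmp/cacert")
    ctx.check_hostname = False  # the server certificate is for mysite.com, not 127.0.0.1
    # Placeholder paths: substitute the real client certificate and key.
    ctx.load_cert_chain("/tmp/client.crt", "/tmp/client.key")

    status, reused, session = fetch(ctx)
    print("first request: ", status, "resumed:", reused)
    time.sleep(3)
    status, reused, _ = fetch(ctx, session)
    print("second request:", status, "resumed:", reused)
```

If the behavior reported here reproduces, the second line should show resumed: True together with a 404, though I have not verified this sketch against the setup in the gist.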

Admin and Stats Output: https://gist.github.com/rafan/b5474d575d69346b3abcc3b7d381a6b1

Config: https://gist.github.com/rafan/b5474d575d69346b3abcc3b7d381a6b1

Logs: https://gist.github.com/rafan/b5474d575d69346b3abcc3b7d381a6b1, access log:

[2022-05-11T09:29:18.761Z] "GET /test_service HTTP/1.1" 200 - 0 19 4 1 "-" "PycURL/7.45.1 libcurl/7.83.0 (SecureTransport) OpenSSL/1.1.1n zlib/1.2.11 brotli/1.0.9 zstd/1.5.2 libidn2/2.3.2 libssh2/1.10.0 nghttp2/1.47.0 librtmp/2.3 OpenLDAP/2.6.1" "bec7c827-ad10-452d-bf0d-66327bca2b6a" "127.0.0.1:29443" "172.27.0.2:8080" "CN=mysite.com,O=Dis,L=Springfield,ST=Denial,C=US" "-"
[2022-05-11T09:29:21.791Z] "GET /test_service HTTP/1.1" 404 NR 0 0 0 - "-" "PycURL/7.45.1 libcurl/7.83.0 (SecureTransport) OpenSSL/1.1.1n zlib/1.2.11 brotli/1.0.9 zstd/1.5.2 libidn2/2.3.2 libssh2/1.10.0 nghttp2/1.47.0 librtmp/2.3 OpenLDAP/2.6.1" "48f1133e-15c2-442e-8641-8441baf97791" "127.0.0.1:29443" "-" "CN=mysite.com,O=Dis,L=Springfield,ST=Denial,C=US" "-"
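To compare the two entries side by side, a small helper along these lines can pull out the relevant fields. It is a sketch only: the field positions are inferred from the two sample lines above rather than from the actual access log format string (which is in the gist), and parse_access_log is a made-up name.

```python
import shlex


def parse_access_log(line: str) -> dict:
    """Split one access log line into the fields relevant to this issue.

    Field positions are an assumption based on the sample lines above.
    """
    fields = shlex.split(line)  # honors the double-quoted fields
    return {
        "request": fields[1],
        "status": int(fields[2]),
        "flags": fields[3],
        "upstream_host": fields[-3],
        "peer_subject": fields[-2],
    }


if __name__ == "__main__":
    import sys

    # Usage: python parse_log.py < access.log
    for raw_line in sys.stdin:
        if raw_line.strip():
            print(parse_access_log(raw_line))
```

Applied to the two lines above, both entries carry the same peer_subject, but only the first has an upstream host; the second has flags NR and "-" for the upstream.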

Call Stack: n/a

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 18 (14 by maintainers)

Most upvoted comments

@ggreenway any thoughts on how require_client_certificate in the downstream TLS context works with a resumed TLS session? According to the cURL output, the session is still resumed successfully without a (re-)handshake.

I haven’t tested this, but my assumption is that a session would never get established if require_client_certificate is set and an invalid cert is presented, thus there would be nothing to resume.

Yes, documentation updates are a good idea.

And I agree that, because session-id resumption doesn’t exist in TLSv1.3, it’s not worth implementing a fix for it, since that fix would only apply to older TLS versions.

It would not be difficult to add an option to disable session-id resumption also, in order to make this work in all cases.

It looks like presented works correctly with resumption, but validated does not. I’m trying to figure out if there’s a good way to fix it. If untrusted/unvalidated certs are allowed, I’m not sure there’s a way to know on the resumed session whether the original cert was valid. If untrusted/unvalidated certs are disallowed, we could know that the cert was validated simply by virtue of a session resumption occurring.
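The reasoning in the comment above can be written down as a small decision function. This is a toy model of the argument only, not Envoy code: the function name and parameters are invented for illustration, and allow_unvalidated stands in for whatever configuration permits untrusted or unvalidated client certificates.

```python
from typing import Optional


def infer_validated(
    presented: bool,
    resumed: bool,
    handshake_validated: Optional[bool],
    allow_unvalidated: bool,
) -> Optional[bool]:
    """Return True/False when the validation status is known, None when it is not.

    handshake_validated is the verification result of a full handshake and is
    only meaningful when resumed is False.
    """
    if not presented:
        # No client certificate at all, so there is nothing to validate.
        return False
    if not resumed:
        # Full handshake: the verification result is directly available.
        return handshake_validated
    if not allow_unvalidated:
        # A session could only have been established with a validated
        # certificate, so the resumption itself implies validation.
        return True
    # The original certificate may or may not have been valid.
    return None
```

The None branch is the open question from the comment: when unvalidated certificates are allowed, the resumed session alone does not say whether the original certificate passed validation.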