openssl: Fatal - Unexpected Message returned with OpenSSL 3.0
Since migrating to OpenSSL 3.0 we are experiencing intermittent issues in TLS handshakes.
Old env: Ubuntu 21.10 / Postfix 3.5.6 / OpenSSL 1.1.1l
New env: Ubuntu 22.04 / Postfix 3.6.4 / OpenSSL 3.0.2 (daily updated to latest available patches)
Narrowed this down to Java / JavaMail clients connecting to the Postfix service.
Network trace shows that any time Postfix responds to a ClientHello containing a SessionId with a ServerHello with session_id_length = 0, the client returns a Fatal alert, unexpected_message.
About this issue
- State: open
- Created 2 years ago
- Comments: 46 (19 by maintainers)
Commits related to this issue
- Don't use a non-resumable session if it is returned from the cache We were incorrectly handling the case where the server side cache returns a non-resumable session. We ended up attempting to use the... — committed to mattcaswell/openssl by mattcaswell 2 years ago
- Free up space in the session cache before adding. Fixes #18690 In some circumstances, it's possible that when using an external database for the session cache, that pulling in an entry from that cac... — committed to tmshort/openssl by tmshort 2 years ago
- Free up space in the session cache before adding. Fixes #18690 In some circumstances, it's possible that when using an external database for the session cache, that pulling in an entry from that cac... — committed to openssl/openssl by tmshort 2 years ago
- Free up space in the session cache before adding. Fixes #18690 In some circumstances, it's possible that when using an external database for the session cache, that pulling in an entry from that cac... — committed to sftcd/openssl by tmshort 2 years ago
No need. I’ve identified the issue. It is an OpenSSL 3.0 session cache management regression.
You don’t need to downgrade Postfix or OpenSSL. It suffices to just set:
which is its default and recommended value. Yes, clients that don’t support tickets won’t be able to make effective use of TLS resumption, but that’s unlikely to be a major issue under realistic message rates. See the follow-up to the “postfix-users” list.
Shared the -s 0 pcap (3.7MB) via email.
Checked a failing transmission again, and the ec_point_formats don’t show up in Wireshark’s UI for the ServerHello, e.g. frame 206 in the pcap. The only extension I see is renegotiation_info. The “random” in that ServerHello seems odd, rendered as text it shows DDOWNGRD?
Let me know if there’s anything I can do to help here.
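For context on the DOWNGRD text in the random: RFC 8446 (section 4.1.3) has a TLS 1.3-capable server that negotiates an older version overwrite the last eight bytes of its ServerHello random with a fixed marker, so seeing it in a TLS 1.2 handshake is expected rather than a sign of corruption. A minimal Python sketch of that check follows; the function name `downgrade_sentinel` is made up for illustration and is not part of any library.

```python
from typing import Optional

# Sentinel values defined in RFC 8446, section 4.1.3.
DOWNGRADE_TLS12 = b"DOWNGRD\x01"  # server negotiated TLS 1.2
DOWNGRADE_TLS11 = b"DOWNGRD\x00"  # server negotiated TLS 1.1 or below


def downgrade_sentinel(server_random: bytes) -> Optional[str]:
    """Report which downgrade marker, if any, ends a ServerHello random.

    Hypothetical helper: takes the raw 32-byte random copied from a
    packet capture and inspects only its last eight bytes.
    """
    if len(server_random) != 32:
        raise ValueError("ServerHello random must be exactly 32 bytes")
    tail = server_random[-8:]
    if tail == DOWNGRADE_TLS12:
        return "TLS 1.2"
    if tail == DOWNGRADE_TLS11:
        return "TLS 1.1 or below"
    return None  # no marker present
```

The input would be the random field from the failing ServerHello (frame 206 in the pcap mentioned above), pasted as bytes.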
A successful Server Hello response does contain the ec_point_formats. Note that here the session id is echoed.
I know barely enough to be dangerous, clueless would not surprise me! It was a bit of a crash job when I created these servers some years ago as a Postfix noob. The relevant Postfix TLS config (from our code repo) below.
I’ll need to shorten the pcap a bit to send via email, will get to that tomorrow! I can then check the postconf output as well. The code repo has neither tls_ssl_options nor smtpd_tls_always_issue_session_ids.
Again, thank you very much for spending your valuable time on this!
Indeed the Postfix tlsproxy supports multiple concurrent non-blocking SSL connections in a single thread via an event loop. The current not_resumable bit is a fragile mechanism that needs rework, but the internal cache is now exhibiting a few too many warts and needs some TLC. 😦
If neither @tmshort nor @mattcaswell volunteers to craft the PR the way they prefer, I might have to do it myself, but full disclosure, I might be inclined to rip out the time ordering code. Is it demonstrably a win? What is its justification?
I don’t see a compelling case made in #8687 as to why it was a good idea.
Technical nit: Postfix does not respond to ClientHello messages, OpenSSL does.
That aside, you’ve not been very specific in your error report:
- [...] supported_versions extension, or only TLS 1.2?
- [...] tshark decodes of the client and server TLS HELLO messages? (Please collect and post, or else provide a full PCAP capture of a failed handshake, without packet truncation, i.e. -s 0 option of tcpdump).
With TLS 1.3, the server needs to echo the client’s “legacy” session id. With TLS 1.2, the server signals non-support for session resumption by returning an empty session-id.
If the Java clients are refusing to interoperate with empty session ids in the server response, barring additional evidence, I rather think the bug is on the Java side. A PCAP file of the failed handshake would be useful here, if you’re willing to “leak” the IP addresses and server SNI involved.
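To make the session-id comparison concrete, here is a rough Python sketch that pulls the session_id field out of raw ClientHello and ServerHello handshake messages and labels what the server returned. It assumes the bytes are the handshake message only, with the 5-byte TLS record header already stripped, and both function names are hypothetical helpers, not part of OpenSSL, Postfix or tshark.

```python
def hello_session_id(handshake: bytes) -> bytes:
    """Extract session_id from a ClientHello (type 1) or ServerHello (type 2).

    Layout of both messages up to the field of interest:
    1 byte msg_type, 3 bytes length, 2 bytes legacy_version,
    32 bytes random, 1 byte session_id length, then the session_id.
    """
    if len(handshake) < 39:
        raise ValueError("handshake message too short to hold a session_id")
    if handshake[0] not in (1, 2):
        raise ValueError("not a ClientHello or ServerHello")
    id_len = handshake[38]  # 1 + 3 + 2 + 32 bytes precede the length byte
    session_id = handshake[39:39 + id_len]
    if len(session_id) != id_len:
        raise ValueError("truncated session_id")
    return session_id


def describe_server_session_id(client_hello: bytes, server_hello: bytes) -> str:
    """Label the ServerHello session_id relative to the one the client sent."""
    offered = hello_session_id(client_hello)
    returned = hello_session_id(server_hello)
    if not returned:
        return "empty"   # TLS 1.2: server will not offer ID-based resumption
    if returned == offered:
        return "echoed"  # TLS 1.2 resumption, or the mandatory TLS 1.3 echo
    return "new"         # TLS 1.2 full handshake with a fresh session id
```

For the failing handshakes described in this report, the expected label would be "empty", while the successful ones would come out as "echoed".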