openssl: AFAICT openssl's strategy for handling TLS 1.3 session tickets makes it impossible to reliably implement communication patterns where the server never sends application-level data
I maintain a Python networking library called Trio, and I’ve been struggling to get it working with Openssl v1.1.1/TLS 1.3. We use openssl with memory BIOs and have an extensive test suite that passes with earlier openssl versions, but hits a number of problems after upgrading to v1.1.1. The main issue seems to be the session tickets that openssl sends after the handshake in server mode, and how they affect connections where the client never calls SSL_read.
Due diligence: I found these previous issues/PRs that are all about the exact same issue that I’m facing, but having read them carefully I still can’t figure out how to make this work: #6342, #6904, #6944. Also, for reference, this is the Trio bug: https://github.com/python-trio/trio/issues/819
TCP (as I understand it)
Let’s ignore TLS for the moment and just talk about TCP. Suppose we have a client that connects, sends some data, and then disconnects, without ever calling recv. In pseudo-code:
```
# Socket client
sock = connect(...)
while ...:
    sock.send(...)
sock.close()
```

```
# Socket server (safe)
sock = accept()
while ...:
    sock.recv(...)
    if eof:
        sock.close()
        break
```
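Translated into a runnable stdlib sketch (loopback only; the thread and port plumbing are illustrative, not part of the pattern), the safe pattern delivers every byte:

```python
import socket
import threading

def run_client(port, payload):
    # Client side of the "safe" pattern: connect, send, close.
    # It never calls recv, and the server never sends, so close()
    # produces an orderly FIN and nothing can be lost.
    sock = socket.create_connection(("127.0.0.1", port))
    sock.sendall(payload)
    sock.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
payload = b"x" * 100_000
client = threading.Thread(target=run_client, args=(port, payload))
client.start()

# Server side: accept, read until EOF, then close.
conn, _ = listener.accept()
received = bytearray()
while True:
    chunk = conn.recv(65536)
    if not chunk:  # EOF: the client's FIN queued up behind all the data
        break
    received += chunk
conn.close()
listener.close()
client.join()
print(bytes(received) == payload)  # → True
```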
This isn’t a terribly common pattern, but it’s perfectly legal and reliable. Call this the safe pattern.
But, TCP has a gotcha: suppose we have the server send a bit of data, and change nothing else. In particular the client still never calls recv:
```
# Socket server (unsafe)
sock = accept()
sock.send(<one byte of data>)  # <--- This is the only line that's different
while ...:
    sock.recv(...)
    if eof:
        sock.close()
        break
```
Now, this is a little funny looking, because the server is sending some data that the client will never read. But, whatever, that shouldn’t affect anything, right? This shouldn’t affect the data being sent from the client→server, right?
Well, that would be logical, but it’s wrong! In the unsafe pattern, an arbitrary amount of the data sent by the client can be lost, even though the client code didn’t change at all.
This happens because of arcane details of how TCP works: in the safe pattern, when the client calls close, the client’s kernel sends a FIN packet, and the server’s kernel queues that up behind all the other data in the server’s receive buffer, and everything proceeds in an orderly fashion. But in the unsafe pattern, the client has incoming data. And if there’s incoming data before or after a close, then RFC 1122 says:
> If such a host issues a CLOSE call while received data is still pending in TCP, or if new data is received after CLOSE is called, its TCP SHOULD send a RST to show that data was lost.
So in this case, the client’s kernel might send a RST, instead of or in addition to the FIN. And then when the server’s kernel sees an RST packet, it discards all buffered data. So, if there’s any data that the client sent that’s still sitting in the server kernel buffers, it disappears forever.
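For illustration, an abortive close can be forced deterministically with `SO_LINGER(0)`, which makes `close()` send an RST directly – a stand-in for the RFC 1122 behaviour above. This is a Linux-flavoured sketch; exactly how many buffered bytes survive the reset varies by OS and timing:

```python
import socket
import struct
import threading
import time

def abortive_client(port):
    # Send a pile of data, then force an abortive close: SO_LINGER with
    # a zero timeout makes close() send an RST instead of an orderly FIN,
    # standing in for the RFC 1122 "unread data at close" behaviour.
    sock = socket.create_connection(("127.0.0.1", port))
    sock.sendall(b"x" * 100_000)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    sock.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
client = threading.Thread(target=abortive_client, args=(port,))
client.start()

conn, _ = listener.accept()
time.sleep(0.2)  # let the RST arrive while data is still buffered
received = 0
try:
    while True:
        chunk = conn.recv(65536)
        if not chunk:
            outcome = "clean eof"
            break
        received += len(chunk)
except ConnectionResetError:
    # The RST surfaces as ECONNRESET; bytes the kernel had buffered
    # (or that were still in flight) may be gone for good.
    outcome = "reset"
print(outcome, "after", received, "bytes")
client.join()
listener.close()
```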
References:
- https://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable
- http://blog.olivierlanglois.net/index.php/2010/02/06/tcp_rst_flag_subtleties
- The websocket RFC cites this as motivation for that protocol’s shutdown semantics: https://tools.ietf.org/html/rfc6455#section-1.4
What does this have to do with TLS?
Generally speaking, it should be possible to take any application that uses raw TCP, and switch it to use TLS-over-TCP instead, right? (With the notable exception of half-closed connections, but never mind.) So let’s port our client/server to use TLS:
```
# TLS-over-TCP client
tlssock = connect(...)
tlssock.do_handshake()
while ...:
    tlssock.send(...)
tlssock.send_close_notify()  # Often skipped in practice, but let's be standards-compliant
tlssock.close_tcp()
```

```
# TLS-over-TCP server ("safe")
tlssock = accept()
tlssock.do_handshake()
while ...:
    tlssock.recv(...)
    if eof:
        tlssock.send_close_notify()
        tlssock.close_tcp()
        break
```
Now here’s the issue: with TLS 1.2 and earlier, if we follow the “safe pattern” at the application layer, like this, then openssl will ultimately translate that into the “safe pattern” at the TCP layer. But with TLS 1.3, openssl’s habit of sending session tickets after the handshake means that this exact same code now produces the unsafe pattern at the TCP layer. The server→client session tickets could cause the client→server application data to be lost.
#6944 changes how openssl reacts to getting notified of a client close while sending session tickets, so that the server can keep calling recv. But that doesn’t help if the kernel has already discarded the buffer that recv is trying to read out of. The problem here is all inside the TCP stacks; there’s nothing openssl can do about it, except avoid sending the session tickets in the first place.
There’s also a secondary problem, but it’s more theoretical: if the server→client buffer is small enough, then this code could deadlock at the handshake – the server’s call to `SSL_do_handshake` won’t return until the client calls `SSL_read`, but the client is calling `SSL_write`, which will eventually block until the server calls `SSL_read`, and the server can’t do that because it’s still waiting for the client to call `SSL_read`… I’m not sure whether there are any realistic cases where people use buffers small enough to trigger this, though. (We have some torture tests with small buffers to flush out problems like this, which of course did catch it.)
What to do?
Of course we can disable session tickets, but this is difficult in our case because Python’s openssl bindings don’t expose SSL_set_num_tickets and SSL_CTX_set_session_ticket_cb. Also, as a generic networking library, we wouldn’t want to disable session tickets in general. But, we also really don’t want to have to explain to our users that enabling TLS is just a matter of switching from a SocketStream to a SSLStream, the APIs are identical except that if you happen to know that your client might not ever read data, then on the server side you have to call this special API before the handshake. That’s a super leaky abstraction.
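(For later readers: CPython 3.8+ did eventually add a binding for `SSL_CTX_set_num_tickets`, exposed as `SSLContext.num_tickets` on server-side contexts; it did not exist when this issue was filed. A minimal sketch:)

```python
import ssl

# SSLContext.num_tickets (Python 3.8+) wraps SSL_CTX_set_num_tickets;
# it can only be set on server-side contexts.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.num_tickets = 0  # ask OpenSSL not to send TLS 1.3 session tickets
print(ctx.num_tickets)  # → 0
```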
We could do full bidirectional-shutdown, as suggested here. This seems problematic, though – the RFCs explicitly say that bidi shutdown is never required, bidi shutdown has always been useless before, and I’ve never heard of any existing software that does it. Trio doesn’t have any way to force its peers to use it. If OpenSSL is going to say that all TLS 1.3 implementations that want to interoperate with OpenSSL have to switch to bidi shutdown, then that seems like a huge ask. Also, isn’t TLS 1.3 supposed to reduce the number of round-trips we make?
The best idea I have is:
- don’t send tickets automatically after the TLS 1.3 handshake
- automatically send tickets the first time the server calls `SSL_write`
- also provide an explicit `SSL_write_tickets` function to send tickets immediately
About this issue
- State: closed
- Created 6 years ago
- Reactions: 9
- Comments: 32 (24 by maintainers)
Commits related to this issue
- Temporarily (!) disable TLS v1.3 in our test suite It's currently a mess because of how openssl v1.1.1 handles session tickets. There isn't consensus yet about whether this is an openssl bug or what:... — committed to njsmith/trio by njsmith 5 years ago
- tlscontext: add workaround for a TLS 1.3 bug to prevent data loss This is a workaround for an OpenSSL TLS 1.3 bug that results in data loss when one-way protocols are used and a connection is closed ... — committed to MrAnno/syslog-ng by MrAnno 4 years ago
- tlscontext: add workaround for a TLS 1.3 bug to prevent data loss This is a workaround for an OpenSSL TLS 1.3 bug that results in data loss when one-way protocols are used and a connection is closed ... — committed to balabit-deps/syslog-ng-6.0-core by alltilla 4 years ago
- Use bidirectional shutdown on default Closing the TCP connection after sending a TLS shutdown without reading on the TCP channel could lead to data loss, especially on write-only connections, like FT... — committed to mezen/Indy by mezen a year ago
For what it’s worth, this issue is not just hypothetical. The scenario described by https://github.com/openssl/openssl/issues/7948#issuecomment-562946766 is exactly what happens during an FTPS upload when the data connection uses TLSv1.3, where the server only ever reads and the client only ever writes, resulting in an `ECONNRESET` from the client seen by the server end. For uploads, FTP has no framing for indicating the end of the data stream other than closing the data-transfer TCP connection.

The workaround I used was to accept TLSv1.3 session tickets for the data connection TLS handshake, but not to renew them, so that the server does not do any TCP writes on the data connection for uploads using TLSv1.3.
@rdp Disabling tickets is quite tricky; you need to use both `SSL_CTX_set_num_tickets` and define a custom session ticket callback. And of course then you lose session tickets.

We did find and deploy a workaround for this that I think makes TLS 1.3 work correctly on all openssl versions, while still supporting session tickets. Instead of letting openssl do I/O directly, we use a memory BIO and manually copy data to/from the socket. After `SSL_do_handshake` completes, we check whether we’re running in server mode, and whether we negotiated TLS v1.3. If so, then we capture the last volley of data that openssl asked us to send, and we don’t send it – instead we stash it in an internal buffer, and send it the first time we have application data to write.

This is convoluted and gross, but I believe it is a correct and full workaround. And unless the openssl devs decide to fix this bug and backport it to v1.1.1, I guess this is the only way to get correct TLS 1.3 support from openssl on RHEL 8.
Analysis of why this workaround is correct: https://github.com/python-trio/trio/issues/819#issuecomment-517529763
Example implementation: https://github.com/python-trio/trio/pull/1171
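A schematic version of that workaround, with the buffering logic reduced to its essentials (this is illustrative, not Trio’s actual code; `sslobj` and `outgoing_bio` stand in for an `ssl.SSLObject` and its outgoing `ssl.MemoryBIO`, and `send_all` is whatever callback writes raw bytes to the socket):

```python
class DeferredTicketSender:
    """Schematic sketch of the workaround described above -- not Trio's
    actual code. In server mode with TLS 1.3 negotiated, the final
    post-handshake volley (the session tickets) is stashed instead of
    sent, and flushed together with the first application-data write."""

    def __init__(self, sslobj, outgoing_bio, send_all):
        self._sslobj = sslobj          # stands in for an ssl.SSLObject
        self._outgoing = outgoing_bio  # stands in for its outgoing ssl.MemoryBIO
        self._send_all = send_all      # callback that writes raw bytes to the socket
        self._stashed = b""

    def after_handshake(self, is_server, is_tls13):
        # Capture, rather than send, whatever openssl queued right after
        # the handshake: for a TLS 1.3 server, that's the session tickets.
        pending = self._outgoing.read()
        if is_server and is_tls13:
            self._stashed = pending
        elif pending:
            self._send_all(pending)

    def write(self, data):
        # Encrypt the application data, then flush the stashed tickets
        # along with it, so the server never does a bare TCP write that
        # a write-only client could turn into an RST.
        self._sslobj.write(data)
        out = self._stashed + self._outgoing.read()
        self._stashed = b""
        if out:
            self._send_all(out)
```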
This can also come up in non-closing scenarios. Consider an HTTP/1.1 client. HTTP/1.1 never reads and writes at the same time. Typically, the client sends the entire HTTP request and only tries to read once it’s finished with that. This means OpenSSL will deadlock if neither the ticket (which scales with client cert size) nor the HTTP request (which is not under OpenSSL’s control) fits in the transport receive buffers. BoringSSL defers the ticket write to the first `SSL_write` to avoid this problem.

If you get a reset, you won’t (reliably) receive the data. TCP treats resets as error conditions and will drop buffers, truncate sequence numbers, stop trying to retransmit, etc., when one happens. The first half of the bug report discusses this. We ran into a similar situation trying to make client certificate alerts (which are now post-handshake from the client’s perspective in TLS 1.3) reliably delivered, though I think that one may be unfixable.
However, deferring the ticket write to the first `SSL_write` should avoid the reset. It means TLS never (outside of some error conditions like client cert errors) triggers transport writes without application writes. That means TCP resets from an unexpected ticket if and only if it would have done so from unexpected application data anyway.

This does mean the reset is avoided by saying a server which never writes never sends tickets, but I don’t see another option under this I/O pattern. Also, protocols without server writes likely have no client reads, so the client won’t pick up the ticket anyway.
In general, the TLS library should avoid writing “out of turn”, or it will confuse application-level handling of transport backpressure and resets. (See also the mentions of surprising I/O patterns in https://github.com/openssl/openssl/issues/8677. I think OpenSSL only defers if the write buffer is busy. This may still cause TCP resets in this I/O pattern if the peer’s TLS library sends KeyUpdates transparently. BoringSSL unconditionally defers to `SSL_write`.)

Closing, as it is possible to work around it in the application by https://github.com/openssl/openssl/pull/11416 on 3.0.
1.1.1 is security-fix-only.
I think it should be the default behaviour, and in 1.1.1 too.
Oh, sorry, I just realized by “send something anyway”, you might have meant the `close_notify`. I think that is fine as long as the server sends `close_notify` only after it’s done reading whatever it would have tried to read. Then the reset won’t drop anything, because it’s all been read. But this is also super weird. IMO TCP is being overaggressive in classifying out-of-turn writes as an error condition. Or perhaps it needs a notion of droppable “FIN data”. But it’s a bit late to fix that. 😕

On the topic of `close_notify`, some comments on the earlier discussion:
close_notify, some comments to the earlier discussion:It’s true that ignoring
close_notifyis, sadly, required for interop with the HTTPS ecosystem because a lot of HTTPS servers fail to sendclose_notify. (As recently as a few years ago, I accidentally made Chrome enforceclose_notifyduring a refactor. This broke loads of sites and had to get quickly reverted.)However, there is no interop requirement to not send
close_notify. It is good to send it (or, for HTTP, exclusively use one of the self-delimited options, see below) so you don’t contribute to the interop problem.HTTP/2 frames are self-delimited and streams have an explicit close, so there isn’t a truncation problem there. Transport EOF should not be considered a clean stream closure, whether or not it’s authenticated.
HTTP/1.1 is a complex mess of a text protocol and has potential truncation problems. First, the header block is delimited by two newlines. It is important that an HTTPS parser either enforce `close_notify` (which alas is not interoperable) or enforce the two newlines. Otherwise it is vulnerable to a truncation attack.

Second, there are three ways to delimit message bodies:

1. A `Content-Length` header: count up that many bytes.
2. A `Transfer-Encoding: chunked` header: each chunk has a length prefix and there is an explicit end chunk.
3. Read until the connection closes.

(1) and (2) are self-delimited. You don’t need `close_notify`, provided you do not treat transport EOF as a substitute for the in-band termination signal.

(3) has truncation problems and needs `close_notify`. Sadly, that is also exactly the case where clients hit interop issues if they enforce `close_notify`. 😦 Thus servers should never use (3). Use (1) or (2) or HTTP/2. This is necessary for connection reuse too.

Text protocols are the worst.
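A toy completeness check makes the difference concrete (illustrative only, not a real parser; `body_complete` and its arguments are made up for this sketch):

```python
def body_complete(headers, body, transport_eof):
    """Toy check: can we tell the body was not truncated?
    `headers` maps lowercased header names to values."""
    if "content-length" in headers:
        # (1) Self-delimited: count up that many bytes.
        return len(body) >= int(headers["content-length"])
    if headers.get("transfer-encoding") == "chunked":
        # (2) Self-delimited: the zero-length end chunk marks the end.
        return body.endswith(b"0\r\n\r\n")
    # (3) EOF-delimited: a bare TCP FIN (or an attacker's RST!) is
    # indistinguishable from the genuine end of the body, so this is
    # truncation-vulnerable unless close_notify is enforced.
    return transport_eof

print(body_complete({"content-length": "5"}, b"hello", False))   # True
print(body_complete({"transfer-encoding": "chunked"},
                    b"5\r\nhello\r\n0\r\n\r\n", False))          # True
print(body_complete({}, b"hel", True))  # True -- but might be truncated!
```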
In summary, the problem is that the client might have sent all its data, sent a `close_notify`, and closed the connection before the server tries to send the session tickets. The server tries to write the tickets before it reads the data from the client. When the server writes the tickets after the client has closed the connection, the TCP connection will be reset, and the server can’t actually read what the client sent.
The solution is not to write any tickets as part of the handshake, but at a later time, when we would send something anyway. We might still get a reset, but we would have received all the data. It also doesn’t change anything for a client that does a bidirectional shutdown.