http-kit: SSL error on `doRead`
I’m getting a `javax.crypto.BadPaddingException`, which may be caused by a synchronization issue. I’m not sure whether this bug can be fixed in http-kit or only by updating the Java version 🤔
I’m using:
- http-kit 2.5.3
- OpenJDK 11.0.11 (2021-04-20)
Exception stack:

```
javax.crypto.BadPaddingException: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than or equal to IV size (8) + tag size (16)
    ?, in sun.security.ssl/decrypt
    ?, in sun.security.ssl/decodeInputRecord
    ?, in sun.security.ssl/decode
    ?, in sun.security.ssl/decode
    ?, in sun.security.ssl/decode
    ?, in sun.security.ssl/decode
    ?, in sun.security.ssl/readRecord
    ?, in sun.security.ssl/unwrap
    ?, in sun.security.ssl/unwrap
    ?, in javax.net.ssl/unwrap
    File "HttpsRequest.java", line 35, in org.httpkit.client/unwrapRead
        while ((res = engine.unwrap(peerNetData, peerAppData)).getStatus() == Status.OK) {
    File "HttpClient.java", line 191, in org.httpkit.client/doRead
        read = httpsReq.unwrapRead(buffer);
    File "HttpClient.java", line 494, in org.httpkit.client/run
        doRead(key, now);
    ?, in java.lang/run
```
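For context on where this blows up: `doRead` hands freshly read network bytes to `HttpsRequest.unwrapRead`, which loops over `SSLEngine.unwrap` to decrypt TLS records into an application buffer. Below is a minimal sketch of such an unwrap loop using plain `javax.net.ssl` (the class and buffer names are illustrative, not http-kit’s actual code). The point is that `unwrap` decodes whatever bytes sit in the network buffer as TLS records, so stale engine state or plain-text bytes surface as decode errors like the one above.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;

public class UnwrapSketch {
    // Decrypt as many complete TLS records from netBuffer into appBuffer as
    // possible, mirroring the loop shown in the stack trace above.
    static int unwrapAll(SSLEngine engine, ByteBuffer netBuffer, ByteBuffer appBuffer)
            throws SSLException {
        int produced = 0;
        SSLEngineResult res;
        // Each unwrap call decodes one TLS record. If netBuffer holds bytes
        // that are not a valid record for this engine's state (e.g. a
        // plain-text HTTP response), decoding throws BadPaddingException or
        // SSLProtocolException inside sun.security.ssl, as seen above.
        while ((res = engine.unwrap(netBuffer, appBuffer)).getStatus()
                == SSLEngineResult.Status.OK) {
            produced += res.bytesProduced();
            if (!netBuffer.hasRemaining()) {
                break; // no more buffered records
            }
        }
        return produced;
    }
}
```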
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 15 (2 by maintainers)
Commits related to this issue
- properly unrecycle the req when kept-alive conn wasn't able to be reused Fixes the SSLProtocolException "Input record too big" errors reported in #469. Moves more logic into recycle/unrecycle to make... — committed to xwang1498/http-kit by xwang1498 2 years ago
- [#469 #489] [Client] Properly unrecycle req when kept-alive conn wasn't able to be reused (@xwang1498) Fixes nasty client bug that could lead to "Input record too big" SSLProtocolException errors. t... — committed to http-kit/http-kit by xwang1498 2 years ago
@miikka @huima @seancorfield I think I’ve finally figured out what’s going on here, and it’s a nasty (but fixable!) bug in http-kit.
tl;dr
If reusing a kept-alive connection fails for some reason (e.g. the remote side closed it), http-kit will incorrectly use the old SSL engine when making a new connection.
Longer explanation
With the assistance of Wireshark, I found this sequence of events:

1. http-kit tries to send a request over a kept-alive TLS connection, but the remote side has already closed it.
2. http-kit opens a new connection to retry, but incorrectly keeps using the old SSL engine instead of creating a fresh one.
3. The server can’t make sense of the stale TLS state, so it replies with a plain-text `HTTP/1.1 400 Bad Request` and closes this new connection.
4. The old SSL engine tries to parse that plain-text reply as a TLS record and fails with an `SSLProtocolException: Input record too big: max = 16709 len = 20532` error (see my earlier comment for why it’s 20532).

This bug probably affects tons of people, because a kept-alive HTTPS connection getting closed would be a very common occurrence.
I’ve created pull request #489 that tries to address the problem.
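For readers who don’t want to dig through the diff, here is the invariant the fix is about, as a hypothetical sketch in plain `javax.net.ssl` (illustrative only; this is neither http-kit’s actual code nor the PR’s diff): a request retried on a new socket must get a brand-new `SSLEngine`, because the old engine still carries the handshake and sequence state of the dead connection.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class FreshEngineSketch {
    // When a kept-alive connection can't be reused and a new socket is
    // opened, create a fresh engine so a fresh handshake runs on the new
    // connection. Reusing the old engine sends bytes the server can't
    // interpret, which is how the plain-text "400 Bad Request" replies
    // come back.
    static SSLEngine engineForNewConnection(SSLContext ctx, String host, int port) {
        SSLEngine engine = ctx.createSSLEngine(host, port);
        engine.setUseClientMode(true);
        return engine;
    }
}
```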
We’re also seeing this from time to time in production, without the SNI-enabled configurer and without a specific client instance created, on a plain `post` that we dereference “immediately” before continuing on. So it’s just reusing the default client singleton.

We’re currently on a slightly older JDK 11, so the underlying issue could be that for us (we’re in the process of updating to the latest JDK 11 but plan to move to JDK 17 “soon”).

I’m going to update our code to use a fresh client instance at each call site to see if that reduces the occurrence of the error (to rule out some level of concurrency issue), although one call site is pretty high-traffic.
Hi @miikka, did you ever figure out what’s going on here?

I think I have a partial explanation. A Cloudflare-hosted service we talk to sometimes returns a plain unencrypted HTTP response (e.g. the raw bytes `HTTP/1.1 400 Bad Request ...`), despite it being a TLS connection. The TLS stack interprets these bytes as a TLS record, where bytes 3 & 4 (the `P/` in `HTTP/1.1`) are the record length, plus 5 bytes for the header: 0x502f + 5 = 20532 bytes!

@huima I can only point out that CloudFront and Cloudflare seem to have several differences in behavior.
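The arithmetic above is easy to verify. Here is a small self-contained check (the class and variable names are mine, just for illustration) that reads the first bytes of the plain-text response the way a TLS record parser would:

```java
import java.nio.charset.StandardCharsets;

public class RecordLengthDemo {
    public static void main(String[] args) {
        byte[] bytes = "HTTP/1.1 400 Bad Request".getBytes(StandardCharsets.US_ASCII);
        // A TLS record header is 5 bytes: type (1), version (2), length (2).
        // Parsed against "HTTP/", bytes 3 & 4 are 'P' (0x50) and '/' (0x2f),
        // so the stack reads a 16-bit record length of 0x502f = 20527.
        int recordLen = ((bytes[3] & 0xff) << 8) | (bytes[4] & 0xff);
        System.out.println(recordLen);     // 20527
        System.out.println(recordLen + 5); // + 5-byte header = 20532, as in the error
    }
}
```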
@xwang1498, I never figured it out, but your explanation makes a lot of sense! We didn’t have Cloudflare, but I can’t rule out some other service returning plain HTTP responses in some cases. Personally I’ve moved on, but @huima, check this out; you might find it interesting.
This reminded me I should have followed up on my report from September 2021: we ended up switching from `http-kit` to Hato because of this, after trying fresh client instances and also updating our JDK. And, yes, Cloudflare was in the mix for us as well.