reactor-netty: javax.net.ssl.SSLException: handshake timed out
System Architecture
As a client, we communicate with different devices (servers) at different IP addresses. All servers share a common root certificate for TLS, but each device has its own unique key/keystore.
Expected Behavior
No handshake timeout should occur.
When we call only a few devices concurrently (say, fewer than a threshold of about 4), the error does not occur. When the number of concurrent calls exceeds this threshold, we start seeing handshake timeouts.
We debugged with the option -Djavax.net.debug=ssl, but still could not figure out why the issue is happening.
There is no problem with the servers: when we call a single server at a time, we never encounter a handshake timeout. When we call multiple servers concurrently, some handshakes fail and some succeed.
I think there is a concurrency issue in reactor-netty, but I am unable to figure out where. Please share an architecture diagram for reactor-netty if one exists.
Any pointers would be helpful to resolve this issue.
Actual Behavior
We get the error below:
2019-11-20 13:07:22,108 361283 [reactor-http-epoll-8] WARN r.n.http.client.HttpClientConnect - [id: 0xab3f418a, L:/<ip1>:40554 - R:<ip2>/<ip2>:5000] The connection observed an error
javax.net.ssl.SSLException: handshake timed out
at io.netty.handler.ssl.SslHandler$5.run(SslHandler.java:2011)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:150)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:413)
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
reactor-http-epoll-8, called closeOutbound()
reactor-http-epoll-8, closeOutboundInternal()
reactor-http-epoll-8, SEND TLSv1.2 ALERT: warning, description = close_notify
reactor-http-epoll-8, WRITE: TLSv1.2 Alert, length = 26
reactor-http-epoll-8, called closeInbound()
reactor-http-epoll-8, fatal error: 80: Inbound closed before receiving peer's close_notify: possible truncation attack?
javax.net.ssl.SSLException: Inbound closed before receiving peer's close_notify: possible truncation attack?
%% Invalidated: [Session-1010, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256]
reactor-http-epoll-8, SEND TLSv1.2 ALERT: fatal, description = internal_error
reactor-http-epoll-8, Exception sending alert: java.io.IOException: writer side was already closed.
2019-11-20 13:07:22,110 361285 [org.springframework.kafka.KafkaListenerEndpointContainer#1-1-C-1] ERROR com.bmg.service.HttpService - javax.net.ssl.SSLException: handshake timed out, {}
reactor.core.Exceptions$ReactiveException: javax.net.ssl.SSLException: handshake timed out
at reactor.core.Exceptions.propagate(Exceptions.java:326)
at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:91)
at reactor.core.publisher.Mono.block(Mono.java:1494)
When the client calls a server, the handshake timeout occurs after around 1.15 min to 1.40 min.
Since the default handshake timeout is 10 seconds, we also tried setting the system property below, but the handshake timeout still occurs.
-Dreactor.netty.tcp.sshHandshakeTimeout=120000
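A -D flag only takes effect if its key matches, character for character, the key the library looks up. The following is a minimal sketch of that lookup, not reactor-netty's actual code. It assumes the key reactor-netty reads is reactor.netty.tcp.sslHandshakeTimeout and that the default is 10000 ms (the 10 seconds mentioned above). The class and method names are hypothetical.

```java
public class HandshakeTimeoutProperty {

    // Assumption: the key reactor-netty reads for the TLS handshake timeout.
    static final String ASSUMED_KEY = "reactor.netty.tcp.sslHandshakeTimeout";

    // Assumption: the 10 second default mentioned in this report.
    static final long DEFAULT_MILLIS = 10_000L;

    // Returns the timeout a lookup under ASSUMED_KEY would see,
    // falling back to the default when that exact key is absent.
    static long effectiveTimeoutMillis() {
        return Long.getLong(ASSUMED_KEY, DEFAULT_MILLIS);
    }

    public static void main(String[] args) {
        // A value stored under a different key is not seen by the lookup.
        System.setProperty("reactor.netty.tcp.sshHandshakeTimeout", "120000");
        System.out.println("with other key: " + effectiveTimeoutMillis());

        // A value stored under the looked-up key is picked up.
        System.setProperty(ASSUMED_KEY, "120000");
        System.out.println("with assumed key: " + effectiveTimeoutMillis());
    }
}
```

If the assumption about the key holds, it is worth checking the spelling of the flag before concluding that the setting has no effect.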
The handshake timeout did not occur after 2 minutes as per the above configuration; it occurred in the same range of 1.15 min to 1.40 min.
Even though the error says handshake timeout, it feels like the call is being queued internally, attempted only after some delay, and then the handshake timeout occurs.
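One possible, unconfirmed explanation for a timeout that fires later than configured: the stack trace shows the timeout task being run by the event loop thread itself (EpollEventLoop.run, runAllTasks), so if that thread is occupied, the scheduled task has to wait. The sketch below illustrates this effect with plain JDK classes only. A single-threaded ScheduledExecutorService stands in for an event loop. The class name, method names, and durations are hypothetical, and nothing here uses reactor-netty or Netty.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DelayedTimeoutSketch {

    // Schedules a "timeout" task for timeoutMillis on a single-threaded loop
    // that is first kept busy for blockMillis, and returns how many
    // milliseconds actually passed before the timeout task ran.
    static long observedTimeoutDelayMillis(long timeoutMillis, long blockMillis)
            throws InterruptedException {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        try {
            CountDownLatch fired = new CountDownLatch(1);
            long[] firedAt = new long[1];
            long start = System.nanoTime();

            // Blocking work that occupies the only loop thread.
            loop.execute(() -> sleepQuietly(blockMillis));

            // The timeout task cannot run until the loop thread is free again.
            loop.schedule(() -> {
                firedAt[0] = System.nanoTime();
                fired.countDown();
            }, timeoutMillis, TimeUnit.MILLISECONDS);

            fired.await();
            return TimeUnit.NANOSECONDS.toMillis(firedAt[0] - start);
        } finally {
            loop.shutdownNow();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Timeout configured for 50 ms, loop thread blocked for 300 ms.
        System.out.println("observed delay (ms): " + observedTimeoutDelayMillis(50, 300));
    }
}
```

In this sketch the observed delay tracks the blocking time rather than the configured timeout. Whether something similar happens in the setup described here would need to be confirmed, for example with a thread dump of the reactor-http-epoll threads while the calls are in flight.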
Steps to Reproduce
The JDK SSL context is used by reactor-netty. The HttpClient is obtained as shown below. Note: we use newConnection() instead of HttpClient.create() because otherwise there is a weird concurrency problem: instead of hitting one server, the request hits a different server, and we also used to get reactor.netty.http.client.PrematureCloseException (reference: https://projectreactor.io/docs/netty/release/reference/index.html#_connect).
// Builds an HttpClient that opens a new connection for each request.
public HttpClient getHttpClient(SslContext sslContext, int connectTimeOutInMilliSeconds,
        int readTimeOutInMilliSeconds) {
    HttpClient httpClient = HttpClient.newConnection().tcpConfiguration(tcpClient ->
            tcpClient
                    .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectTimeOutInMilliSeconds)
                    .doOnConnected(connection -> connection
                            .addHandlerLast(new ReadTimeoutHandler(readTimeOutInMilliSeconds,
                                    TimeUnit.MILLISECONDS))
                            // The write timeout reuses the read timeout value.
                            .addHandlerLast(new WriteTimeoutHandler(readTimeOutInMilliSeconds,
                                    TimeUnit.MILLISECONDS))));
    // Enable TLS only when an SslContext is supplied.
    if (sslContext != null) {
        httpClient = httpClient.secure(sslContextSpec -> sslContextSpec.sslContext(sslContext));
    }
    return httpClient;
}
Getting sslContext as below:
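private SslContext getTrustAllSslWebClient() {
    try {
        // Trusts every server certificate; insecure, suitable for testing only.
        return SslContextBuilder
                .forClient()
                .trustManager(InsecureTrustManagerFactory.INSTANCE)
                .build();
    } catch (SSLException e) {
        // Rethrow so the method has a result on every path and the failure is not swallowed.
        throw new IllegalStateException("Failed to build SslContext", e);
    }
}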
Minimal yet complete reproducer code (or URL to code)
This is difficult to reproduce without complete production setup.
Possible Solution
Your Environment
- Reactor version(s) used: reactor-netty:0.8.13.RELEASE
- Other relevant libraries versions (e.g. netty, …): netty -> 4.1.43.Final, derived from the Spring Boot parent version given below
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.10.RELEASE</version>
    <relativePath/> <!-- lookup parent from repository -->
</parent>
- JVM version (e.g. java -version): openjdk version "1.8.0_222". Also tried on Java 11.
- OS version (e.g. uname -a): 4.15.0-66-generic #75-Ubuntu
About this issue
- State: closed
- Created 5 years ago
- Reactions: 5
- Comments: 20 (6 by maintainers)
Closing this issue as there is not enough information to proceed with the investigation.
Hi, we are facing the same issue. Is the solution to set reactor.netty.ioWorkerCount to a higher value, such as 128? Also, how do we set this value (reactor.netty.ioWorkerCount=128)? We are using Java Spring Boot WebFlux code with WebClient.
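On the question of how to set the value: reactor.netty.ioWorkerCount is a JVM system property, so it can be passed on the command line as -Dreactor.netty.ioWorkerCount=128, or set in code before the application starts. The sketch below shows the second option under one assumption: that the library reads the property once, when its default event loop resources are first initialized, so it must be set before any WebClient or reactor-netty class is used. The class name and the commented-out Spring Boot call are hypothetical. Whether a higher worker count resolves the handshake timeout is not established in this thread.

```java
public class IoWorkerCountSketch {

    // Property key as named in the question above.
    static final String IO_WORKER_COUNT_KEY = "reactor.netty.ioWorkerCount";

    // Stores the worker count as a system property and returns the value
    // that a later Integer lookup under the same key would see.
    static int configureIoWorkerCount(int workers) {
        System.setProperty(IO_WORKER_COUNT_KEY, Integer.toString(workers));
        return Integer.getInteger(IO_WORKER_COUNT_KEY);
    }

    public static void main(String[] args) {
        // Assumption: this must run before any reactor-netty class is loaded.
        int configured = configureIoWorkerCount(128);
        System.out.println("ioWorkerCount set to " + configured);

        // Hypothetical application start, shown for ordering only:
        // SpringApplication.run(Application.class, args);
    }
}
```

Passing the -D flag on the command line avoids the ordering concern, because the property is then present before any application code runs.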