reactor-netty: syscall:read(..) failed: Connection reset by peer
Actual behavior
@Bean
public WebClient webClient(ReutersSetting reutersSetting, ExchangeStrategies exchangeStrategies) {
    return WebClient.builder()
            .baseUrl(ReutersEndPoints.HOST)
            .defaultHeader(HEADER_APP_ID, reutersSetting.getApplicationId())
            .exchangeStrategies(exchangeStrategies)
            .build();
}

@Bean
public ExchangeStrategies exchangeStrategies() {
    ObjectMapper mapper = objectMapper();
    return ExchangeStrategies
            .builder()
            .codecs(clientDefaultCodecsConfigurer -> {
                clientDefaultCodecsConfigurer.defaultCodecs().jackson2JsonEncoder(new Jackson2JsonEncoder(mapper, MediaType.APPLICATION_JSON));
                clientDefaultCodecsConfigurer.defaultCodecs().jackson2JsonDecoder(new Jackson2JsonDecoder(mapper, MediaType.APPLICATION_JSON));
            }).build();
}

public ObjectMapper objectMapper() {
    return Jackson2ObjectMapperBuilder
            .json()
            .failOnUnknownProperties(false)
            .featuresToEnable(SerializationFeature.WRAP_ROOT_VALUE)
            .featuresToEnable(DeserializationFeature.UNWRAP_ROOT_VALUE)
            .build();
}
A WebClient initialized with the configuration above occasionally throws an exception such as the one below.
2019-01-07 11:33:22.188 ERROR [-,,,] 92270 --- [reactor-http-epoll-4] r.n.resources.PooledConnectionProvider : [id: 0x6f488001, L:/xx.xx.xx.xx:53500 - R:api.trkd.thomsonreuters.com/xx.xx.xx.xx:443] Pooled connection observed an error
io.netty.channel.unix.Errors$NativeIoException: syscall:read(..) failed: Connection reset by peer
at io.netty.channel.unix.FileDescriptor.readAddress(..)(Unknown Source)
The WebClient is called regularly from a Scheduled task.
Steps to reproduce
Happens randomly.
Reactor Netty version
reactor-netty:0.8.3.RELEASE
JVM version (e.g. java -version)
openjdk version “11” 2018-09-25 OpenJDK Runtime Environment 18.9 (build 11+28) OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
OS version (e.g. uname -a)
Linux xxxx 3.10.0-862.9.1.el7.x86_64 #1 SMP Mon Jul 16 16:29:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
About this issue
- State: closed
- Created 5 years ago
- Reactions: 12
- Comments: 36 (9 by maintainers)
Most of the issues here were caused by a proxy/server that closes connections after some timeout. Because of this, we are going to expose a configuration property for switching the pool’s lease strategy from FIFO to LIFO (FIFO is the default): #962
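To make the FIFO versus LIFO difference concrete, here is a toy model in plain Java. It is only a sketch and not Reactor Netty's pool implementation: the class `LeaseStrategyDemo`, its methods, and the 60-second peer idle timeout are all hypothetical and chosen purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of an idle-connection pool; NOT Reactor Netty's implementation. */
class LeaseStrategyDemo {

    /** A pooled connection that remembers when it was last released to the pool. */
    static final class Conn {
        final int id;
        final long lastUsedMillis;

        Conn(int id, long lastUsedMillis) {
            this.id = id;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    /** Leases the next idle connection: oldest first for FIFO, newest first for LIFO. */
    static Conn lease(Deque<Conn> idle, boolean lifo) {
        return lifo ? idle.pollLast() : idle.pollFirst();
    }

    /** Assumption: the remote peer or proxy drops connections idle longer than the timeout. */
    static boolean closedByPeer(Conn c, long nowMillis, long peerIdleTimeoutMillis) {
        return nowMillis - c.lastUsedMillis > peerIdleTimeoutMillis;
    }

    /** Three idle connections, ordered from least to most recently used. */
    static Deque<Conn> samplePool() {
        Deque<Conn> idle = new ArrayDeque<>();
        idle.addLast(new Conn(1, 0L));       // released at t = 0 s
        idle.addLast(new Conn(2, 50_000L));  // released at t = 50 s
        idle.addLast(new Conn(3, 55_000L));  // released at t = 55 s
        return idle;
    }

    public static void main(String[] args) {
        long now = 70_000L;          // hypothetical current time
        long peerTimeout = 60_000L;  // hypothetical idle timeout enforced by the peer

        Conn fifo = lease(samplePool(), false);
        Conn lifo = lease(samplePool(), true);

        System.out.println("FIFO leased #" + fifo.id
                + ", closed by peer: " + closedByPeer(fifo, now, peerTimeout));
        System.out.println("LIFO leased #" + lifo.id
                + ", closed by peer: " + closedByPeer(lifo, now, peerTimeout));
    }
}
```

With these sample values, FIFO hands out connection 1, which has been idle for 70 s and is past the assumed timeout, while LIFO hands out connection 3, which has been idle for only 15 s. Under LIFO the leased connection is always the freshest one, so if it turns out to be closed, every other idle connection in the pool is at least as old.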
Using the LIFO leasing strategy plus a max idle timeout will give you the behaviour below:
Connection reset by peer will be received and we will retry the request. As this connection was the most recently used and it was closed by the remote peer, this means that all the rest (those that are not active) in the pool will also be closed, so a new connection will again be used for the second attempt.

I have the same issue in my Spring Boot project:
ERROR [reactor-http-epoll-1] [reactor.core.publisher.Operators] Operator called default onErrorDropped io.netty.channel.unix.Errors$NativeIoException: syscall:read(…) failed: Connection reset by peer at io.netty.channel.unix.FileDescriptor.readAddress(…)(Unknown Source)
If I can help with some additional debug info, feel free to ask. Waiting for a fix.
The ERROR was on the client side. I understood the problem. I use Swarm to orchestrate my dockerized applications. Swarm has a load balancer (LB) that implements its routing mesh, and the LB closes inactive connections. Enabling keep-alive for the channels was not enough, because it only kicked in after the connection had already been closed. Implementing keep-alive at the application level for idle connections solved the problem.
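One way to picture the application-level keep-alive described above is to track when a connection last carried traffic and send a cheap request before the load balancer's idle timeout expires. The sketch below is plain Java and makes several assumptions: the class `IdleKeepAlive`, the method `pingIfIdle`, and the 30-second threshold are hypothetical, and the commenter's actual implementation is not shown in this thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: decides when an idle connection needs an application-level ping. */
class IdleKeepAlive {
    private final long pingAfterIdleMillis;
    private final AtomicLong lastActivityMillis;

    IdleKeepAlive(long pingAfterIdleMillis, long nowMillis) {
        this.pingAfterIdleMillis = pingAfterIdleMillis;
        this.lastActivityMillis = new AtomicLong(nowMillis);
    }

    /** Call on every real request or response so that traffic resets the idle clock. */
    void recordActivity(long nowMillis) {
        lastActivityMillis.set(nowMillis);
    }

    /**
     * Call periodically (for example from a scheduled task). Runs the ping only if
     * the connection has been idle for at least the threshold; returns whether it ran.
     */
    boolean pingIfIdle(long nowMillis, Runnable ping) {
        if (nowMillis - lastActivityMillis.get() < pingAfterIdleMillis) {
            return false;
        }
        ping.run();
        recordActivity(nowMillis); // the ping itself counts as activity
        return true;
    }

    public static void main(String[] args) {
        // Assumed threshold: ping after 30 s of silence, well below the LB's idle timeout.
        IdleKeepAlive keepAlive = new IdleKeepAlive(30_000L, 0L);
        Runnable ping = () -> System.out.println("sending lightweight keep-alive request");

        System.out.println("t=10s pinged: " + keepAlive.pingIfIdle(10_000L, ping));
        System.out.println("t=30s pinged: " + keepAlive.pingIfIdle(30_000L, ping));
        System.out.println("t=45s pinged: " + keepAlive.pingIfIdle(45_000L, ping));
    }
}
```

The threshold has to be shorter than whatever idle timeout the load balancer enforces, otherwise the ping arrives on a connection that is already gone.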
We still have a lot of problems due to an invalid haproxy configuration… This post helped us a lot to clear up most of the issues. Unfortunately, we still need to deal with problems on the health check. I hope it helps.
I’ve set ChannelOption.SO_KEEPALIVE to false to make it go away as follows:
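The snippet that accompanied this comment is not preserved here. As a minimal sketch of one way such a setting can be applied with the Reactor Netty 0.8.x API and Spring WebFlux, and not necessarily the commenter's exact code, consider the following. The class name `KeepAliveOffWebClient` and the `baseUrl` parameter are hypothetical. Note that `ChannelOption.SO_KEEPALIVE` controls TCP keep-alive probes on the socket.

```java
import io.netty.channel.ChannelOption;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;

/** Hypothetical factory; requires reactor-netty and spring-webflux on the classpath. */
class KeepAliveOffWebClient {

    static WebClient create(String baseUrl) {
        // Turn off SO_KEEPALIVE on the underlying TCP client.
        HttpClient httpClient = HttpClient.create()
                .tcpConfiguration(tcpClient ->
                        tcpClient.option(ChannelOption.SO_KEEPALIVE, false));

        // Plug the customized HttpClient into the WebClient.
        return WebClient.builder()
                .baseUrl(baseUrl)
                .clientConnector(new ReactorClientHttpConnector(httpClient))
                .build();
    }
}
```

In Reactor Netty 1.x, `tcpConfiguration` is deprecated and the option can be set directly with `HttpClient.create().option(...)`.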
Please be aware that you are disabling keep-alive connections and it may have an impact on latency. Also check whether there is anything between your client and the server that you are trying to hit. I found that any proxy or load balancer in between can complicate your life.