reactor-netty: java.io.IOException: Connection reset by peer

Actual behavior

@Bean
fun webClient(): WebClient {
    return WebClient
        .builder()
        .baseUrl(someUrl)
        .filter(logResponseStatus())
        .build()
}

@RestController
@RequestMapping("v1")
class OrderController(private val orderService: OrderService) {

    @GetMapping("/orders/{storeId}")
    fun orders(@RequestHeader("Authorization") auth: String,
               @PathVariable("storeId") storeId: String) = mono(Unconfined) {
        orderService.orders(storeId, auth).unpackResponse()
    }
}

// Repository class
suspend fun deliveries(storeId: String, auth: String): List<ActiveOrdersRequestResponse>? {
    return webClient
        .get()
        .uri("v1/stores/$storeId/deliveries/$queryString")
        .header("Authorization", auth)
        .retrieve()
        .onStatus({ it == HttpStatus.FORBIDDEN || it == HttpStatus.UNAUTHORIZED },
                  { Mono.error(ValidationException(ErrorCode.AUTHENTICATION_ERROR, it.statusCode())) })
        .onStatus({ it == HttpStatus.NOT_FOUND },
                  { Mono.error(ValidationException(ErrorCode.STORE_NOT_FOUND, it.statusCode())) })
        .onStatus({ it.is5xxServerError },
                  { Mono.error(ValidationException(ErrorCode.AUTHENTICATION_EXCEPTION, it.statusCode())) })
        .bodyToFlux(ActiveOrdersRequestResponse::class.java)
        .collectList()
        .awaitSingle() // await the result without blocking the calling thread
}
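
Side note: the logResponseStatus() filter registered on the WebClient builder above is not shown in the report. A minimal sketch of what such an ExchangeFilterFunction could look like (purely hypothetical, assuming it only logs the response status and passes the response through unchanged):

import org.slf4j.LoggerFactory
import org.springframework.web.reactive.function.client.ExchangeFilterFunction
import reactor.core.publisher.Mono

private val log = LoggerFactory.getLogger("WebClientLogger")

// Hypothetical stand-in for the logResponseStatus() filter referenced in the builder:
// logs every response status code and returns the response unmodified.
fun logResponseStatus(): ExchangeFilterFunction =
    ExchangeFilterFunction.ofResponseProcessor { response ->
        log.info("Response status: {}", response.statusCode())
        Mono.just(response)
    }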

For some reason, executing the repository call above randomly throws a

java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(FileDispatcherImpl.java)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
    at sun.nio.ch.IOUtil.read(IOUtil.java:192)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:288)
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1108)
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:345)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
    at java.lang.Thread.run(Thread.java:748)

And I get a 500. This seems to happen more often in my Docker container than on my main machine. I have tried numerous times to reproduce it, but no dice. There is no pattern whatsoever.

Load test result from bombarding my local server:

================================================================================
---- Global Information --------------------------------------------------------
> request count                                       1000 (OK=1000   KO=0     )
> min response time                                    400 (OK=400    KO=-     )
> max response time                                   1707 (OK=1707   KO=-     )
> mean response time                                   824 (OK=824    KO=-     )
> std deviation                                        221 (OK=221    KO=-     )
> response time 50th percentile                        802 (OK=802    KO=-     )
> response time 75th percentile                        975 (OK=975    KO=-     )
> response time 95th percentile                       1219 (OK=1219   KO=-     )
> response time 99th percentile                       1384 (OK=1384   KO=-     )
> mean requests/sec                                333.333 (OK=333.333 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                           496 ( 50%)
> 800 ms < t < 1200 ms                                 446 ( 45%)
> t > 1200 ms                                           58 (  6%)
> failed                                                 0 (  0%)
================================================================================

Result when pointing to a remote server running on 5 machines, each with 4 GB RAM and 2.5 vCPUs:

================================================================================
---- Global Information --------------------------------------------------------
> request count                                      10000 (OK=9965   KO=35    )
> min response time                                    105 (OK=1041   KO=105   )
> max response time                                  31431 (OK=31431  KO=28589 )
> mean response time                                  5610 (OK=5618   KO=3425  )
> std deviation                                       3475 (OK=3438   KO=8984  )
> response time 50th percentile                       5452 (OK=5463   KO=198   )
> response time 75th percentile                       8189 (OK=8194   KO=241   )
> response time 95th percentile                       9772 (OK=9769   KO=28406 )
> response time 99th percentile                      10771 (OK=10740  KO=28527 )
> mean requests/sec                                    250 (OK=249.125 KO=0.875 )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                             0 (  0%)
> 800 ms < t < 1200 ms                                  83 (  1%)
> t > 1200 ms                                         9882 ( 99%)
> failed                                                35 (  0%)
---- Errors --------------------------------------------------------------------
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), but actually found 500     35 (100.0%)
================================================================================

Steps to reproduce

Random

Reactor Netty version

io.projectreactor.ipc:reactor-netty:0.7.5.RELEASE

JVM version (e.g. java -version)

openjdk version "1.8.0_171"
OpenJDK Runtime Environment (IcedTea 3.8.0) (Alpine 8.171.11-r0)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)

OS version (e.g. uname -a)

Linux machine-name #1 SMP Sun Mar 11 19:39:47 UTC 2018 x86_64 Linux

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 71 (21 by maintainers)

Most upvoted comments

The issue still seems to be present in current versions. Using Spring Boot 2.1.0 -> reactor 3.2.2 -> reactor-netty 0.8.2 -> netty 4.1.29, we still intermittently get:

io.netty.channel.unix.Errors$NativeIoException: syscall:read(..) failed: Connection reset by peer
    at io.netty.channel.unix.FileDescriptor.readAddress(..)

Unfortunately, I couldn't try the workarounds (disabling the pool and switching to NIO) suggested in one of the previous comments by @violetagg, since the API changed in 0.8.2 and preferNative(boolean preferNative) no longer seems to exist. Any hints on how to switch to NIO and disable pooling with the 0.8.2 API would be appreciated; a sketch of one possible approach is shown below.
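
For reference, a minimal sketch of how this might be expressed with the reactor-netty 0.8.x API, assuming ConnectionProvider.newConnection() to bypass pooling and TcpClient.runOn(loopResources, false) to prefer NIO over the native transport (verify against the 0.8.2 javadoc before relying on it):

import reactor.netty.http.client.HttpClient
import reactor.netty.resources.ConnectionProvider
import reactor.netty.resources.LoopResources

// Sketch only: an HttpClient without connection pooling and with the NIO transport.
// ConnectionProvider.newConnection() opens a fresh connection per request (no pool);
// runOn(loopResources, false) requests NIO instead of native epoll.
val httpClient: HttpClient = HttpClient
    .create(ConnectionProvider.newConnection())
    .tcpConfiguration { tcp ->
        tcp.runOn(LoopResources.create("webclient-loops"), false)
    }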

The issue disappeared after I added keepAlive(false) to the HttpClient config:

HttpClient.create()
        .baseUrl("http://localhost:" + port)
        .keepAlive(false)     // <---------------
        .headers(headers -> headers.add(HEADER_CONTENT_TYPE, CONTENT_TYPE_APPLICATION_JSON))

Inspired by https://groups.google.com/forum/#!topic/vertx/3o_DEwIK9dY
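
Since the original report uses Spring's WebClient rather than the raw reactor-netty HttpClient, one possible way to apply the same keepAlive(false) workaround there (a sketch, assuming Spring Framework 5.1+, where ReactorClientHttpConnector accepts a reactor-netty HttpClient) might look like:

import org.springframework.http.client.reactive.ReactorClientHttpConnector
import org.springframework.web.reactive.function.client.WebClient
import reactor.netty.http.client.HttpClient

// Sketch only: plug a reactor-netty HttpClient with keep-alive disabled into WebClient.
// Disabling keep-alive forces a fresh connection per request, trading some throughput
// for not reusing pooled connections that the server may have already closed.
fun webClient(someUrl: String): WebClient {
    val httpClient = HttpClient.create().keepAlive(false)
    return WebClient.builder()
        .baseUrl(someUrl)
        .clientConnector(ReactorClientHttpConnector(httpClient))
        .build()
}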

Hi, can someone say what happened to the original problem this bug was created for? I am still getting the connection reset by peer error on the client side when using Spring Boot starter 2.0.3.RELEASE and Netty 4.1.25.Final. Did upgrading the dependencies resolve this specific issue?

Hi, has anyone else run into this situation? I struggled with it for more than a day and finally found that the gRPC library was conflicting with Reactor Netty (both frameworks pull in Netty classes under the io package). So if you find gRPC in your pom, be careful about using it alongside Reactor Netty. One way to check what actually ends up on the classpath is shown below.
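
A diagnostic sketch (not part of the original comment) for confirming such a clash: print the Netty artifact versions present on the classpath via Netty's own Version.identify() helper. Mismatched versions across the netty-* artifacts usually mean two frameworks are pulling in different Netty releases.

import io.netty.util.Version

// Diagnostic sketch: list every netty-* artifact on the classpath with its version.
// If the versions differ (e.g. one pulled in by gRPC, another by reactor-netty),
// aligning them in the build file is usually the fix.
fun main() {
    Version.identify().forEach { (artifact, version) ->
        println("$artifact -> $version")
    }
}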