spring-cloud-netflix: first call to Zuul fails with Connection reset when executed on server

If the Zuul proxy runs for about 15-30 minutes without receiving any calls, the first call afterwards fails with an HTTP 500 error (Connection reset when executed on server). All subsequent calls work properly.

Config:

hystrix:
  command:
    default:
      execution:
        timeout:
          enabled: false

ribbon:
  ReadTimeout: 10000

Versions:

  • Spring Boot 1.5.7.RELEASE
  • <spring-cloud-services.version>1.5.0.RELEASE</spring-cloud-services.version>
  • <spring-cloud.version>Dalston.SR3</spring-cloud.version>

Full log (the first call fails and the second works): https://gist.github.com/vimal-raz/9ddc8113e7513b5ab54d2533b1cad0cb

   2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] org.apache.http.wire                     : http-outgoing-30 << "[read] I/O error: Connection reset"
 
   2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] h.i.c.DefaultManagedHttpClientConnection : http-outgoing-30: Close connection
   2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] h.i.c.DefaultManagedHttpClientConnection : http-outgoing-30: Shutdown connection
   2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] o.a.http.impl.execchain.MainClientExec   : Connection discarded
   2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection released: [id: 30][route: {s}->https://mbdealerservice.apps.dev.us-east-1.nafta-ow.com:443][total kept alive: 0; route allocated: 0 of 50; total allocated: 0 of 200]

  2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] c.n.l.reactive.LoadBalancerCommand       : Got error java.net.SocketException: Connection reset when executed on server XXXXXXXXXXXXXX:d10d793e-8293-438d-78c4-20360c35c3a5
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT com.netflix.client.ClientException: null
   2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.874 DEBUG 14 --- [nio-8080-exec-2] com.netflix.hystrix.AbstractCommand      : Error executing HystrixCommand.run(). Proceeding to fallback logic ...

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 16 (3 by maintainers)

Most upvoted comments

Please learn how to properly format code and logs. What version of spring cloud are you using? See #1334

We have investigated this issue further on our Cloud Foundry setup, and it appears to be a problem with the HTTP client and the way connections are kept alive. The issue appears with either the Apache HTTP Client (the default) or OkHttp. It does not appear, however, with the deprecated RestClient, which was previously the default HTTP client. So switching to RestClient is our current fix (please note that this client is officially deprecated, and according to some of the developers it was deprecated because of bugs).

The following short paragraph talks about the three clients https://cloud.spring.io/spring-cloud-netflix/multi/multi__router_and_filter_zuul.html#_zuul_http_client

Another workaround that could cover up the issue would be to provide a custom HTTP client that disables connection keep-alive.

@spencergibb mentions that:

RestClient has limitations like not supporting PATCH and other bugs that are fixed with Apache

here: https://github.com/spring-cloud/spring-cloud-netflix/issues/1125

@dersteve The PCF team suggested using Spring Retry in Boot apps and in Zuul to handle this issue.

https://github.com/spring-projects/spring-retry https://docs.spring.io/spring-batch/trunk/reference/html/retry.html
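To illustrate the retry idea suggested above, here is a minimal plain-Java sketch. It does not use the Spring Retry API or any Zuul/Ribbon configuration; `RetryOnReset`, `call`, and `flakyFirstCall` are hypothetical names made up for this example, and the backend is simulated so that only the first call fails with `Connection reset`, as in the log above.

```java
import java.net.SocketException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of Spring Retry, Zuul, or Ribbon.
final class RetryOnReset {

    private RetryOnReset() {}

    // Runs the action and retries only when it fails with a SocketException,
    // up to maxAttempts attempts in total. Any other exception propagates immediately.
    static <T> T call(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SocketException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (SocketException e) {
                last = e; // assumed to be a stale kept-alive connection; try again
            }
        }
        throw last; // every attempt failed with a SocketException
    }

    // Simulated backend call (an assumption, not real proxy code):
    // the first invocation fails with "Connection reset", later ones succeed.
    static Callable<String> flakyFirstCall(AtomicInteger calls) {
        return () -> {
            if (calls.incrementAndGet() == 1) {
                throw new SocketException("Connection reset");
            }
            return "200 OK";
        };
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        String result = call(flakyFirstCall(calls), 2);
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Like the keep-alive workaround, a retry hides the stale connection rather than fixing it. Blindly retrying is also only safe for idempotent requests, since a request that was reset may already have been processed by the backend.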

I have opened a ticket with the Cloud Foundry team and will keep this issue updated. Thanks.