spring-cloud-netflix: first call to Zuul fails with Connection reset when executed on server
If the Zuul proxy runs for about 15-30 minutes without receiving any calls, the first call afterwards fails with an HTTP 500 error ("Connection reset when executed on server"). After that, all subsequent calls work properly.
Config:
hystrix:
  command:
    default:
      execution:
        timeout:
          enabled: false
ribbon:
  ReadTimeout: 10000
Versions:
- Spring Boot 1.5.7.RELEASE
- <spring-cloud-services.version>1.5.0.RELEASE</spring-cloud-services.version>
- <spring-cloud.version>Dalston.SR3</spring-cloud.version>
Full log (the first call fails and the second works): https://gist.github.com/vimal-raz/9ddc8113e7513b5ab54d2533b1cad0cb
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] org.apache.http.wire : http-outgoing-30 << "[read] I/O error: Connection reset"
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] h.i.c.DefaultManagedHttpClientConnection : http-outgoing-30: Close connection
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] h.i.c.DefaultManagedHttpClientConnection : http-outgoing-30: Shutdown connection
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] o.a.http.impl.execchain.MainClientExec : Connection discarded
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] h.i.c.PoolingHttpClientConnectionManager : Connection released: [id: 30][route: {s}->https://mbdealerservice.apps.dev.us-east-1.nafta-ow.com:443][total kept alive: 0; route allocated: 0 of 50; total allocated: 0 of 200]
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.871 DEBUG 14 --- [nio-8080-exec-2] c.n.l.reactive.LoadBalancerCommand : Got error java.net.SocketException: Connection reset when executed on server XXXXXXXXXXXXXX:d10d793e-8293-438d-78c4-20360c35c3a5
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT com.netflix.client.ClientException: null
2017-09-27T12:27:36.87-0400 [APP/PROC/WEB/0] OUT 2017-09-27 16:27:36.874 DEBUG 14 --- [nio-8080-exec-2] com.netflix.hystrix.AbstractCommand : Error executing HystrixCommand.run(). Proceeding to fallback logic ...
About this issue
- State: closed
- Created 7 years ago
- Comments: 16 (3 by maintainers)
Please learn how to properly format code and logs. What version of Spring Cloud are you using? See #1334
We have investigated this issue further on our Cloud Foundry setup, and it appears to be a problem with the HTTP client and the way connections are kept alive. The issue occurs with both the Apache HTTP Client (the default) and OkHttp. It does not occur with the deprecated Ribbon RestClient, which was previously the default HTTP client and can be enabled with ribbon.restclient.enabled=true. Switching to it is our current workaround (please note that this client is officially deprecated, and according to some of the developers it was deprecated because of bugs).
The following short section of the documentation describes the three clients: https://cloud.spring.io/spring-cloud-netflix/multi/multi__router_and_filter_zuul.html#_zuul_http_client
Another workaround that could hide the issue would be to provide a custom HTTP client that disables connection keep-alive.
@spencergibb mentions that:
here: https://github.com/spring-cloud/spring-cloud-netflix/issues/1125
@dersteve The PCF team suggested using Spring Retry in the Boot apps and in Zuul to handle this issue.
https://github.com/spring-projects/spring-retry https://docs.spring.io/spring-batch/trunk/reference/html/retry.html
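To illustrate the retry idea, here is a minimal plain-Java sketch. It is not Spring Retry's API and not what the PCF team provided. The class name `RetryOnReset`, the `IoCall` interface and `callWithRetry` are hypothetical names invented for this example. It only shows the principle of repeating a call once when it fails with a `SocketException` such as "Connection reset".

```java
import java.io.IOException;
import java.net.SocketException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of Spring Retry or Zuul.
final class RetryOnReset {

    /** A call that may fail with an I/O error, such as a proxied HTTP request. */
    @FunctionalInterface
    interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Runs the call and retries only when it fails with a SocketException
     * (for example "Connection reset"). Other exceptions propagate immediately.
     */
    static <T> T callWithRetry(IoCall<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (SocketException e) {
                // Assumption: the failure came from a stale connection,
                // so a further attempt has a chance of succeeding.
                last = e;
            }
        }
        throw last; // every attempt failed with a SocketException
    }

    public static void main(String[] args) throws IOException {
        AtomicInteger calls = new AtomicInteger();
        // Simulated backend: the first call is reset, the second succeeds.
        String body = callWithRetry(() -> {
            if (calls.incrementAndGet() == 1) {
                throw new SocketException("Connection reset");
            }
            return "OK";
        }, 2);
        System.out.println(body + " after " + calls.get() + " attempts");
    }
}
```

In a real setup the retry would more likely be configured through Spring Retry and Ribbon than hand-written. Retrying is also only safe for idempotent requests.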
I have opened a ticket with the Cloud Foundry team and will keep this issue updated. Thanks.