grpc: does the server die if there are no requests for a long time?
Hi,
I run a server and found that if there are no requests for a long time (maybe 24 hours), the next call on the same channel fails with a network error.
We use HAProxy as the load balancer. How can the client check the health of the connection/channel? And how can it repair the channel and reconnect (in Python and in Java)?
Thanks so much.
The error log on the Java side is below:
2016-02-27 15:24:29.786 [grpc-default-worker-ELG-3] ERROR io.netty.handler.codec.http2.Http2ConnectionHandler.error:181 - Sending GOAWAY failed: lastStreamId '0', errorCode '2', debugData 'connection timed out: internal-dev-internel-proxy-1365586739.cn-north-1.elb.amazonaws.com.cn/10.1.6.9:50052'. Forcing shutdown of the connection.
java.nio.channels.ClosedChannelException: null
2016-02-27 15:24:29.788 [http-bio-8080-exec-5] ERROR com.vipkid.security.impl.AuthorizeServiceImpl.getRoleList:55 - Server Error,User Token = 236d50b3-ea62-4174-ad65-c849166d14fa ,e={}
io.grpc.StatusRuntimeException: UNAVAILABLE
at io.grpc.Status.asRuntimeException(Status.java:431) ~[grpc-core-0.13.1.jar:0.13.1]
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:208) ~[grpc-stub-0.13.1.jar:0.13.1]
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:141) ~[grpc-stub-0.13.1.jar:0.13.1]
at com.vipkid.proto.service.AuthServiceGrpc$AuthServiceBlockingStub.getRoleList(AuthServiceGrpc.java:235) ~[auth-proto-1.1-20160226.061926-2.jar:na]
at com.vipkid.client.service.AuthServiceClient.getRoleList(AuthServiceClient.java:68) ~[auth-client-1.1-20160226.071118-3.jar:na]
at com.vipkid.security.impl.AuthorizeServiceImpl.getRoleList(AuthorizeServiceImpl.java:52) ~[AuthorizeServiceImpl.class:na]
at sun.reflect.GeneratedMethodAccessor1046.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_65]
at java.lang.reflect.Method.invoke(Method.java:497) ~[na:1.8.0_65]
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317) [spring-aop-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:190) [spring-aop-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) [spring-aop-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(TransactionInterceptor.java:98) [spring-tx-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:262) [spring-tx-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:95) [spring-tx-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) [spring-aop-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:207) [spring-aop-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at com.sun.proxy.$Proxy35.getRoleList(Unknown Source) [na:na]
at com.vipkid.service.StaffAuthService.login(StaffAuthService.java:64) [StaffAuthService.class:na]
at com.vipkid.service.StaffAuthService$$FastClassBySpringCGLIB$$f265537e.invoke(<generated>) [spring-core-4.0.6.RELEASE.jar:na]
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) [spring-core-4.0.6.RELEASE.jar:4.0.6.RELEASE]
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:708) [spring-aop-4.0.6.RELEASE.jar:4.0.6.RELEASE]
About this issue
- State: closed
- Created 8 years ago
- Comments: 18 (7 by maintainers)
I am using a server implemented in Python and a Java client.
It's really easy to reproduce the problem. If I start the client and do nothing, then after 30s or 1s, depending on the AWS ELB setting, the client logs the message (Channel has been shutdown/terminated by another endpoint).
If we try to reuse the channel at this point, I get UNAVAILABLE.
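Since the disconnect follows the load balancer's idle timeout, one possible mitigation is to have the client send HTTP/2 keepalive pings more often than that timeout. Below is a minimal Python sketch, assuming a grpcio version that supports the keepalive channel arguments and a load balancer that counts pings as traffic. The helper names `keepalive_options` and `make_channel`, the 30 s default, and the "half the timeout" margin are illustrative assumptions, not something from this thread.

```python
def keepalive_options(idle_timeout_s):
    """Build gRPC channel arguments that ping more often than the LB idle timeout."""
    # Ping at half the idle timeout (an arbitrary safety margin), but never
    # more often than once per second.
    ping_interval_ms = max(1000, int(idle_timeout_s * 1000 / 2))
    return [
        ("grpc.keepalive_time_ms", ping_interval_ms),   # interval between pings
        ("grpc.keepalive_timeout_ms", 5000),            # wait this long for a ping ack
        ("grpc.keepalive_permit_without_calls", 1),     # ping even with no active RPC
        ("grpc.http2.max_pings_without_data", 0),       # no cap on pings without data
    ]


def make_channel(target, idle_timeout_s=30):
    """Create an insecure channel with keepalive enabled (hypothetical helper)."""
    # Imported here so keepalive_options() can be used without grpcio installed.
    import grpc
    return grpc.insecure_channel(target, options=keepalive_options(idle_timeout_s))
```

The server also has to permit pings at this rate; otherwise it may close the connection for sending too many pings.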
In the Java client, I have to close the channel and create a new channel object. It looks like it won't reconnect on its own.
But…
I also have a Python client and a Python server, also behind AWS ELB. With an idle connection, the first try gets UNAVAILABLE too, but the second try is OK, just as @giladwolff said.
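Because the second try succeeds in the Python client, a simple workaround there is to retry once when the call fails with UNAVAILABLE. This is a rough sketch only: `call_with_retry` and `is_unavailable` are hypothetical helper names, and it assumes the RPC is safe to repeat (idempotent).

```python
import time


def call_with_retry(rpc, is_retryable, attempts=2, backoff_s=0.0):
    """Call rpc(); retry while is_retryable(exc) is true, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return rpc()
        except Exception as exc:
            # Re-raise on the last attempt or for errors that should not be retried.
            if attempt == attempts or not is_retryable(exc):
                raise
            time.sleep(backoff_s * attempt)


def is_unavailable(exc):
    """True if exc is a gRPC error whose status code is UNAVAILABLE."""
    # Imported here so call_with_retry() can be used without grpcio installed.
    import grpc
    code = getattr(exc, "code", None)
    return (
        isinstance(exc, grpc.RpcError)
        and callable(code)
        and code() == grpc.StatusCode.UNAVAILABLE
    )
```

With a generated stub it could be used as `call_with_retry(lambda: stub.GetRoleList(request), is_unavailable)`, where `stub.GetRoleList` stands in for whatever method the generated Python stub actually exposes.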
And yes, this is a client-side problem, but the behavior is not that friendly.
Thanks, and sorry for my poor English. 😃