grpc-go: backoff/timeout does not take into account actual GRPC connection established, causes denial of service.
When I connect a gRPC client to an IP and port number that has no listener, the client properly tries to reconnect and respects the dial timeout and backoff.
However, if I run socat -v TCP-LISTEN:8002,fork SYSTEM:'echo hi' and then attempt to connect my grpc-go client to it, it keeps reconnecting at a fast rate in what seems to be an infinite loop, effectively DoS’ing the host it’s running on, since TCP connections stay around in TIME_WAIT for the tcp_fin_timeout duration.
WARN[0008] transport: http2Client.notifyError got notified that the client transport was broken unexpected EOF.
WARN[0008] transport: http2Client.notifyError got notified that the client transport was broken unexpected EOF.
WARN[0008] transport: http2Client.notifyError got notified that the client transport was broken unexpected EOF.
[infinite]
This can cause issues when using a TCP load balancer like ELB, which always has a TCP listen port open regardless of whether any backend instances are available, because the connection is closed immediately after the accept.
Even where no load balancer is involved, if the server the client connects to misbehaves for any reason, it will cause the behaviour above.
Note: I’m using grpc.WithTimeout and grpc.WithBlock.
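The accept-then-close behaviour described above (an ELB with no backends) can be imitated without socat. The following is only a minimal sketch using the Go standard library, not grpc-go itself; the helper names startClosingListener and probe are hypothetical and exist only for this illustration. It shows that the TCP dial succeeds even though the peer hangs up before sending a single byte, which is the situation the client's backoff would need to treat as a failure.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"time"
)

// startClosingListener (hypothetical helper) listens on an ephemeral local
// port and closes every connection immediately after accepting it.
func startClosingListener() (net.Listener, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			conn.Close()
		}
	}()
	return ln, nil
}

// probe (hypothetical helper) dials addr and reports whether the peer closed
// the connection before sending any data.
func probe(addr string) (bool, error) {
	conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
	if err != nil {
		return false, err // nothing is listening, or the dial timed out
	}
	defer conn.Close()
	if err := conn.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return false, err
	}
	buf := make([]byte, 1)
	_, err = conn.Read(buf)
	if errors.Is(err, io.EOF) {
		return true, nil // the dial succeeded, but the peer hung up at once
	}
	return false, err
}

func main() {
	ln, err := startClosingListener()
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	closed, err := probe(ln.Addr().String())
	fmt.Println("closed early:", closed, "error:", err)
}
```

A client that counts a successful dial as "connected" would reset its backoff against such a listener on every attempt.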
About this issue
- State: closed
- Created 8 years ago
- Reactions: 7
- Comments: 23 (7 by maintainers)
+1 this is still an issue. I have a very basic use case of gRPC and the log output is flooded with this message every 4 minutes while it is sitting idle. This looks confusing for someone who just started using gRPC and therefore warrants a Google search which eventually lands people here. I am still wondering if I’m doing something wrong.
I have dug into this a bit. It turns out that the tight-looping (DoS-ing) behaviour is in the transportMonitor() goroutine, which does not take into account an EOF-serving service.
Steps to reproduce the problem:
Ctrl+C)
socat -v TCP-LISTEN:8081,fork SYSTEM:'echo hi'
The client will generate an avalanche of reconnections, quickly exhausting all file descriptors.
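One way to avoid such an avalanche is to reset the backoff only after a connection has stayed up for some minimum time, so that a connection which is accepted and then dropped still counts as a failure. The sketch below illustrates that idea in isolation; it is not the fix that went into grpc-go, and the function nextDelay and the constants baseDelay, maxDelay and minHealthy are assumptions chosen for illustration. Jitter is left out to keep the example deterministic.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; these are assumptions, not grpc-go settings.
const (
	baseDelay  = 1 * time.Second   // delay after the first failure
	maxDelay   = 120 * time.Second // upper bound for the delay
	minHealthy = 10 * time.Second  // a connection must live this long to count as good
)

// nextDelay (hypothetical) returns how long to wait before the next dial.
// prev is the previous delay (0 before the first attempt) and lived is how
// long the last connection stayed up (0 if the dial itself failed).
func nextDelay(prev, lived time.Duration) time.Duration {
	if lived >= minHealthy {
		// The connection was genuinely usable, so start over.
		return baseDelay
	}
	if prev < baseDelay {
		return baseDelay
	}
	// The dial failed or the connection died almost at once: back off further.
	next := prev * 2
	if next > maxDelay {
		next = maxDelay
	}
	return next
}

func main() {
	var delay time.Duration
	// Simulate five connections that are each dropped after 5ms,
	// as with the socat listener above.
	for i := 0; i < 5; i++ {
		delay = nextDelay(delay, 5*time.Millisecond)
		fmt.Println("wait before next dial:", delay)
	}
}
```

With this rule, the EOF-serving listener makes the delay grow on every attempt instead of letting the client redial immediately.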
I’m getting these logs when using the gcloud vision client (cloud.google.com/go/vision).
Let me know if you need a repro program. I am assuming the fix will get rid of this error message without any change in the caller code. If that’s not the case, we’re not fixing it right.
I am also getting these logs when using Google’s Cloud Speech and Cloud Translate APIs.
Is there an update on this?
@MakMukhi Do you have an ETA and/or gRPC version number for when the log spam will cease?