kubernetes: kube-proxy randomly returns 504 gateway timeouts (without actually waiting)
Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): Yes, and I have.
What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): kube-proxy, gateway timeout, gateway, 504
Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT
Kubernetes version (use `kubectl version`):
```
Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.3", GitCommit:"029c3a408176b55c30846f0faedf56aae5992e9b", GitTreeState:"clean", BuildDate:"2017-02-22T10:12:27Z", GoVersion:"go1.8", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
```
Environment:
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release): kope.io/k8s-1.4-debian-jessie-amd64-hvm-ebs-2016-10-21
- Kernel (e.g. `uname -a`):
- Install tools: kops
What happened: I have set up a Service of type LoadBalancer (AWS ELB). The ELB sees all nodes as available. When trying to access that service, I randomly get 504 gateway timeouts that appear instantly, without actually waiting for a timeout to elapse. Restarting kube-proxy does not seem to help at all. Refreshing/re-sending the request resolves it. It seems to happen every few requests, as if on a round-robin rotation.
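For illustration, one way to make the pattern visible is to hit the ELB in a loop (the hostname below is a placeholder); the failing requests come back as 504 with a near-zero total time, confirming that no timeout is actually elapsing:

```sh
# Print status code and total request time for 20 consecutive requests
for i in $(seq 1 20); do
  curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" http://my-elb.example.com/
done
```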
What you expected to happen: kube-proxy should route these requests reliably, without returning intermittent 504s.
How to reproduce it (as minimally and precisely as possible): Not sure here. Try running a kops cluster with an ELB-backed Service and a uWSGI web server as the backend.
Anything else we need to know: We are using uWSGI as a server inside the containers.
About this issue
- State: closed
- Created 7 years ago
- Comments: 26 (13 by maintainers)
@ikornaselur I was accidentally using `http-socket` instead of `http` (for 3 years now). Also added the `harakiri` setting and adjusted threads/processes a bit.

One thing to keep in mind with ELBs is that they have HTTP keep-alive (called idle timeout) on by default, and it's set to 60 seconds. That needs to be 1 second less than your app's keep-alive timeout. This was the cause of 504s for several of our apps.
http://docs.aws.amazon.com/elasticloadbalancing/latest/classic/config-idle-timeout.html
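For reference, a minimal sketch of the uWSGI side of that fix (the module name and numeric values are placeholders, and exact option behavior can vary between uWSGI versions):

```ini
[uwsgi]
module = app:application    ; hypothetical WSGI entry point
master = true
processes = 4
threads = 2
; Use the embedded HTTP router ("http"), not a raw "http-socket":
; the raw socket mode does not handle HTTP keep-alive, which can
; interact badly with proxies like an ELB that reuse backend connections.
http = :8080
http-keepalive = true       ; enable HTTP/1.1 keep-alive in the router
harakiri = 30               ; kill requests stuck longer than 30 seconds
```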
To configure the ELB idle timeout, you can set this annotation on your Service.
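A sketch of what that can look like, assuming the `service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout` annotation (value in seconds; the service name, selector, and ports below are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app                # placeholder name
  annotations:
    # Raise the ELB idle timeout from the 60s default to 120s
    service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "120"
spec:
  type: LoadBalancer
  selector:
    app: my-app               # placeholder selector
  ports:
    - port: 80
      targetPort: 8080
```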