go: runtime: Kubernetes' kube-proxy stuck in GC forever
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)?
go1.7.4 linux/amd64
What operating system and processor architecture are you using (go env)?
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp.k8s/go-build739048657=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
(from k8s build container)
What did you do?
Running k8s cluster
What did you expect to see?
kube-proxy process running as expected
What did you see instead?
kube-proxy gets stuck in GC code, with no goroutines being scheduled
The problem is quite hard to reproduce: something similar happens on some nodes from time to time, and it may take several days to run into it. We currently have a process in the state described here running on one of the nodes. I'm pretty sure it's related to the Go runtime, but in any case I'm stuck trying to find a way to debug it. Any hints on how to find the cause of the problem would be much appreciated. I don't want to kill the process with SIGQUIT to retrieve goroutine info, so I'm doing this with delve instead.
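For reference, a minimal sketch of a non-fatal alternative to SIGQUIT: a handler that writes all goroutine stacks to stderr on SIGUSR1 using the standard runtime/pprof goroutine profile. The function names (goroutineDump, installDumpHandler) and the choice of SIGUSR1 are assumptions made for illustration; nothing in this report says kube-proxy installs such a handler. Note also that the handler is itself a goroutine, so if the runtime is wedged in GC with no goroutines being scheduled, it would most likely never run. It is only useful for processes that are still scheduling.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/signal"
	"runtime/pprof"
	"syscall"
)

// goroutineDump returns the stacks of all goroutines as text.
// With debug=2 the format matches what the runtime prints for an
// unrecovered panic.
func goroutineDump() string {
	var buf bytes.Buffer
	// "goroutine" is one of the predefined profiles in runtime/pprof.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return fmt.Sprintf("goroutine dump failed: %v", err)
	}
	return buf.String()
}

// installDumpHandler (hypothetical helper) writes a goroutine dump to
// stderr each time the process receives SIGUSR1, without exiting.
func installDumpHandler() {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGUSR1)
	go func() {
		for range ch {
			fmt.Fprint(os.Stderr, goroutineDump())
		}
	}()
}

func main() {
	installDumpHandler()
	// Print one dump directly so the sketch produces output when run.
	fmt.Print(goroutineDump())
}
```

With such a handler in place, `kill -USR1 <pid>` would request a dump while leaving the process running.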
The process is running in a Docker container (Debian jessie) with GOMAXPROCS=48.
Goroutine info: https://gist.github.com/ivan4th/17654f6fee35a38548502de4b6f68ce4
Thread info: https://gist.github.com/ivan4th/4596664ba1f935c500250f74ade5c162
Log output from another hung kube-proxy killed with SIGQUIT: https://gist.github.com/ivan4th/5da95ebf8986c6834bca35c9b4e7895b
About this issue
- State: closed
- Created 7 years ago
- Comments: 32 (16 by maintainers)
We had the same issue with kube-proxy at least two times.