kubernetes: Dockershim streaming server conflicts with NodePort

What happened: kubectl exec fails or times out:

Connection refused:

kubectl exec -it -n my-ns my-pod sh
error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:37751: connect: connection refused

Timeout:

kubectl -n my-ns exec -it my-pod -v 99 bash
...
I1118 10:11:32.945560    9992 round_trippers.go:419] curl -k -v -XPOST  -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" -H "User-Agent: kubectl.exe/v1.15.3 (windows/amd64) kubernetes/2d3c76f" -H "Authorization: Bearer kubeconfig-u-ob5wqxfcaq:fc5gvmsxt2j5z8s227gk5t9v7f5rc9hlc7fqpxv8tnm56g8lbjnws2" 'https://api-sever.mycorp.com/k8s/clusters/c-9skrw/api/v1/namespaces/my-ns/pods/my-pod/exec?command=bash&container=minio&stdin=true&stdout=true&tty=true'
I1118 10:13:43.699847    9992 round_trippers.go:438] POST https://api-sever.mycorp.com/k8s/clusters/c-9skrw/api/v1/namespaces/my-ns/pods/my-pod/exec?command=bash&container=minio&stdin=true&stdout=true&tty=true 500 Internal Server Error in 130752 milliseconds
I1118 10:13:43.712034    9992 round_trippers.go:444] Response Headers:
I1118 10:13:43.712034    9992 round_trippers.go:447]     Server: openresty/1.15.8.1
I1118 10:13:43.712034    9992 round_trippers.go:447]     Date: Mon, 18 Nov 2019 09:13:43 GMT
I1118 10:13:43.713032    9992 round_trippers.go:447]     Content-Type: text/plain; charset=utf-8
I1118 10:13:43.713032    9992 round_trippers.go:447]     Content-Length: 79
I1118 10:13:43.713032    9992 round_trippers.go:447]     Connection: keep-alive
I1118 10:13:43.714030    9992 round_trippers.go:447]     X-Content-Type-Options: nosniff
I1118 10:13:43.714030    9992 round_trippers.go:447]     Strict-Transport-Security: max-age=15724800; includeSubDomains
F1118 10:13:43.717022    9992 helpers.go:114] error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:32935: connect: connection timed out

What you expected to happen: kubectl exec works…

How to reproduce it (as minimally and precisely as possible):

  • kube-proxy runs in ipvs mode
  • api-server config: service-node-port-range: 30000-39999
  • kubelet starts the docker shim streaming server in the NodePort range (here 127.0.0.1:32935):
root@kubedev-worker-8b005e396435:~# netstat -anp | grep kubelet
tcp        0      0 127.0.0.1:32935         0.0.0.0:*               LISTEN      2419/kubelet
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      2419/kubelet
tcp        0      0 0.0.0.0:10250           0.0.0.0:*               LISTEN      2419/kubelet
  • create a service with the same NodePort the streaming server uses (here 32935)
  • Wait a few seconds for kube-proxy to sync, then try to run kubectl exec against a pod on that node (kubedev-worker-8b005e396435).
  • Error: error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:32935: connect: connection refused
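To spot the condition described in the steps above on a node, one option is to scan the netstat output for kubelet listeners on the loopback address whose port lies inside the NodePort range. The following is only a minimal sketch: the function name is hypothetical, the column layout is assumed to match the netstat -anp output shown above, and the range is the 30000-39999 value from this report.

```python
# Assumed NodePort range, taken from this report (service-node-port-range: 30000-39999).
NODE_PORT_RANGE = (30000, 39999)


def kubelet_ports_in_range(netstat_output, port_range=NODE_PORT_RANGE):
    """Return loopback ports that kubelet listens on and that fall inside port_range."""
    low, high = port_range
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) < 7 or fields[5] != "LISTEN":
            continue
        if not fields[6].endswith("/kubelet"):
            continue
        host, _, port = fields[3].rpartition(":")
        if host == "127.0.0.1" and low <= int(port) <= high:
            hits.append(int(port))
    return hits


# Sample input copied from the netstat output in this report.
SAMPLE = """\
tcp        0      0 127.0.0.1:32935         0.0.0.0:*               LISTEN      2419/kubelet
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      2419/kubelet
tcp        0      0 0.0.0.0:10250           0.0.0.0:*               LISTEN      2419/kubelet
"""

print(kubelet_ports_in_range(SAMPLE))  # [32935]
```

Any port this reports could later be claimed by a Service as its NodePort, which is the collision reproduced here.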

Anything else we need to know?: I’m not sure how to reproduce the kubectl exec "connection timed out" problem, but I observed that the kubelet streaming server was using an existing NodePort in that case as well. Maybe this happens after a reboot, when the kubelet starts before kube-proxy and picks a port that is already in use as a NodePort… It seems to me that the streaming server uses a random port without taking the NodePort range into account: https://github.com/kubernetes/kubernetes/blob/4c50ee993c82c6852eb3b3aa8dfa8ecc4bcfe330/pkg/kubelet/kubelet.go#L2294 Maybe an option to specify the streaming server port would fix it?
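To illustrate what "a random port" can mean in practice, here is a minimal sketch that binds a listener to 127.0.0.1 with port 0, which asks the operating system to pick a free port. That the streaming server obtains its port this way is an assumption on my part, not something confirmed in this report, and the helper names are hypothetical. The sketch only shows that nothing in such a bind consults the NodePort range.

```python
import socket

# Assumed NodePort range, taken from this report (service-node-port-range: 30000-39999).
NODE_PORT_RANGE = (30000, 39999)


def bind_random_loopback_port():
    """Bind a TCP listener to 127.0.0.1 with port 0 so the OS picks the port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(1)
    return sock


def in_node_port_range(port, port_range=NODE_PORT_RANGE):
    """Return True if port lies inside the inclusive NodePort range."""
    low, high = port_range
    return low <= port <= high


listener = bind_random_loopback_port()
port = listener.getsockname()[1]
# The result varies per run; on hosts whose ephemeral range overlaps the
# NodePort range, the second value can be True.
print(port, in_node_port_range(port))
listener.close()
```

Whether the chosen port lands in the NodePort range depends on the host's ephemeral port range, so a fixed or configurable port, as suggested above, would sidestep the overlap.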

Environment:

  • Kubernetes version (use kubectl version): v1.15.5 kubelet and api-server
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release): Ubuntu 18.04.3 LTS
  • Kernel (e.g. uname -a): 5.0.0-31-generic
  • Install tools: kubeadm, kubectl
  • Network plugin and version (if this is a network-related bug): canal with calico v3.10
  • Others:

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (9 by maintainers)

Most upvoted comments

For anyone else stumbling upon this issue with the error "error: unable to upgrade connection: error dialing backend: dial tcp 127.0.0.1:<port>", the problem for me was that the loopback device was never started (ifconfig lo). Simply running ifup lo fixed it for me.

Had the same problem. On a cluster from a cloud vendor that sets the apiserver option --service-node-port-range=30000-50000, the streaming server started on port 32859, which conflicted with the NodePort of a service.

AFAIK, the ‘redirect-container-streaming’ option would disable the streaming server, but it was removed in v1.20. So with service-node-port-range set to 30000-50000, the probability of a conflict increases as the number of NodePorts increases. @gongguan

If I read it right, issue https://github.com/kubernetes/kubernetes/issues/100643 is discussing this.
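The point made above, that a conflict becomes more likely as more NodePorts are allocated, can be put into rough numbers. This is a back-of-the-envelope sketch under loud assumptions: the streaming server port is treated as drawn uniformly from the Linux default ephemeral range 32768-60999 (net.ipv4.ip_local_port_range), which real kernels only approximate, and the function name and the example allocation of 200 NodePorts are hypothetical.

```python
def conflict_probability(allocated_node_ports, ephemeral_range=(32768, 60999)):
    """Estimate the chance that one uniformly chosen ephemeral port
    equals an already allocated NodePort."""
    low, high = ephemeral_range
    # Only NodePorts inside the ephemeral range can collide with the random port.
    overlapping = {p for p in allocated_node_ports if low <= p <= high}
    return len(overlapping) / (high - low + 1)


# Hypothetical cluster: 200 NodePorts allocated inside the ephemeral range.
few = conflict_probability(range(32768, 32968))
# Hypothetical cluster: 2000 NodePorts allocated inside the ephemeral range.
many = conflict_probability(range(32768, 34768))
print(f"{few:.4f} {many:.4f}")
```

Under these assumptions the estimate grows linearly with the number of NodePorts that fall inside the ephemeral range, and NodePorts below 32768 contribute nothing.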