kubernetes: kubelet start fail for Streaming server stopped unexpected

Environment:

Kubernetes version (use kubectl version): Client Version: version.Info{Major:“1”, Minor:“14”, GitVersion:“v1.14.4”, GitCommit:“a87e9a978f65a8303aa9467537aa59c18122cbf9”, GitTreeState:“clean”, BuildDate:“2019-07-08T08:51:16Z”, GoVersion:“go1.12.5”, Compiler:“gc”, Platform:“linux/amd64”} Server Version: version.Info{Major:“1”, Minor:“14”, GitVersion:“v1.14.4”, GitCommit:“a87e9a978f65a8303aa9467537aa59c18122cbf9”, GitTreeState:“clean”, BuildDate:“2019-07-08T08:43:10Z”, GoVersion:“go1.12.5”, Compiler:“gc”, Platform:“linux/amd64”}

OS (e.g. from /etc/os-release): NAME=“CentOS Linux” VERSION=“7 (Core)” ID=“centos” ID_LIKE=“rhel fedora” VERSION_ID=“7” PRETTY_NAME=“CentOS Linux 7 (Core)” ANSI_COLOR=“0;31” CPE_NAME=“cpe:/o:centos:centos:7” HOME_URL=“https://www.centos.org/” BUG_REPORT_URL=“https://bugs.centos.org/

Kernel (e.g. uname -a): Linux bd01 4.4.185-1.el7.elrepo.x86_64 #1 SMP Wed Jul 10 08:46:17 EDT 2019 x86_64 x86_64 x86_64 GNU/Linux

Others: other service etcd,flannel,kube-api-server,kube-controller-mannager,kube-scheduler all run normally. my problem is i have 3 nodes,all these three node are used as master,worker,etcd role; tow of these can run kubelet properlly,one can not; the error journalctl log is:

7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344056   21912 container_manager_linux.go:286] Creating device plugin mana
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344063   21912 manager.go:109] Creating Device Plugin manager at /var/lib/
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344082   21912 state_mem.go:36] [cpumanager] initializing new in-memory st
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344246   21912 state_mem.go:84] [cpumanager] updated default cpuset: ""
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344254   21912 state_mem.go:92] [cpumanager] updated cpuset assignments: "
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344262   21912 state_checkpoint.go:100] [cpumanager] state checkpoint: res
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344266   21912 state_checkpoint.go:101] [cpumanager] state checkpoint: def
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344301   21912 server.go:997] Using root directory: /data/k8s/k8s/kubelet
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.344314   21912 kubelet.go:304] Watching apiserver
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.346407   21912 client.go:75] Connecting to docker on unix:///var/run/docke
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.346428   21912 client.go:104] Start docker client with request timeout=10m
7月 18 08:45:27 bd02 kubelet[21912]: W0718 08:45:27.347736   21912 docker_service.go:561] Hairpin mode set to "promiscuous-bri
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.347751   21912 docker_service.go:238] Hairpin mode set to "hairpin-veth"
7月 18 08:45:27 bd02 kubelet[21912]: W0718 08:45:27.347842   21912 cni.go:213] Unable to update cni config: No networks found
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.349589   21912 docker_service.go:253] Docker cri networking managed by kub
7月 18 08:45:27 bd02 kubelet[21912]: E0718 08:45:27.354624   21912 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:
7月 18 08:45:27 bd02 kubelet[21912]: E0718 08:45:27.354625   21912 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apis
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.354918   21912 docker_service.go:258] Docker Info: &{ID:CBLH:B63G:MQE2:L3A
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.354968   21912 docker_service.go:271] Setting cgroupDriver to cgroupfs
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.355019   21912 kubelet.go:632] Starting the GRPC server for the docker CRI
7月 18 08:45:27 bd02 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
7月 18 08:45:27 bd02 kubelet[21912]: I0718 08:45:27.355037   21912 docker_server.go:59] Start dockershim grpc server
7月 18 08:45:27 bd02 kubelet[21912]: F0718 08:45:27.355196   21912 docker_service.go:412] Streaming server stopped unexpectedl
7月 18 08:45:27 bd02 systemd[1]: Unit kubelet.service entered failed state.
7月 18 08:45:27 bd02 systemd[1]: kubelet.service failed.
7月 18 08:45:32 bd02 systemd[1]: kubelet.service holdoff time over, scheduling restart.
7月 18 08:45:32 bd02 systemd[1]: Started Kubernetes Kubelet.

another node which can run kubelet correct,it jouralctl log is :

7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728635    8565 container_manager_linux.go:286] Creating device plugin manag
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728643    8565 manager.go:109] Creating Device Plugin manager at /var/lib/k
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728662    8565 state_mem.go:36] [cpumanager] initializing new in-memory sta
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728745    8565 state_mem.go:84] [cpumanager] updated default cpuset: ""
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728753    8565 state_mem.go:92] [cpumanager] updated cpuset assignments: "m
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728760    8565 state_checkpoint.go:100] [cpumanager] state checkpoint: rest
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728764    8565 state_checkpoint.go:101] [cpumanager] state checkpoint: defa
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728796    8565 server.go:997] Using root directory: /data/k8s/k8s/kubelet
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.728807    8565 kubelet.go:304] Watching apiserver
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.730032    8565 client.go:75] Connecting to docker on unix:///var/run/docker
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.730042    8565 client.go:104] Start docker client with request timeout=10m0
7月 17 10:37:05 bd03 kubelet[8565]: W0717 10:37:05.730872    8565 docker_service.go:561] Hairpin mode set to "promiscuous-brid
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.730881    8565 docker_service.go:238] Hairpin mode set to "hairpin-veth"
7月 17 10:37:05 bd03 kubelet[8565]: W0717 10:37:05.730945    8565 cni.go:213] Unable to update cni config: No networks found i
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.732613    8565 docker_service.go:253] Docker cri networking managed by kube
7月 17 10:37:05 bd03 kubelet[8565]: E0717 10:37:05.733535    8565 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:4
7月 17 10:37:05 bd03 kubelet[8565]: E0717 10:37:05.734019    8565 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:4
7月 17 10:37:05 bd03 kubelet[8565]: E0717 10:37:05.734096    8565 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apise
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.736683    8565 docker_service.go:258] Docker Info: &{ID:WOZU:F6UW:H6C4:4SFS
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.736734    8565 docker_service.go:271] Setting cgroupDriver to cgroupfs
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.736796    8565 kubelet.go:632] Starting the GRPC server for the docker CRI
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.736813    8565 docker_server.go:59] Start dockershim grpc server
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.740656    8565 remote_runtime.go:62] parsed scheme: ""
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.740667    8565 remote_runtime.go:62] scheme "" not registered, fallback to
7月 17 10:37:05 bd03 kubelet[8565]: I0717 10:37:05.740683    8565 remote_image.go:50] parsed scheme: ""

compare to correct jouralctl log,the difference is :

7月 17 10:37:05 bd03 kubelet[8565]: E0717 10:37:05.733535    8565 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:4
7月 17 10:37:05 bd03 kubelet[8565]: E0717 10:37:05.734019    8565 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:4
7月 17 10:37:05 bd03 kubelet[8565]: E0717 10:37:05.734096    8565 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apise

and /var/log/messages error i found is below:

Jul 18 08:45:37 bd02 kubelet: I0718 08:45:37.720720   22039 docker_service.go:271] Setting cgroupDriver to cgroupfs
Jul 18 08:45:37 bd02 kubelet: I0718 08:45:37.720778   22039 kubelet.go:632] Starting the GRPC server for the docker CRI shim.
Jul 18 08:45:37 bd02 systemd: kubelet.service failed.
Jul 18 08:45:37 bd02 kubelet: I0718 08:45:37.720795   22039 docker_server.go:59] Start dockershim grpc server
Jul 18 08:45:37 bd02 kubelet: F0718 08:45:37.720999   22039 docker_service.go:412] Streaming server stopped unexpectedly: listen tcp [::1]:0: bind: cannot assign requested address
Jul 18 08:45:37 bd02 etcd: request "header:<ID:15954401848648262393 username:\"kubernetes\" auth_revision:1 > txn:<compare:<target:MOD key:\"/registry/certificatesigningrequests/csr-cjqfh\" mod_revision:397146 > success:<request_put:<key:\"/registry/certificatesigningrequests/csr-cjqfh\" value_size:1854 >> failure:<>>" with result "size:18" took too long (116.497963ms) to execute
Jul 18 08:45:39 bd02 systemd: Stopped Kubernetes Kubelet.
Jul 18 08:45:45 bd02 etcd: request "header:<ID:15954401848648262466 username:\"kubernetes\" auth_revision:1 > txn:<compare:<target:MOD key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" mod_revision:397156 > success:<request_put:<key:\"/registry/services/endpoints/kube-system/kube-controller-manager\" value_size:357 >> failure:<>>" with result "size:18" took too long (208.730124ms) to execute
Jul 18 08:45:45 bd02 etcd: read-only range request "key:\"/registry/services/specs/default/kubernetes\" " with result "range_response_count:1 size:295" took too long (887.437011ms) to execute

i have spended tow day to find out what happens,but got no result. so can someone help me? thks. /sig network /sig cluster-lifecycle

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 16 (12 by maintainers)

Most upvoted comments

if you have an IPv4 only system and it resolves to an IPv6 address: is not a bug, is a node misconfiguration

Great, we agree on that.

you can assign the string directly config.Addr = “localhost:0”

I agree that could be hardcoded. Use of that function is sort of a pattern - when I see it I trust that it is not going to mis-cast an int into a string or anything dumb. So I don’t object to using it.

This makes me think that maybe the intention was to configure by default config.Addr to the IP address of localhost

I do not think that was the intention. Now that the issue is raised, though, I admit that I do not know what the common usage pattern is for things that bind and listen to localhost in the face of dual-stack. Is it a well-known pattern to try to do both? I have never seen that myself, but that doesn’t mean much…

think that an improvement to this part can be trying to resolve localhost first, so we don´t depend on the resolver, and use the net.JoinHostPort() func to create a correct URL in case it resolves to an IPv6 address (forgive me if the code is not correct is just to express the idea)

How does calling net.LookupHost() not depend on the resolver? If we want to not depend on family, we’d have to open a listener of every value returned by that. As I said, I have never seen that pattern, but maybe that is correct? If so, we need to audit every binary and treat “localhost” as special.

My inclination is to close this and the PR “fixing” it, and if we want to consider localhost multi-bind, open a new KEP for it with supporting evidence that this is the right thing to do. Mostly, nobody is complaining about this (well, 1 example which is not reproducible), so I am disinclined to try to fix it…