kubernetes: kubeadm init "[kubelet-check] Initial timeout of 40s passed"

What happened:

kubeadm init timed out while waiting for the kubelet to start, even though the kubelet did start successfully (just after a longer delay). The timeout duration appears to be hard-coded:

if err := waiter.WaitForHealthyKubelet(40*time.Second, fmt.Sprintf("http://localhost:%d/healthz", kubeadmconstants.KubeletHealthzPort)); err != nil {

From https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/util/apiclient/wait.go#L164
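
To confirm the kubelet is merely slow rather than broken, the endpoint that waiter polls can be checked by hand. A minimal sketch, assuming the default kubelet healthz port (10248, which is what kubeadmconstants.KubeletHealthzPort resolves to) and a shell on the node:

# Poll the same healthz endpoint that WaitForHealthyKubelet checks and report
# how long the kubelet takes to come up. Illustrative only.
start=$(date +%s)
until curl -sf http://localhost:10248/healthz >/dev/null; do
    sleep 5
done
echo "kubelet reported healthy after $(( $(date +%s) - start ))s"

If this prints a value comfortably above 40s, the failure is just the hard-coded check expiring too early.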

# timeout -k 600s 600s kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
        [WARNING Port-6443]: Port 6443 is in use
        [WARNING Port-10251]: Port 10251 is in use
        [WARNING Port-10252]: Port 10252 is in use
        [WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
        [WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
        [WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
        [WARNING HTTPProxyCIDR]: connection to "10.233.0.0/18" uses proxy "http://10.192.126.40:8080". This may lead to malfunctional cluster setup. Make sure that Pod and Services IP ranges specified correctly as exceptions in proxy configuration
        [WARNING HTTPProxyCIDR]: connection to "10.233.64.0/18" uses proxy "http://10.192.126.40:8080". This may lead to malfunctional cluster setup. Make sure that Pod and Services IP ranges specified correctly as exceptions in proxy configuration
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.2. Latest validated version: 18.06
        [WARNING Port-10250]: Port 10250 is in use
        [WARNING ExternalEtcdVersion]: Get https://10.198.110.187:2379/version: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
        [WARNING ExternalEtcdVersion]: Get https://10.198.110.27:2379/version: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
        [WARNING ExternalEtcdVersion]: Get https://10.198.111.115:2379/version: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/ssl"
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate authority generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation
[certs] External etcd mode: Skipping etcd/peer certificate authority generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation
[certs] Using existing ca certificate authority
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/scheduler.conf"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-tls" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-ca-trust" to "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[controlplane] Adding extra host path mount "etc-pki-tls" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-ca-trust" to "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[controlplane] Adding extra host path mount "etc-pki-tls" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-ca-trust" to "kube-apiserver"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 5m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

What you expected to happen:

kubeadm should detect that the kubelet started successfully and move on with the initialisation.

How to reproduce it (as minimally and precisely as possible):

kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all

Anything else we need to know?:

This kubeadm command is being called from a kubespray Ansible playbook.

Environment:

  • Kubernetes version (use kubectl version): v1.13.1

  • Cloud provider or hardware configuration:

  • OS (e.g. from /etc/os-release):

NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
  • Kernel (e.g. uname -a):
Linux ip-10-198-110-225.cloud.dev.net 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 15 17:36:42 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:
  • Others:

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 18
  • Comments: 17 (4 by maintainers)

Most upvoted comments

In my case it was solved by disabling the swap partition:

swapoff -a
vim /etc/fstab   # comment out the swap entry so swap stays off after a reboot

and deleting the .kube dir:

rm -r ~/.kube
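
A non-interactive equivalent of the /etc/fstab edit, as a sketch (assumes GNU sed and a standard fstab where the swap line contains the word "swap"; check the file before relying on the pattern):

# Disable swap now and comment out the fstab swap entry so the change
# persists across reboots; a backup is written to /etc/fstab.bak.
swapoff -a
sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab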

Getting the same issue on version 1.14.0

First, I followed the CRI installation document and set native.cgroupdriver=systemd before starting Docker, and hit the same issue.

But when I launched a fresh instance and removed native.cgroupdriver=systemd, kubeadm init ran fine and the error disappeared.
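
The usual root cause behind this variant is Docker and the kubelet disagreeing on the cgroup driver. A minimal sketch of making them both use systemd (assuming Docker as the runtime and the standard config paths; adjust to your setup):

# Point Docker at the systemd cgroup driver and restart it.
cat >/etc/docker/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
systemctl restart docker

# The kubelet must match; with kubeadm this can be set in the KubeletConfiguration
# passed to 'kubeadm init --config', e.g. cgroupDriver: systemd.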

Environment

  • Kubernetes version (use kubectl version): 1.14.0
  • Cloud provider or hardware configuration: AWS EC2 t2.medium
  • OS (e.g. from /etc/os-release): Ubuntu 18.04.2 LTS
  • Kernel (e.g. uname -a): Linux ip-xx-xx-xx-xx 4.15.0-1032-aws #34-Ubuntu SMP Thu Jan 17 15:18:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Same problem

Modified init-config.yaml and that solved it:

changed advertiseAddress: 1.2.3.4 to advertiseAddress: 10.0.128.0, where 10.0.128.0 is the k8s-master IP address.
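
For context, the field lives under localAPIEndpoint in the kubeadm InitConfiguration. A minimal sketch (the apiVersion, file path, and address below are illustrative; use your own cluster's values):

# Illustrative only: advertiseAddress must be the control-plane node's routable IP,
# not a placeholder like 1.2.3.4.
cat >/etc/kubernetes/kubeadm-config.yaml <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.0.128.0
  bindPort: 6443
EOF
kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml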

Environment

  • Kubernetes version: v1.19.0
  • OS: CentOS 7.9
  • Kernel: 3.10.0-1062.el7.x86_64
  • Docker Version: docker-ce 19.03.9

Reference

https://q.cnblogs.com/q/124859/
Blog: https://blog.csdn.net/weixin_44789466/article/details/119046245