kind: [cAdvisor] cluster creation with v0.8.x and Kubernetes built from source fails on some hosts
What happened:
I cloned the Kubernetes repo on my dev machine (a Mac) at `$(go env GOPATH)/src/k8s.io/kubernetes`.
I successfully ran `kind build node-image`, which picked up the latest Kubernetes master branch commit (0a6c826d3e92dae8f20d6199d0ac7deeca9eed71).
Then I ran `kind create cluster --image kindest/node:latest` and got:
Creating cluster “kind” … ✓ Ensuring node image (kindest/node:latest) 🖼 ✓ Preparing nodes 📦
✓ Writing configuration 📜 ✗ Starting control-plane 🕹️ ERROR: failed to create cluster: failed to init node with kubeadm: command “docker exec --privileged kind-control-plane kubeadm init --ignore-preflight-errors=all --config=/kind/kubeadm.conf --skip-token-print --v=6” failed with error: exit status 1 Command Output: I0506 16:54:03.054571 166 initconfiguration.go:200] loading configuration from “/kind/kubeadm.conf” [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta2, Kind=JoinConfiguration I0506 16:54:03.061537 166 interface.go:400] Looking for default routes with IPv4 addresses I0506 16:54:03.061671 166 interface.go:405] Default route transits interface “eth0” I0506 16:54:03.061894 166 interface.go:208] Interface eth0 is up I0506 16:54:03.062666 166 interface.go:256] Interface “eth0” has 3 addresses :[172.19.0.2/16 fc00:f853:ccd:e793::2/64 fe80::42:acff:fe13:2/64]. I0506 16:54:03.063309 166 interface.go:223] Checking addr 172.19.0.2/16. I0506 16:54:03.063412 166 interface.go:230] IP found 172.19.0.2 I0506 16:54:03.063484 166 interface.go:262] Found valid IPv4 address 172.19.0.2 for interface “eth0”. I0506 16:54:03.063579 166 interface.go:411] Found active IP 172.19.0.2 W0506 16:54:03.071914 166 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io] [init] Using Kubernetes version: v1.19.0-alpha.3.33+0a6c826d3e92da [preflight] Running pre-flight checks I0506 16:54:03.072688 166 checks.go:577] validating Kubernetes and kubeadm version I0506 16:54:03.072964 166 checks.go:166] validating if the firewall is enabled and active I0506 16:54:03.082840 166 checks.go:201] validating availability of port 6443 I0506 16:54:03.083497 166 checks.go:201] validating availability of port 10259 I0506 16:54:03.083621 166 checks.go:201] validating availability of port 10257 I0506 16:54:03.083786 166 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml I0506 16:54:03.084065 166 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml I0506 16:54:03.084377 166 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml I0506 16:54:03.084626 166 checks.go:286] validating the existence of file /etc/kubernetes/manifests/etcd.yaml I0506 16:54:03.084766 166 checks.go:432] validating if the connectivity type is via proxy or direct I0506 16:54:03.085139 166 checks.go:471] validating http connectivity to first IP address in the CIDR I0506 16:54:03.085433 166 checks.go:471] validating http connectivity to first IP address in the CIDR I0506 16:54:03.085569 166 checks.go:102] validating the container runtime I0506 16:54:03.087021 166 checks.go:376] validating the presence of executable crictl I0506 16:54:03.087156 166 checks.go:335] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables [WARNING FileContent–proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist I0506 16:54:03.087865 166 checks.go:335] validating the contents of file /proc/sys/net/ipv4/ip_forward I0506 16:54:03.088065 166 checks.go:649] validating whether swap is enabled or not [WARNING Swap]: running with swap on is not supported. 
Please disable swap I0506 16:54:03.088756 166 checks.go:376] validating the presence of executable conntrack I0506 16:54:03.089310 166 checks.go:376] validating the presence of executable ip I0506 16:54:03.089447 166 checks.go:376] validating the presence of executable iptables I0506 16:54:03.089925 166 checks.go:376] validating the presence of executable mount I0506 16:54:03.090039 166 checks.go:376] validating the presence of executable nsenter I0506 16:54:03.090240 166 checks.go:376] validating the presence of executable ebtables I0506 16:54:03.090429 166 checks.go:376] validating the presence of executable ethtool I0506 16:54:03.090726 166 checks.go:376] validating the presence of executable socat I0506 16:54:03.090832 166 checks.go:376] validating the presence of executable tc I0506 16:54:03.091171 166 checks.go:376] validating the presence of executable touch I0506 16:54:03.091303 166 checks.go:520] running all checks I0506 16:54:03.099470 166 checks.go:406] checking whether the given node name is reachable using net.LookupHost I0506 16:54:03.103053 166 checks.go:618] validating kubelet version I0506 16:54:03.180399 166 checks.go:128] validating if the “kubelet” service is enabled and active I0506 16:54:03.191708 166 checks.go:201] validating availability of port 10250 I0506 16:54:03.191805 166 checks.go:201] validating availability of port 2379 I0506 16:54:03.191844 166 checks.go:201] validating availability of port 2380 I0506 16:54:03.191909 166 checks.go:249] validating the existence and emptiness of directory /var/lib/etcd [preflight] Pulling images required for setting up a Kubernetes cluster [preflight] This might take a minute or two, depending on the speed of your internet connection [preflight] You can also perform this action in beforehand using ‘kubeadm config images pull’ I0506 16:54:03.205613 166 checks.go:839] image exists: k8s.gcr.io/kube-apiserver:v1.19.0-alpha.3.33_0a6c826d3e92da I0506 16:54:03.216818 166 checks.go:839] image exists: k8s.gcr.io/kube-controller-manager:v1.19.0-alpha.3.33_0a6c826d3e92da I0506 16:54:03.226920 166 checks.go:839] image exists: k8s.gcr.io/kube-scheduler:v1.19.0-alpha.3.33_0a6c826d3e92da I0506 16:54:03.236217 166 checks.go:839] image exists: k8s.gcr.io/kube-proxy:v1.19.0-alpha.3.33_0a6c826d3e92da I0506 16:54:03.246675 166 checks.go:839] image exists: k8s.gcr.io/pause:3.2 I0506 16:54:03.256707 166 checks.go:839] image exists: k8s.gcr.io/etcd:3.4.7-0 I0506 16:54:03.266186 166 checks.go:839] image exists: k8s.gcr.io/coredns:1.6.7 I0506 16:54:03.266250 166 kubelet.go:64] Stopping the kubelet [kubelet-start] Writing kubelet environment file with flags to file “/var/lib/kubelet/kubeadm-flags.env” [kubelet-start] Writing kubelet configuration to file “/var/lib/kubelet/config.yaml” [kubelet-start] Starting the kubelet [certs] Using certificateDir folder “/etc/kubernetes/pki” I0506 16:54:03.350199 166 certs.go:103] creating a new certificate authority for ca [certs] Generating “ca” certificate and key [certs] Generating “apiserver” certificate and key [certs] apiserver serving cert is signed for DNS names [kind-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local kind-control-plane localhost] and IPs [10.96.0.1 172.19.0.2 127.0.0.1] [certs] Generating “apiserver-kubelet-client” certificate and key I0506 16:54:04.533981 166 certs.go:103] creating a new certificate authority for front-proxy-ca [certs] Generating “front-proxy-ca” certificate and key [certs] Generating “front-proxy-client” certificate 
and key I0506 16:54:04.814400 166 certs.go:103] creating a new certificate authority for etcd-ca [certs] Generating “etcd/ca” certificate and key [certs] Generating “etcd/server” certificate and key [certs] etcd/server serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.19.0.2 127.0.0.1 ::1] [certs] Generating “etcd/peer” certificate and key [certs] etcd/peer serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.19.0.2 127.0.0.1 ::1] [certs] Generating “etcd/healthcheck-client” certificate and key [certs] Generating “apiserver-etcd-client” certificate and key I0506 16:54:05.662998 166 certs.go:69] creating new public/private key files for signing service account users [certs] Generating “sa” key and public key I0506 16:54:06.241602 166 kubeconfig.go:79] creating kubeconfig file for admin.conf [kubeconfig] Using kubeconfig folder “/etc/kubernetes” [kubeconfig] Writing “admin.conf” kubeconfig file I0506 16:54:06.834313 166 kubeconfig.go:79] creating kubeconfig file for kubelet.conf [kubeconfig] Writing “kubelet.conf” kubeconfig file I0506 16:54:06.984831 166 kubeconfig.go:79] creating kubeconfig file for controller-manager.conf [kubeconfig] Writing “controller-manager.conf” kubeconfig file I0506 16:54:07.340111 166 kubeconfig.go:79] creating kubeconfig file for scheduler.conf [kubeconfig] Writing “scheduler.conf” kubeconfig file [control-plane] Using manifest folder “/etc/kubernetes/manifests” [control-plane] Creating static Pod manifest for “kube-apiserver” I0506 16:54:07.427816 166 manifests.go:91] [control-plane] getting StaticPodSpecs I0506 16:54:07.428480 166 manifests.go:104] [control-plane] adding volume “ca-certs” for component “kube-apiserver” I0506 16:54:07.428525 166 manifests.go:104] [control-plane] adding volume “etc-ca-certificates” for component “kube-apiserver” I0506 16:54:07.428546 166 manifests.go:104] [control-plane] adding volume “k8s-certs” for component “kube-apiserver” I0506 16:54:07.428562 166 manifests.go:104] [control-plane] adding volume “usr-local-share-ca-certificates” for component “kube-apiserver” I0506 16:54:07.428583 166 manifests.go:104] [control-plane] adding volume “usr-share-ca-certificates” for component “kube-apiserver” I0506 16:54:07.435072 166 manifests.go:121] [control-plane] wrote static Pod manifest for component “kube-apiserver” to “/etc/kubernetes/manifests/kube-apiserver.yaml” I0506 16:54:07.435127 166 manifests.go:91] [control-plane] getting StaticPodSpecs [control-plane] Creating static Pod manifest for “kube-controller-manager” I0506 16:54:07.435495 166 manifests.go:104] [control-plane] adding volume “ca-certs” for component “kube-controller-manager” I0506 16:54:07.435547 166 manifests.go:104] [control-plane] adding volume “etc-ca-certificates” for component “kube-controller-manager” I0506 16:54:07.435567 166 manifests.go:104] [control-plane] adding volume “flexvolume-dir” for component “kube-controller-manager” I0506 16:54:07.435589 166 manifests.go:104] [control-plane] adding volume “k8s-certs” for component “kube-controller-manager” I0506 16:54:07.435748 166 manifests.go:104] [control-plane] adding volume “kubeconfig” for component “kube-controller-manager” I0506 16:54:07.435764 166 manifests.go:104] [control-plane] adding volume “usr-local-share-ca-certificates” for component “kube-controller-manager” I0506 16:54:07.435788 166 manifests.go:104] [control-plane] adding volume “usr-share-ca-certificates” for component “kube-controller-manager” [control-plane] Creating static Pod 
manifest for “kube-scheduler” I0506 16:54:07.436691 166 manifests.go:121] [control-plane] wrote static Pod manifest for component “kube-controller-manager” to “/etc/kubernetes/manifests/kube-controller-manager.yaml” I0506 16:54:07.436792 166 manifests.go:91] [control-plane] getting StaticPodSpecs I0506 16:54:07.437037 166 manifests.go:104] [control-plane] adding volume “kubeconfig” for component “kube-scheduler” I0506 16:54:07.437718 166 manifests.go:121] [control-plane] wrote static Pod manifest for component “kube-scheduler” to “/etc/kubernetes/manifests/kube-scheduler.yaml” [etcd] Creating static Pod manifest for local etcd in “/etc/kubernetes/manifests” I0506 16:54:07.439832 166 local.go:72] [etcd] wrote Static Pod manifest for a local etcd member to “/etc/kubernetes/manifests/etcd.yaml” I0506 16:54:07.439886 166 waitcontrolplane.go:87] [wait-control-plane] Waiting for the API server to be healthy I0506 16:54:07.441481 166 loader.go:375] Config loaded from file: /etc/kubernetes/admin.conf [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory “/etc/kubernetes/manifests”. This can take up to 4m0s I0506 16:54:07.448725 166 round_trippers.go:443] GET https://kind-control-plane:6443/healthz?timeout=10s in 2 milliseconds … (GET to /healthz many times) [kubelet-check] Initial timeout of 40s passed. … (GET to /healthz many times) I0506 16:58:07.178131 166 round_trippers.go:443] GET https://kind-control-plane:6443/healthz?timeout=10s in 3 milliseconds couldn’t initialize a Kubernetes cluster k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:114 k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1 /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234 k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:422 k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207 k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1 /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:147 k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826 k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914 k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864 k8s.io/kubernetes/cmd/kubeadm/app.Run /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50 main.main _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25 runtime.main /usr/local/go/src/runtime/proc.go:203 runtime.goexit /usr/local/go/src/runtime/asm_amd64.s:1357 error execution phase wait-control-plane k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1 
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235 k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:422 k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207 k8s.io/kubernetes/cmd/kubeadm/app/cmd.NewCmdInit.func1 /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:147 k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:826 k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:914 k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:864 k8s.io/kubernetes/cmd/kubeadm/app.Run /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50 main.main _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25 runtime.main /usr/local/go/src/runtime/proc.go:203 runtime.goexit /usr/local/go/src/runtime/asm_amd64.s:1357Unfortunately, an error has occurred: timed out waiting for the condition
This error is likely caused by: - The kubelet is not running - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands: - ‘systemctl status kubelet’ - ‘journalctl -xeu kubelet’
Additionally, a control plane component may have crashed or exited when started by the container runtime. To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl: - ‘crictl --runtime-endpoint /run/containerd/containerd.sock ps -a | grep kube | grep -v pause’ Once you have found the failing container, you can inspect its logs with: - ‘crictl --runtime-endpoint /run/containerd/containerd.sock logs CONTAINERID’
So apparently the API server never becomes healthy: the repeated `GET https://kind-control-plane:6443/healthz?timeout=10s` checks keep failing until kubeadm's wait-control-plane phase times out.
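The kubeadm output above suggests checking the kubelet and the control-plane containers; from the host those checks translate to roughly the following (a sketch, assuming the default single-node container name `kind-control-plane`):

```bash
# Is the kubelet inside the node container running at all?
docker exec kind-control-plane systemctl status kubelet

# Kubelet journal, as suggested by kubeadm
docker exec kind-control-plane journalctl -xeu kubelet --no-pager

# List the control-plane containers containerd actually started
docker exec kind-control-plane crictl --runtime-endpoint /run/containerd/containerd.sock ps -a
```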
What you expected to happen: I expected the cluster to boot successfully.
How to reproduce it (as minimally and precisely as possible): As explained above.
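For convenience, the reproduction condensed into a shell sketch (assuming a working Go toolchain and kind v0.8.x on the host):

```bash
# Clone Kubernetes to the path kind's source build expects
git clone https://github.com/kubernetes/kubernetes.git "$(go env GOPATH)/src/k8s.io/kubernetes"

# Build a node image from the checkout (picks up the current master commit)
kind build node-image

# Create a cluster from the freshly built image; this is the step that fails
kind create cluster --image kindest/node:latest
```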
Anything else we need to know?:
If I simply run `kind create cluster`, a Kubernetes v1.18.2 cluster gets created successfully.
Follow the logs from the node container when running `kind create cluster --image kindest/node:latest` (a sketch of how to tail them is below) and notice that there are some "Failed to …" messages; they show up even for the successful `kind create cluster` case.
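One way to follow those logs from the host (a sketch; `kind-control-plane` is the default name of the single node container):

```bash
# Tail the node container's boot output (systemd, containerd, kubelet)
docker logs -f kind-control-plane
```

The node output: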
INFO: ensuring we can execute /bin/mount even with userns-remap INFO: remounting /sys read-only INFO: making mounts shared INFO: fix cgroup mounts for all subsystems INFO: clearing and regenerating /etc/machine-id Initializing machine ID from random generator. INFO: faking /sys/class/dmi/id/product_name to be “kind” INFO: faking /sys/class/dmi/id/product_uuid to be random INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well INFO: setting iptables to detected mode: legacy INFO: Detected IPv4 address: 172.19.0.2 INFO: Detected IPv6 address: fc00:f853:ccd:e793::2 Failed to find module ‘autofs4’ systemd 242 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid) Detected virtualization docker. Detected architecture x86-64. Failed to create symlink /sys/fs/cgroup/net_cls: File exists Failed to create symlink /sys/fs/cgroup/net_prio: File exists Failed to create symlink /sys/fs/cgroup/cpuacct: File exists Failed to create symlink /sys/fs/cgroup/cpu: File exists
Welcome to Ubuntu 19.10!
Set hostname to <kind-control-plane>. Failed to bump fs.file-max, ignoring: Invalid argument Configuration file /kind/systemd/kubelet.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway. Configuration file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway. [UNSUPP] Starting of Arbitrary Exec…Automount Point not supported. [ OK ] Listening on Journal Socket. [ OK ] Listening on Journal Socket (/dev/log). Starting Create list of re…odes for the current kernel… [ OK ] Reached target Slices. Mounting Huge Pages File System… [ OK ] Started Dispatch Password …ts to Console Directory Watch. [ OK ] Reached target Paths. [ OK ] Reached target Local Encrypted Volumes. [ OK ] Listening on Journal Audit Socket. Starting Journal Service… [ OK ] Reached target Sockets. Mounting FUSE Control File System… Mounting Kernel Debug File System… [ OK ] Reached target Swap. Starting Remount Root and Kernel File Systems… Starting Apply Kernel Variables… [ OK ] Started Create list of req… nodes for the current kernel. [ OK ] Mounted Huge Pages File System. [ OK ] Mounted FUSE Control File System. [ OK ] Mounted Kernel Debug File System. [ OK ] Started Remount Root and Kernel File Systems. Starting Create System Users… Starting Update UTMP about System Boot/Shutdown… [ OK ] Started Apply Kernel Variables. [ OK ] Started Update UTMP about System Boot/Shutdown. [ OK ] Started Create System Users. Starting Create Static Device Nodes in /dev… [ OK ] Started Create Static Device Nodes in /dev. [ OK ] Reached target Local File Systems (Pre). [ OK ] Reached target Local File Systems. [ OK ] Started Journal Service. [ OK ] Reached target System Initialization. [ OK ] Reached target Basic System. Starting containerd container runtime… [ OK ] Started kubelet: The Kubernetes Node Agent. [ OK ] Started Daily Cleanup of Temporary Directories. [ OK ] Reached target Timers. Starting Flush Journal to Persistent Storage… [ OK ] Started containerd container runtime. [ OK ] Reached target Multi-User System. [ OK ] Reached target Graphical Interface. Starting Update UTMP about System Runlevel Changes… [ OK ] Started Flush Journal to Persistent Storage. [ OK ] Started Update UTMP about System Runlevel Changes.
Environment:
- kind version (use `kind version`): both v0.8.0 and v0.8.1
- Kubernetes version (use `kubectl version`): kubectl is v1.18.0; Kubernetes is at commit 0a6c826d3e92dae8f20d6199d0ac7deeca9eed71 from master (the latest commit at the time of this writing)
- Docker version (use `docker info`): 19.03.8
- OS (e.g. from `/etc/os-release`): Mac OS X 10.14.6
About this issue
- State: closed
- Created 4 years ago
- Comments: 36 (32 by maintainers)
Update: there are some PRs in flight regarding klog.
Rollback doesn't seem to be an option; it's in too many repos, and they will want to roll forward.
I’m going to try to devote some more time to helping get these in soon.
Confirmed that this is fixed.
Same issue in a different context, in case it helps us narrow this down: https://github.com/kubernetes/kubernetes/issues/91795
I have just successfully created a cluster from the latest Kubernetes master branch. Thanks!
@BenTheElder yes it did! https://github.com/google/cadvisor/compare/8af10c683a96...6a8d61401ea9