kind: kind create cluster fails on macOS + Docker Desktop
What happened:
~ kind create cluster
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.25.3) 🖼
✓ Preparing nodes 📦
✓ Writing configuration 📜
✗ Starting control-plane 🕹️
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I0111 12:08:48.281449 132 initconfiguration.go:254] loading configuration from "/kind/kubeadm.conf"
W0111 12:08:48.282858 132 initconfiguration.go:331] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.25.3
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0111 12:08:48.289153 132 certs.go:112] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
I0111 12:08:48.393822 132 certs.go:522] validating certificate period for ca certificate
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kind-control-plane kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local localhost] and IPs [10.96.0.1 172.23.0.2 127.0.0.1]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0111 12:08:48.827200 132 certs.go:112] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
I0111 12:08:49.033611 132 certs.go:522] validating certificate period for front-proxy-ca certificate
[certs] Generating "front-proxy-client" certificate and key
I0111 12:08:49.219289 132 certs.go:112] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
I0111 12:08:49.319645 132 certs.go:522] validating certificate period for etcd/ca certificate
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.23.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kind-control-plane localhost] and IPs [172.23.0.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0111 12:08:49.742862 132 certs.go:78] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
I0111 12:08:49.876848 132 kubeconfig.go:103] creating kubeconfig file for admin.conf
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
I0111 12:08:50.089378 132 kubeconfig.go:103] creating kubeconfig file for kubelet.conf
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0111 12:08:50.161976 132 kubeconfig.go:103] creating kubeconfig file for controller-manager.conf
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0111 12:08:50.598000 132 kubeconfig.go:103] creating kubeconfig file for scheduler.conf
[kubeconfig] Writing "scheduler.conf" kubeconfig file
I0111 12:08:50.653232 132 kubelet.go:66] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0111 12:08:50.770349 132 manifests.go:99] [control-plane] getting StaticPodSpecs
I0111 12:08:50.770602 132 certs.go:522] validating certificate period for CA certificate
I0111 12:08:50.770699 132 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0111 12:08:50.770705 132 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0111 12:08:50.770709 132 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0111 12:08:50.770712 132 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0111 12:08:50.770716 132 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0111 12:08:50.773904 132 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
I0111 12:08:50.773964 132 manifests.go:99] [control-plane] getting StaticPodSpecs
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0111 12:08:50.774644 132 manifests.go:125] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0111 12:08:50.774687 132 manifests.go:125] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0111 12:08:50.774693 132 manifests.go:125] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0111 12:08:50.774697 132 manifests.go:125] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0111 12:08:50.774701 132 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0111 12:08:50.774705 132 manifests.go:125] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0111 12:08:50.774709 132 manifests.go:125] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0111 12:08:50.776035 132 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
I0111 12:08:50.776106 132 manifests.go:99] [control-plane] getting StaticPodSpecs
I0111 12:08:50.776262 132 manifests.go:125] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0111 12:08:50.776771 132 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0111 12:08:50.777457 132 local.go:65] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0111 12:08:50.777499 132 waitcontrolplane.go:83] [wait-control-plane] Waiting for the API server to be healthy
I0111 12:08:50.777977 132 loader.go:374] Config loaded from file: /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I0111 12:08:50.781500 132 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s in 1 milliseconds
I0111 12:08:51.282917 132 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s in 0 milliseconds
I0111 12:08:51.783460 132 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s in 1 milliseconds
I0111 12:08:52.283225 132 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s in 0 milliseconds
.....
........
...............
I0111 12:12:50.788384 132 round_trippers.go:553] GET https://kind-control-plane:6443/healthz?timeout=10s in 0 milliseconds
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:154
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:974
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
cmd/kubeadm/app/cmd/init.go:154
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:974
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
What you expected to happen: The cluster is created successfully.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Additional logs are attached to the issue as a .zip archive: kind-control-plane.zip
Environment:
- kind version: (use kind version):
~ kind --version
kind version 0.17.0
- Runtime info: (use docker info or podman info): see the docker info output below
- OS (e.g. from /etc/os-release): macOS + Docker Desktop (Docker Desktop 4.15.0 (93002) is currently the newest version available.)
- Kubernetes version: (use kubectl version):
~ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.2", GitCommit:"5835544ca568b757a8ecae5c153f317e5736700e", GitTreeState:"clean", BuildDate:"2022-09-21T14:33:49Z", GoVersion:"go1.19.1", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.6", GitCommit:"ad3338546da947756e8a88aa6822e9c11e7eac22", GitTreeState:"clean", BuildDate:"2022-04-14T08:43:11Z", GoVersion:"go1.17.9", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.25) and server (1.23) exceeds the supported minor version skew of +/-1
~ docker info
Client:
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc., v0.9.1)
compose: Docker Compose (Docker Inc., v2.13.0)
dev: Docker Dev Environments (Docker Inc., v0.0.5)
extension: Manages Docker extensions (Docker Inc., v0.2.16)
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
scan: Docker Scan (Docker Inc., v0.22.0)
Server:
Containers: 10
Running: 1
Paused: 0
Stopped: 9
Images: 26
Server Version: 20.10.21
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 770bd0108c32f3fb5c73ae1264f7e503fe7b2661
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.15.49-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.675GiB
Name: docker-desktop
ID: ERYV:IPQQ:OQAQ:GX7W:XYED:ICON:W4GJ:A3V2:45F5:GTRB:OY3H:IFZZ
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
hubproxy.docker.internal:5000
127.0.0.0/8
Live Restore Enabled: false
- Any proxies or other special environment settings?: nope
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 22 (12 by maintainers)
I’ve just stumbled across the same issue (also using my daily-driver Mac with Docker Desktop) and did some debugging. One thing that immediately caught my eye is that my journal.log was full of error messages like this:
Jan 20 16:19:32 kind-control-plane kubelet[184]: E0120 16:19:32.302684 184 pod_workers.go:965] "Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-scheduler-kind-control-plane_kube-system(6d3dda2cad9846e0d48dbd5d5b9f59fc)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-scheduler-kind-control-plane_kube-system(6d3dda2cad9846e0d48dbd5d5b9f59fc)\\\": rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: expected cgroupsPath to be of format \\\"slice:prefix:name\\\" for systemd cgroups, got \\\"/kubelet/kubepods/burstable/pod6d3dda2cad9846e0d48dbd5d5b9f59fc/5bc1337ac55d891c743783740d686e714686e063bb37969ac965f44f2ab091de\\\" instead: unknown\"" pod="kube-system/kube-scheduler-kind-control-plane" podUID=6d3dda2cad9846e0d48dbd5d5b9f59fc
While doing some research on that error, I found this thread which explained the cause nicely: https://github.com/containerd/containerd/issues/4857#issuecomment-747238907. So next, I did what was suggested in that post - I created a custom KubeletConfiguration that explicitly sets systemd as the cgroupDriver and put that into my kind config file. And now it actually works for me again. Here is the relevant part of my kind config for reference:
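(The exact snippet from the comment is not reproduced here; a minimal sketch of such a config, assuming the standard kind Cluster config format with a cluster-wide kubeadmConfigPatches entry that injects a KubeletConfiguration setting cgroupDriver: systemd, and a single-node layout chosen purely for illustration, would look roughly like this:)

# kind cluster config sketch (illustrative, not the commenter's exact file)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
# Patch the kubeadm-generated kubelet configuration so the kubelet uses the
# systemd cgroup driver, matching what containerd/runc expect for systemd cgroups.
kubeadmConfigPatches:
  - |
    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    cgroupDriver: systemd
nodes:
  - role: control-plane

Such a file would then be passed with kind create cluster --config <path-to-config>, which applies the patch when the node is initialized.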
Hope this helps you out!
@BenTheElder I can report that I also started running into this same problem, where the health check fails when setting up the control plane.
I recently returned to a project after not touching it for a few months. It had been using Docker Desktop and kind without issue on macOS ARM. The Docker Desktop logs were reporting issues with privileged ports (which were enabled).
Tried:
Then I went back to Docker Desktop (v4.21.1) and it worked. I suspected that updating the nix channel in my project did the trick, since it bumped a few deps.
So I dropped back down and it still works. I removed the cgroup patch, and it still works. So I’m a bit clueless as to what resolved it. 🤷
Hopefully this is useful info. I’m unstuck for now, so no worries here.
@vallpaper that would be #2718