kind: Amazon Linux 2: Unable to create multiple clusters using kind 0.20.0 with cgroup v1/cgroupns

What happened:

After upgrading from kind 0.19.0 to 0.20.0, I've been unable to create more than one cluster on the same host (cgroup v1 with cgroupns). The failure happens consistently while creating the second cluster. The logs extracted with kind export logs show many cgroup errors during the second cluster creation, such as:

Failed to attach 3329 to compat systemd cgroup /kubelet.slice/kubelet.service: No such file or directory

Failed to migrate controller cgroups from /kubelet.slice, ignoring: Input/output error

Those messages do not appear while creating the first cluster. The same setup worked fine with kind 0.19.0.
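To check whether an exported log directory contains the two errors quoted above, a search along these lines may help. This is only a sketch: the function name `find_cgroup_errors` is hypothetical, and the pattern covers just the two messages shown in this report.

```shell
# Sketch: print every log line that matches one of the two systemd cgroup
# errors quoted in this report. The function name is hypothetical.
find_cgroup_errors() {
  # $1: directory of exported logs
  # -r: recurse into subdirectories, -h: omit file names, -E: extended regex
  # "|| true" keeps the exit status at zero when nothing matches.
  grep -rhE 'Failed to (attach [0-9]+ to compat systemd cgroup|migrate controller cgroups)' "$1" || true
}
```

For example, after running `kind export logs ./logs-k8s-test-2 --name k8s-test-2` (the output directory name is an assumption), `find_cgroup_errors ./logs-k8s-test-2 | wc -l` gives the number of matching lines.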

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

kind create cluster --name k8s-test-1 --config <base config specifying one control plane node>
kind create cluster --name k8s-test-2 --config <base config specifying one control plane node>
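The report does not include the config file itself, so the following is only a guess at a minimal equivalent: a kind config with a single control-plane node, plus a helper that creates the two clusters in sequence. The function names and the config path are hypothetical.

```shell
# Sketch of the reproduction. The actual config used in the report is not
# shown; this assumes the smallest config with one control-plane node.
write_single_node_config() {
  # $1: path of the config file to write (hypothetical, caller's choice)
  cat > "$1" <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
EOF
}

create_two_clusters() {
  # $1: path of the kind config file
  # Per the report, the second create is the one that fails on kind 0.20.0.
  kind create cluster --name k8s-test-1 --config "$1" &&
    kind create cluster --name k8s-test-2 --config "$1"
}
```

Usage would be `write_single_node_config ./kind-config.yaml` followed by `create_two_clusters ./kind-config.yaml`, with `kind delete cluster --name k8s-test-1` and `kind delete cluster --name k8s-test-2` to clean up afterwards.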

Anything else we need to know?:

2023-08-29T20:32:18.1955697Z INFO: ensuring we can execute mount/umount even with userns-remap
2023-08-29T20:32:18.1955844Z INFO: remounting /sys read-only
2023-08-29T20:32:18.1955950Z INFO: making mounts shared
2023-08-29T20:32:18.1956050Z INFO: detected cgroup v1
2023-08-29T20:32:18.1956146Z INFO: detected cgroupns
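To confirm on the host itself that it runs cgroup v1, as the node log above reports, one common check is the filesystem type mounted at /sys/fs/cgroup. The sketch below is not how kind performs its own detection; the function names are hypothetical, and a hybrid hierarchy also reports tmpfs, so it is counted as v1 here.

```shell
# Sketch: map the filesystem type of /sys/fs/cgroup to a cgroup version.
# cgroup2fs means the unified (v2) hierarchy; tmpfs means v1 (or hybrid).
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

host_cgroup_version() {
  # stat -f queries the filesystem; %T prints its type in human-readable form.
  cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup)"
}
```

On a host like the one in this report, `host_cgroup_version` would be expected to print v1.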

Environment:

  • kind version: (use kind version):

kind v0.20.0 go1.21.0 linux/amd64

  • Runtime info: (use docker info or podman info):

2023-08-29T20:32:18.1984212Z 🗃️ /tmp/kind-logs3700965724/docker-info.txt
2023-08-29T20:32:18.1984293Z Client:
2023-08-29T20:32:18.1984394Z Context: default
2023-08-29T20:32:18.1984489Z Debug Mode: false
2023-08-29T20:32:18.1984900Z Plugins:
2023-08-29T20:32:18.1985053Z buildx: Docker Buildx (Docker Inc., v0.9.1)
2023-08-29T20:32:18.1985061Z
2023-08-29T20:32:18.1985136Z Server:
2023-08-29T20:32:18.1985230Z Containers: 2
2023-08-29T20:32:18.1985318Z Running: 2
2023-08-29T20:32:18.1985405Z Paused: 0
2023-08-29T20:32:18.1985492Z Stopped: 0
2023-08-29T20:32:18.1985576Z Images: 895
2023-08-29T20:32:18.1985680Z Server Version: 20.10.23
2023-08-29T20:32:18.1985788Z Storage Driver: overlay2
2023-08-29T20:32:18.1985897Z Backing Filesystem: xfs
2023-08-29T20:32:18.1986004Z Supports d_type: true
2023-08-29T20:32:18.1986120Z Native Overlay Diff: true
2023-08-29T20:32:18.1986216Z userxattr: false
2023-08-29T20:32:18.1986361Z Logging Driver: json-file
2023-08-29T20:32:18.1986465Z Cgroup Driver: cgroupfs
2023-08-29T20:32:18.1986557Z Cgroup Version: 1
2023-08-29T20:32:18.1986639Z Plugins:
2023-08-29T20:32:18.1986734Z Volume: local
2023-08-29T20:32:18.1986896Z Network: bridge host ipvlan macvlan null overlay
2023-08-29T20:32:18.1987202Z Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
2023-08-29T20:32:18.1987293Z Swarm: inactive
2023-08-29T20:32:18.1987506Z Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
2023-08-29T20:32:18.1987606Z Default Runtime: runc
2023-08-29T20:32:18.1987743Z Init Binary: docker-init
2023-08-29T20:32:18.1987917Z containerd version: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f
2023-08-29T20:32:18.1988076Z runc version: f19387a6bec4944c770f7668ab51c4348d9c2f38
2023-08-29T20:32:18.1988175Z init version: de40ad0
2023-08-29T20:32:18.1988271Z Security Options:
2023-08-29T20:32:18.1988353Z seccomp
2023-08-29T20:32:18.1988456Z Profile: default
2023-08-29T20:32:18.1988639Z Kernel Version: 4.14.322-244.536.amzn2.x86_64
2023-08-29T20:32:18.1988758Z Operating System: Amazon Linux 2
2023-08-29T20:32:18.1988846Z OSType: linux
2023-08-29T20:32:18.1988948Z Architecture: x86_64
2023-08-29T20:32:18.1989028Z CPUs: 16
2023-08-29T20:32:18.1989127Z Total Memory: 30.62GiB
2023-08-29T20:32:18.1989336Z Name: ip-10-102-229-153.us-west-1.compute.internal
2023-08-29T20:32:18.1989507Z ID: WZTV:2HIE:HWCS:H3L6:WXPF:4ARE:AN7L:RWDH:CQYJ:WL47:W76G:EXLV
2023-08-29T20:32:18.1989684Z Docker Root Dir: /var/lib/docker
2023-08-29T20:32:18.1989776Z Debug Mode: false
2023-08-29T20:32:18.1989918Z Registry: https://index.docker.io/v1/
2023-08-29T20:32:18.1989996Z Labels:
2023-08-29T20:32:18.1990137Z Experimental: false
2023-08-29T20:32:18.1990238Z Insecure Registries:
2023-08-29T20:32:18.1990323Z 127.0.0.0/8
2023-08-29T20:32:18.1990432Z Live Restore Enabled: false

  • OS (e.g. from /etc/os-release): Amazon Linux 2
  • Kubernetes version: (use kubectl version): 1.27.3
  • Any proxies or other special environment settings?: n/a

kind-logs.txt.zip

About this issue

  • State: closed
  • Created 10 months ago
  • Reactions: 1
  • Comments: 15 (9 by maintainers)

Most upvoted comments

That’s what we’re looking into

AL2 was originally scheduled to reach end of support two months ago (see https://aws.amazon.com/amazon-linux-2/faqs/). Support was recently extended from 2023-06-30 to 2025-06-30 to provide more time for migration to AL2023.

I would recommend moving to AL2023 where I’ve confirmed this is not an issue.