kubernetes: Kubelet can't start inside LXD on a BTRFS volume
Required information
- Distribution: Debian
- Distribution version: 10.0 (Buster)
- Kubernetes version: 1.15.1
- Docker version: 18.09
- Kernel version: 4.19.0-5-amd64
- LXC version: snap
- Storage backend in use: BTRFS
Issue description
I’m trying to spin up a Kubernetes cluster inside LXD-managed containers. My LXD storage pool uses the BTRFS backend. Everything goes smoothly until I actually run kubeadm init, which should bring up the kubelet and start the infrastructure pods. But the kubelet daemon simply dies and doesn’t start anything.
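For context, this is roughly the sequence that triggers the failure; the kubelet logs below are ordinary journald output, so they can be followed like this (any kubeadm flags are omitted):
# inside the LXD container, as root
kubeadm init
# in another shell, follow the kubelet unit
journalctl -u kubelet -f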
The two major problems are:
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain kubelet[673]: I0725 21:48:23.606870 673 server.go:144] Starting to listen on 0.0.0.0:10250
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain kubelet[673]: F0725 21:48:23.607149 673 kubelet.go:1405] Failed to start OOM watcher open /dev/kmsg: no such file or directory
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
and
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain kubelet[796]: W0725 21:49:55.139425 796 fs.go:544] stat failed on /dev/nvme0n1p5 with error: no such file or directory
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain kubelet[796]: F0725 21:49:55.139431 796 kubelet.go:1370] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 70 in cached partitions map
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
So the kubelet is simply trying to access /dev/kmsg and /dev/nvme0n1p5, which is the block device containing my BTRFS pool.
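As an aside, the 0:70 pair in the second error is the major:minor of the filesystem backing /var/lib/kubelet (major 0 is what the kernel uses for anonymous block devices, such as BTRFS subvolume mounts). One way to see which device the kubelet is looking for is to inspect the mount table inside the container; the sample output line is illustrative:
# the third field of each mountinfo line is the device's major:minor
grep /var/lib/kubelet /proc/self/mountinfo
# e.g.: 70 24 0:70 / /var/lib/kubelet rw,relatime - btrfs /dev/nvme0n1p5 rw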
I’m using the following settings from this guide: https://github.com/corneliusweig/kubernetes-lxd
config:
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw sys:rw"
  security.privileged: "true"
  security.nesting: "true"
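For completeness, one way to apply a config block like this is to load it into an LXD profile; the profile name k8s and the file name are assumptions:
# create a profile and replace its contents with the YAML above
lxc profile create k8s
lxc profile edit k8s < k8s-profile.yaml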
I can proceed further if I manually add those devices to /dev with mknod (a sketch follows). That would be more or less acceptable in the case of /dev/kmsg, but in the case of BTRFS, a different machine might have a different block device name, major and minor numbers, and so on, and I want to be able to provision containers automatically without worrying about hardware details. Also, I don’t feel that exposing the whole block device is “secure” by any means.
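For reference, the manual workaround looks roughly like this; /dev/kmsg is always char 1:11, but the numbers for the BTRFS partition are illustrative and have to be read off the host first (e.g. with ls -l /dev/nvme0n1p5):
# run inside the container, as root
mknod /dev/kmsg c 1 11
# block device numbers vary per machine; 259:5 is only an example
mknod /dev/nvme0n1p5 b 259 5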
Any suggestions?
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 19 (3 by maintainers)
This still occurs using Ubuntu 20.04, k3s 1.22, and btrfs.
/reopen
The issue still exists
Please do not mark this issue as stale. I still have this problem.
The current workaround is to create a storage pool of type dir. Create the pool with lxc storage create k8s dir (you may need to tune it to your settings). Then set it to be used on your k8s profile, and make sure you launch your containers with that profile; see the sketch below.
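A minimal sketch, assuming the pool and profile are both named k8s, stacked on top of the default profile, and with an illustrative image and container name:
# dir-backed pool, used as the root disk of the k8s profile
lxc storage create k8s dir
lxc profile device add k8s root disk path=/ pool=k8s
# the k8s profile's root device overrides the default profile's
lxc launch ubuntu:20.04 kube-node-1 --profile default --profile k8s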
Hope this helps.