kubernetes: Kubelet can't start inside LXD on a BTRFS volume
Required information
- Distribution: Debian
- Distribution version: 10.0 (Buster)
- Kubernetes version: 1.15.1
- Docker version: 18.09
- Kernel version: 4.19.0-5-amd64
- LXC version: snap
- Storage backend in use: BTRFS
Issue description
I’m trying to spin up a Kubernetes cluster inside LXD-managed containers. My LXD storage pool uses the BTRFS backend. Everything goes smoothly until I actually run `kubeadm init`, which should bring up the `kubelet` and start the infrastructure pods. But the `kubelet` daemon simply dies and doesn’t spin up anything.
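For reference, the sequence looks roughly like this (container and image names here are illustrative, not my exact setup):

```bash
# Launch a container using the k8s profile described below
lxc launch images:debian/10 kube-master --profile default --profile k8s

# Bootstrap the control plane inside it; this is the step that
# starts (and immediately crashes) the kubelet
lxc exec kube-master -- kubeadm init
```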
The two major problems are:
```
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain kubelet[673]: I0725 21:48:23.606870 673 server.go:144] Starting to listen on 0.0.0.0:10250
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain kubelet[673]: F0725 21:48:23.607149 673 kubelet.go:1405] Failed to start OOM watcher open /dev/kmsg: no such file or directory
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
```
and
```
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain kubelet[796]: W0725 21:49:55.139425 796 fs.go:544] stat failed on /dev/nvme0n1p5 with error: no such file or directory
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain kubelet[796]: F0725 21:49:55.139431 796 kubelet.go:1370] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 70 in cached partitions map
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
```
So `kubelet` is simply trying to access `/dev/kmsg` and `/dev/nvme0n1p5`, which is the block device backing my BTRFS storage pool.
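Note the major number 0 in the second error: as far as I can tell, BTRFS mounts get an anonymous device number, which never appears in `/proc/partitions`, so cAdvisor's device lookup is bound to fail. You can confirm the numbers from inside the container (a quick check, assuming a standard `/proc`):

```bash
# Column 3 of /proc/self/mountinfo is the major:minor of each mount;
# the root mount is the one backing /var/lib/kubelet here
awk '$5 == "/" {print $3}' /proc/self/mountinfo
# -> 0:70, matching the "could not find device" error above

# ...while the device node kubelet stats against is absent
ls -l /dev/nvme0n1p5    # No such file or directory
```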
I’m using the following profile settings from this guide: https://github.com/corneliusweig/kubernetes-lxd
```yaml
config:
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw sys:rw"
  security.privileged: "true"
  security.nesting: "true"
```
I can proceed further if I manually add those devices to `/dev` with `mknod`. That would be kind of OK in the case of `/dev/kmsg`, but in the case of BTRFS a different machine might have a different block device name, major and minor numbers, and so on, and I want to be able to provision containers automatically without worrying about hardware details. Also, I don’t feel like exposing the whole block device is “secure” by any means.
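For reference, the manual workaround looks roughly like this; `/dev/kmsg` is the stable part, while the block device numbers are specific to this machine:

```bash
# /dev/kmsg is character device 1:11 on every Linux system
mknod /dev/kmsg c 1 11

# The pool's backing device is NOT stable: 259:5 happens to be
# nvme0n1p5 on this box, but name and numbers vary per machine
mknod /dev/nvme0n1p5 b 259 5
```

For `/dev/kmsg` alone there is also the LXD-native route of passing the host node through as a unix-char device, which survives container restarts:

```bash
lxc config device add <container> kmsg unix-char path=/dev/kmsg
```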
Any suggestions?
About this issue
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 19 (3 by maintainers)
This still occurs using Ubuntu 20.04, k3s 1.22, and btrfs.
/reopen
The issue still exists
Please do not mark this issue as stale. I still have this problem.
The current workaround is to create a storage pool of type `dir`. Create the pool with `lxc storage create k8s dir` (you may need to tune it to your settings), set it as the root disk on your `k8s` profile, and launch your containers with that profile; see the sketch below.
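A minimal sketch of the full sequence (image and container names are assumed; adjust the pool settings to taste):

```bash
# 1. Create a dir-backed storage pool
lxc storage create k8s dir

# 2. Make it the root disk of the k8s profile
#    (assumes the profile has no root device yet)
lxc profile device add k8s root disk path=/ pool=k8s

# 3. Launch containers against that profile
lxc launch ubuntu:20.04 kube-node-1 --profile default --profile k8s
```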
Hope this helps.