kubernetes: Kubelet can't start inside LXD on a BTRFS volume

Required information

  • Distribution: Debian
  • Distribution version: 10.0 (Buster)
  • Kubernetes version: 1.15.1
  • Docker version: 18.09
  • Kernel version: 4.19.0-5-amd64
  • LXC version: snap
  • Storage backend in use: BTRFS

Issue description

I’m trying to spin up a Kubernetes cluster inside LXD-managed containers. My LXD storage pool uses the BTRFS backend. Everything goes smoothly until I actually run kubeadm init, which should bring up the kubelet and start the infrastructure pods. But the kubelet daemon simply dies and doesn’t spin up anything.
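
For context, the failing sequence inside the container looks roughly like this (a sketch; the exact kubeadm flags don’t matter for the failure):

 kubeadm init            # writes the kubelet config and (re)starts the kubelet.service unit
 journalctl -u kubelet   # where the errors below show up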

The two major problems are:

Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain kubelet[673]: I0725 21:48:23.606870     673 server.go:144] Starting to listen on 0.0.0.0:10250
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain kubelet[673]: F0725 21:48:23.607149     673 kubelet.go:1405] Failed to start OOM watcher open /dev/kmsg: no such file or directory
Jul 25 21:48:23 onekube-ip-192-168-15-2.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION

and

Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain kubelet[796]: W0725 21:49:55.139425     796 fs.go:544] stat failed on /dev/nvme0n1p5 with error: no such file or directory
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain kubelet[796]: F0725 21:49:55.139431     796 kubelet.go:1370] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 70 in cached partitions map
Jul 25 21:49:55 onekube-ip-192-168-15-2.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION

So the kubelet is simply trying to access /dev/kmsg and /dev/nvme0n1p5, the block device backing my BTRFS pool.

I’m using the following settings, taken from this guide: https://github.com/corneliusweig/kubernetes-lxd

config:
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw
    sys:rw"
  security.privileged: "true"
  security.nesting: "true"
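
For completeness, this is roughly how I apply those settings (a sketch; the profile name kube is just an example):

 lxc profile create kube   # if it doesn't exist yet
 lxc profile set kube linux.kernel_modules ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
 lxc profile set kube security.privileged true
 lxc profile set kube security.nesting true
 lxc profile set kube raw.lxc "$(printf 'lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw sys:rw')"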

I can proceed further if I manually add those devices to /dev with mknod. That would be kind of OK in the case of /dev/kmsg, but in the case of BTRFS, different machines might have different block device names, major and minor numbers, and so on, and I want to be able to provision containers automatically without worrying about hardware details. Also, I don’t feel that exposing the whole block device is “secure” by any means.
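
For reference, the manual workaround looks roughly like this (a sketch: /dev/kmsg is always char 1:11, but the block device numbers are examples from my machine and will differ elsewhere):

 # inside the container
 mknod /dev/kmsg c 1 11

 # find the host's major/minor for the device backing the pool, e.g. on the host:
 #   ls -l /dev/nvme0n1p5        # the two numbers before the date are major, minor
 # then recreate the node inside the container with those values (259 5 is just an example):
 mknod /dev/nvme0n1p5 b 259 5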

Any suggestions?

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 3
  • Comments: 19 (3 by maintainers)

Most upvoted comments

This still occurs using Ubuntu 20.04, k3s 1.22, and btrfs.

/reopen

The issue still exists

Please do not mark this issue as stale. I still have this problem.

The current workaround is to create a storage pool of type dir.

-> % lxc storage list
+---------+-------------+--------+--------------------------------+---------+
|  NAME   | DESCRIPTION | DRIVER |             SOURCE             | USED BY |
+---------+-------------+--------+--------------------------------+---------+
| default |             | btrfs  | /var/lib/lxd/disks/default.img | 3       |
+---------+-------------+--------+--------------------------------+---------+
| k8s     |             | dir    | /var/lib/lxd/storage-pools/k8s | 4       |
+---------+-------------+--------+--------------------------------+---------+

Create the pool with lxc storage create k8s dir (you may need to tune it to your setup). Then set it as the root pool in your k8s profile, like so (CLI equivalents are shown after the profile):

-> % lxc profile show k8s
config:
  limits.cpu: "2"
  limits.memory: 2GB
  limits.memory.swap: "false"
  linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
  raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw
    sys:rw"
  security.nesting: "true"
  security.privileged: "true"
description: LXD profile for Kubernetes
devices:
  eth0:
    name: eth0
    nictype: bridged
    parent: lxdbr0
    type: nic
  root:
    path: /
    pool: k8s
    type: disk
name: k8s
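
If you prefer the CLI to editing the profile YAML, the equivalent is roughly this (adjust names to your setup):

 lxc storage create k8s dir
 lxc profile device add k8s root disk path=/ pool=k8s
 # or, if the profile already has a root device:
 lxc profile device set k8s root pool k8s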

And make sure you launch your containers with:

 lxc launch images:centos/7 kworker2 --profile=k8s

Hope this helps.