kind: Cannot create a cluster when running on BTRFS + LUKS encryption

What happened:

When starting a kind cluster on an encrypted btrfs root partition, the control-plane won’t start because of an error in the kubelet:

Aug 11 07:33:59 kind-control-plane kubelet[833]: W0811 07:33:59.653820     833 fs.go:588] stat failed on /dev/mapper/luks-a389c146-db36-4c96-bcbc-0fa3f5f3fcd1 with error: no such file or directory
Aug 11 07:33:59 kind-control-plane kubelet[833]: E0811 07:33:59.653846     833 kubelet.go:1423] "Failed to start ContainerManager" err="failed to get rootfs info: failed to get device for dir \"/var/lib/kubelet\": could not find device with major: 0, minor: 40 in cached partitions map"
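
The major/minor pair in the second message is the device number reported for the filesystem that holds /var/lib/kubelet. As a minimal sketch (not the kubelet's own code), the same pair can be read for any directory with Python's os.stat, os.major and os.minor. The helper name device_numbers is hypothetical. Btrfs typically reports an anonymous device number with major 0 here, which matches the "major: 0" in the log and does not correspond to a block device node.

```python
import os


def device_numbers(path):
    """Return the (major, minor) device number of the filesystem holding path."""
    st_dev = os.stat(path).st_dev
    return os.major(st_dev), os.minor(st_dev)


if __name__ == "__main__":
    # /var/lib/kubelet only exists on a node; fall back to / elsewhere.
    target = "/var/lib/kubelet" if os.path.exists("/var/lib/kubelet") else "/"
    major, minor = device_numbers(target)
    print(f"{target}: major={major} minor={minor}")
```

Running this inside the node and on the host makes it easier to compare the reported pair with the one in the kubelet error.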

On the host the luks path is a symlink:

ls -la /dev/mapper
total 0
drwxr-xr-x.  2 root root      80 Aug 11 08:43 .
drwxr-xr-x. 21 root root    4600 Aug 11 08:44 ..
crw-------.  1 root root 10, 236 Aug 11 08:43 control
lrwxrwxrwx.  1 root root       7 Aug 11 08:43 luks-a389c146-db36-4c96-bcbc-0fa3f5f3fcd1 -> ../dm-0

Because this path cannot be resolved inside the node container, the kubelet fails to start.
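
Resolving the symlink on the host shows which device node would have to exist inside the node container. The following is a small sketch, assuming Python is available on the host; resolve_device is a hypothetical helper name and relies only on os.path.realpath.

```python
import os


def resolve_device(path):
    """Follow symlinks and return the canonical path of a device node."""
    return os.path.realpath(path)


if __name__ == "__main__":
    mapper_dir = "/dev/mapper"
    if os.path.isdir(mapper_dir):
        # Print every symlink under /dev/mapper together with its real target.
        for name in sorted(os.listdir(mapper_dir)):
            entry = os.path.join(mapper_dir, name)
            if os.path.islink(entry):
                print(f"{entry} -> {resolve_device(entry)}")
```

For the listing above, the luks-a389c146-db36-4c96-bcbc-0fa3f5f3fcd1 entry points at ../dm-0, so it resolves to /dev/dm-0.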

What you expected to happen:

All paths required inside kind should be mapped into the node.

How to reproduce it (as minimally and precisely as possible):

Attempt to create a cluster on an encrypted root partition - in my case I simply installed Fedora and chose to encrypt the system in the installer.

Anything else we need to know?:

The issue is simple to work around by also mounting the missing device path into the node container.

With the following configuration it will work:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /dev/dm-0
      containerPath: /dev/dm-0
      propagation: HostToContainer

Environment:

  • kind version: (use kind version): kind v0.11.1 go1.16.4 linux/amd64

  • Kubernetes version: (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-07-12T20:40:20Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info): not running docker, but rootless podman
  • OS (e.g. from /etc/os-release):
NAME=Fedora
VERSION="34 (Workstation Edition)"
ID=fedora
VERSION_ID=34
VERSION_CODENAME=""
PLATFORM_ID="platform:f34"
PRETTY_NAME="Fedora 34 (Workstation Edition)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:34"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/34/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=34
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=34
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 15 (10 by maintainers)

Most upvoted comments

@bergmannf I can confirm that. kind create cluster worked without any further configuration or anything special on my Fedora 38, with either Docker version 23.0.1, build a5ee5b1, or rootless podman version 4.4.4 hosting the kind cluster, and with kind version 0.18.0, using BTRFS on LUKS.

Something somewhere done by somebody fixed this.

I did

KIND_EXPERIMENTAL_PROVIDER=podman ./kind-linux-amd64 create cluster
kubectl run my-shell --rm -i --tty --image ubuntu -- bash

I dug a little deeper, since in my case no symlink is missing; my btrfs device is simply not mounted into the node automatically. I use btrfs without LUKS, so there are no /dev/mapper devices. Instead I have a partitioned disk that looks like this (from the fstab perspective):

# /dev/nvme0n1p2
UUID=3e04c83b-1d81-4159-9411-b4ad5bdef790	/         	btrfs     	rw,relatime,discard=async,ssd,space_cache,subvolid=256,subvol=/@,subvol=@	0 0

# /dev/nvme0n1p2
UUID=3e04c83b-1d81-4159-9411-b4ad5bdef790	/home     	btrfs     	rw,relatime,discard=async,ssd,space_cache,subvolid=257,subvol=/@home,subvol=@home	0 0

Therefore the solution worked out in #1416 does not work in that setup. I’m using btrfs as storageDriver as well. Providing /dev/nvme0n1p2 as an extraMount successfully works around that glitch.
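
To find out which device would need to be passed as an extraMount on a given host, one option is to look up the mount that contains the directory in question. The sketch below assumes input in the /proc/mounts format ("device mountpoint fstype options ...") and ignores escaped spaces in mount points; backing_device is a hypothetical helper name, and the directory to inspect depends on where the container storage lives on the host.

```python
import os


def backing_device(mounts_text, path):
    """Return the source device of the longest mount point that contains path."""
    best_mount, best_device = "", None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point override an earlier one.
        if wanted.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_device = mount_point, device
    return best_device


if __name__ == "__main__":
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            print(backing_device(handle.read(), "/var/lib"))
```

With mounts like the fstab entries above, a path under / would map to /dev/nvme0n1p2.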

Seeing the same thing as @dahrens … a stock Fedora installation with BTRFS everywhere. Using the following config file seems to have worked.

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /dev/nvme0n1p3
      containerPath: /dev/nvme0n1p3
      propagation: HostToContainer
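
The configurations posted in this thread differ only in the device path. As a rough sketch, the same file can be rendered for any device; render_kind_config is a hypothetical helper, and it emits only the fields already shown in the thread.

```python
def render_kind_config(device):
    """Render a kind cluster config that mounts device into the control-plane node."""
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a path under /dev, got {device!r}")
    return (
        "kind: Cluster\n"
        "apiVersion: kind.x-k8s.io/v1alpha4\n"
        "nodes:\n"
        "- role: control-plane\n"
        "  extraMounts:\n"
        f"    - hostPath: {device}\n"
        f"      containerPath: {device}\n"
        "      propagation: HostToContainer\n"
    )


if __name__ == "__main__":
    print(render_kind_config("/dev/nvme0n1p3"), end="")
```

The output can be saved to a file and passed to kind create cluster with the --config flag.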

I appreciate that this may be hard to resolve automatically, but it would be good to document it. What would it take to get this added to the “known issues” page? And can someone perhaps explain the nature of the problem? I get that it’s failing because something inside the control plane wants access to the host filesystem, but I don’t understand why it cares what’s happening at the device layer.