kubernetes: Kubelet cannot finish terminating a pod that uses a PVC with `volumeMode: Block` if the underlying block device is accidentally disconnected from the node
What happened?
A pod uses a PVC with `volumeMode: Block`. The block volume is backed by a block device on the node (created by some CSI storage providers). If the block device backing the volume is accidentally disconnected from the node (due to a network or hardware problem, etc.), the user cannot delete the pod. The pod stays stuck in Terminating forever because kubelet refuses to unmap the block volume. This prevents the user from cleaning up the pod and spinning up a replacement pod, leading to prolonged service degradation. For example, if the pod is part of a StatefulSet, the replacement pod cannot come up while the old pod is stuck terminating.
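For illustration, the symptom looks like this from the user's side (a minimal sketch; the pod name `test-bv-0` and the `default` namespace come from the reproduction steps below):

```bash
# Delete the pod whose Block-mode volume lost its backing device on the node.
kubectl -n default delete pod test-bv-0

# The API accepts the deletion, but the pod never leaves Terminating because
# kubelet keeps failing to unmap the block volume (see the kubelet logs below).
kubectl -n default get pod test-bv-0 -w
```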
What did you expect to happen?
Kubelet should be able to finish terminating a pod when the corresponding block device is already removed from the node.
How can we reproduce it (as minimally and precisely as possible)?
- Install a CSI storage provider that supports `volumeMode: Block`. For example, Longhorn.
- Deploy a StatefulSet that has `volumeClaimTemplates` with `volumeMode: Block`. For example:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: test-bv
  labels:
    app: test-bv
spec:
  ports:
  - port: 80
    name: test-bv
  selector:
    app: test-bv
  type: NodePort
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-bv
  namespace: default
spec:
  selector:
    matchLabels:
      app: test-bv
  serviceName: "test-bv"
  template:
    metadata:
      labels:
        app: test-bv
    spec:
      restartPolicy: Always
      terminationGracePeriodSeconds: 10
      containers:
      - name: test-bv
        image: k8s.gcr.io/nginx-slim:0.8
        command:
        - sleep
        - infinity
        ports:
        - containerPort: 80
          name: test-bv
        volumeDevices:
        - devicePath: /dev/longhorn/test-blk
          name: test-bv-volume
  volumeClaimTemplates:
  - metadata:
      name: test-bv-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "longhorn"
      volumeMode: Block
      resources:
        requests:
          storage: 1Gi
```
- Simulate a network/hardware problem in which the backing block device is removed from the host unexpectedly. In our example, we delete a Longhorn engine instance manager, which leads to the underlying block device being removed from the host.
- Now that the pod cannot access the block volume, try to delete the pod.
- The pod is stuck in the Terminating state forever. Logs from kubelet:

```
I0324 04:21:58.069151 1232 kubelet.go:1948] "SyncLoop DELETE" source="api" pods=[default/test-bv-0]
2022-03-24 04:21:58.068 [INFO][4186656] k8s.go 563: Cleaning up netns ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.068 [INFO][4186656] dataplane_linux.go 473: Calico CNI deleting device in netns /proc/4183752/ns/net ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.090 [INFO][4186656] dataplane_linux.go 490: Calico CNI deleted device in netns /proc/4183752/ns/net ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.090 [INFO][4186656] k8s.go 570: Releasing IP address(es) ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.090 [INFO][4186656] utils.go 196: Calico CNI releasing IP address ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.090 [INFO][4186656] utils.go 212: Using a dummy podCidr to release the IP ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba" podCidr="0.0.0.0/0"
2022-03-24 04:21:58.090 [INFO][4186656] utils.go 331: Calico CNI fetching podCidr from Kubernetes ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.090 [INFO][4186656] utils.go 337: Fetched podCidr ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba" podCidr="0.0.0.0/0"
2022-03-24 04:21:58.090 [INFO][4186656] utils.go 340: Calico CNI passing podCidr to host-local IPAM: 0.0.0.0/0 ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
2022-03-24 04:21:58.098 [INFO][4186656] k8s.go 576: Teardown processing complete. ContainerID="98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba"
I0324 04:21:58.763182 1232 kubelet.go:1970] "SyncLoop (PLEG): event for pod" pod="default/test-bv-0" event=&{ID:7c5b2ecb-cda5-4e72-95db-4e133ba3a563 Type:ContainerDied Data:a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b}
I0324 04:21:58.763283 1232 kubelet.go:1970] "SyncLoop (PLEG): event for pod" pod="default/test-bv-0" event=&{ID:7c5b2ecb-cda5-4e72-95db-4e133ba3a563 Type:ContainerDied Data:98992de02709e5051a3c4f503f5e7ae797143218b286f89edd3d1fb3b03156ba}
I0324 04:21:58.763325 1232 scope.go:111] "RemoveContainer" containerID="a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b"
I0324 04:21:58.786653 1232 scope.go:111] "RemoveContainer" containerID="a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b"
E0324 04:21:58.787657 1232 remote_runtime.go:334] "ContainerStatus from runtime service failed" err="rpc error: code = Unknown desc = Error: No such container: a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b" containerID="a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b"
I0324 04:21:58.787730 1232 pod_container_deletor.go:52] "DeleteContainer returned error" containerID={Type:docker ID:a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b} err="failed to get container status \"a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b\": rpc error: code = Unknown desc = Error: No such container: a698192f7a83a0abf1dfc8f8f863134d3eb8a732ea9952f82bcf35316622292b"
I0324 04:21:58.921232 1232 reconciler.go:196] "operationExecutor.UnmountVolume started for volume \"kube-api-access-8mctj\" (UniqueName: \"kubernetes.io/projected/7c5b2ecb-cda5-4e72-95db-4e133ba3a563-kube-api-access-8mctj\") pod \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\" (UID: \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\") "
I0324 04:21:58.921523 1232 reconciler.go:196] "operationExecutor.UnmountVolume started for volume \"test-bv-volume\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-417a59df-2c2b-44a1-9350-07b108bf96ec\") pod \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\" (UID: \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\") "
I0324 04:21:58.925924 1232 operation_generator.go:829] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/7c5b2ecb-cda5-4e72-95db-4e133ba3a563-kube-api-access-8mctj" (OuterVolumeSpecName: "kube-api-access-8mctj") pod "7c5b2ecb-cda5-4e72-95db-4e133ba3a563" (UID: "7c5b2ecb-cda5-4e72-95db-4e133ba3a563"). InnerVolumeSpecName "kube-api-access-8mctj". PluginName "kubernetes.io/projected", VolumeGidValue ""
W0324 04:21:58.982588 1232 volume_path_handler_linux.go:61] couldn't find loopback device which takes file descriptor lock. Skip detaching device. device path: "/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563"
I0324 04:21:58.982793 1232 mount_linux.go:197] Detected OS without systemd
E0324 04:21:58.987894 1232 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/driver.longhorn.io^pvc-417a59df-2c2b-44a1-9350-07b108bf96ec podName:7c5b2ecb-cda5-4e72-95db-4e133ba3a563 nodeName:}" failed. No retries permitted until 2022-03-24 04:21:59.487840694 +0000 UTC m=+1184606.200514039 (durationBeforeRetry 500ms). Error: "UnmapVolume.UnmapBlockVolume failed for volume \"test-bv-volume\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-417a59df-2c2b-44a1-9350-07b108bf96ec\") pod \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\" (UID: \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\") : blkUtil.DetachFileDevice failed. globalUnmapPath:/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev, podUID: 7c5b2ecb-cda5-4e72-95db-4e133ba3a563, bindMount: true: failed to unmount linkPath /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563: unmount failed: exit status 32\nUnmounting arguments: /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563\nOutput: umount: /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563: target is busy.\n"
I0324 04:21:59.038425 1232 reconciler.go:319] "Volume detached for volume \"kube-api-access-8mctj\" (UniqueName: \"kubernetes.io/projected/7c5b2ecb-cda5-4e72-95db-4e133ba3a563-kube-api-access-8mctj\") on node \"shuo-k8s-worker3\" DevicePath \"\""
I0324 04:21:59.541916 1232 reconciler.go:196] "operationExecutor.UnmountVolume started for volume \"test-bv-volume\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-417a59df-2c2b-44a1-9350-07b108bf96ec\") pod \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\" (UID: \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\") "
W0324 04:21:59.594618 1232 volume_path_handler_linux.go:61] couldn't find loopback device which takes file descriptor lock. Skip detaching device. device path: "/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563"
W0324 04:21:59.594663 1232 volume_path_handler.go:219] Warning: Unmap skipped because symlink does not exist on the path: /var/lib/kubelet/pods/7c5b2ecb-cda5-4e72-95db-4e133ba3a563/volumeDevices/kubernetes.io~csi/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec
I0324 04:21:59.594782 1232 mount_linux.go:197] Detected OS without systemd
E0324 04:21:59.599253 1232 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/csi/driver.longhorn.io^pvc-417a59df-2c2b-44a1-9350-07b108bf96ec podName:7c5b2ecb-cda5-4e72-95db-4e133ba3a563 nodeName:}" failed. No retries permitted until 2022-03-24 04:22:00.599185856 +0000 UTC m=+1184607.311859193 (durationBeforeRetry 1s). Error: "UnmapVolume.UnmapBlockVolume failed for volume \"test-bv-volume\" (UniqueName: \"kubernetes.io/csi/driver.longhorn.io^pvc-417a59df-2c2b-44a1-9350-07b108bf96ec\") pod \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\" (UID: \"7c5b2ecb-cda5-4e72-95db-4e133ba3a563\") : blkUtil.DetachFileDevice failed. globalUnmapPath:/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev, podUID: 7c5b2ecb-cda5-4e72-95db-4e133ba3a563, bindMount: true: failed to unmount linkPath /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563: unmount failed: exit status 32\nUnmounting arguments: /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563\nOutput: umount: /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563: target is busy.\n"
```
Please pay attention to this log line:

```
W0324 04:21:59.594618 1232 volume_path_handler_linux.go:61] couldn't find loopback device which takes file descriptor lock. Skip detaching device. device path: "/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/7c5b2ecb-cda5-4e72-95db-4e133ba3a563"
```

Kubelet is refusing to terminate the pod because the underlying block device has been removed from the host. If we run `losetup` on the host, we can see that the backing file of the corresponding loop device is marked as `deleted`:

```
root@shuo-k8s-worker3:~# losetup
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 1 1 /var/lib/snapd/snaps/lxd_21835.snap 0 512
/dev/loop8 0 0 0 0 / (deleted) 0 512
/dev/loop6 0 0 0 0 / (deleted) 0 512
/dev/loop4 0 0 1 1 /var/lib/snapd/snaps/snapd_14978.snap 0 512
/dev/loop2 0 0 1 1 /var/lib/snapd/snaps/core20_1376.snap 0 512
/dev/loop0 0 0 1 1 /var/lib/snapd/snaps/core20_1361.snap 0 512
/dev/loop9 0 0 0 0 /var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/3447f8b4-b54a-49df-b677-ffab62fc68be (deleted) 0 512
/dev/loop5 0 0 1 1 /var/lib/snapd/snaps/snapd_15177.snap 0 512
/dev/loop3 0 0 1 1 /var/lib/snapd/snaps/lxd_22526.snap 0 512
root@shuo-k8s-worker3:~# cat /sys/block/loop9/loop/backing_file
/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-417a59df-2c2b-44a1-9350-07b108bf96ec/dev/3447f8b4-b54a-49df-b677-ffab62fc68be (deleted)
```
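For reference, a quick way to spot such stale devices on a node is to filter the `losetup` output for deleted backing files (a small convenience sketch, assuming the util-linux `losetup`):

```bash
# List loop devices whose backing file has been deleted from disk.
losetup -l -O NAME,BACK-FILE | grep '(deleted)'

# Inspect the specific device flagged in kubelet's warning, e.g. loop9 above.
cat /sys/block/loop9/loop/backing_file
```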
In our humble opinion, Kubelet should proceed to remove the loop device corresponding to the deleted block device so that the pod can finish terminating, as detailed in this PR: https://github.com/kubernetes/kubernetes/pull/109083.
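Until such a fix is available, a manual cleanup on the affected node along the following lines may let the pod finish terminating. This is only a rough sketch of a workaround, not the change in the PR; the PVC name is taken from the logs above, and the assumption that detaching the stale loop device clears the `target is busy` error has not been verified here:

```bash
# Hypothetical manual cleanup on the node (assumption; run with care).
PVC=pvc-417a59df-2c2b-44a1-9350-07b108bf96ec
DEV_DIR=/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/$PVC/dev

# Find the loop device whose (deleted) backing file lives under the PVC's dev dir.
LOOP_DEV=$(losetup -l -O NAME,BACK-FILE | awk -v dir="$DEV_DIR" '$2 ~ dir {print $1}' | head -n1)

# Detach the stale loop device so the bind-mounted device file is no longer busy,
# then unmount the link path that kubelet's UnmapBlockVolume was failing on.
[ -n "$LOOP_DEV" ] && losetup -d "$LOOP_DEV"
for link in "$DEV_DIR"/*; do
  umount "$link"
done
```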
Anything else we need to know?
We have prepared a PR at: https://github.com/kubernetes/kubernetes/pull/109083. We have tested that PR and can confirm that it fixes the issue.
Kubernetes version
v1.22.7
Cloud provider
OS version
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 16 (2 by maintainers)
kubelet log:

```
13:31:57 wanguo-72 kubelet[1131]: E0428 13:31:57.597111 1131 nestedpendingoperations.go:348] Operation for "{volumeName:kubernetes.io/csi/topolvm.cybozu.com^a27f2f53-54fa-4068-bbad-cfbe70c0815c podName:8519283b-9521-4aed-9bfe-8f6687f99691 nodeName:}" failed. No retries permitted until 2023-04-28 13:33:59.597061528 +0800 CST m=+8838.750882463 (durationBeforeRetry 2m2s). Error: UnmapVolume.UnmapBlockVolume failed for volume "pvc-e2bc365a-a4e4-48a7-9057-af87b303f8fe" (UniqueName: "kubernetes.io/csi/topolvm.cybozu.com^a27f2f53-54fa-4068-bbad-cfbe70c0815c") pod "8519283b-9521-4aed-9bfe-8f6687f99691" (UID: "8519283b-9521-4aed-9bfe-8f6687f99691") : blkUtil.DetachFileDevice failed. globalUnmapPath:, podUID: 8519283b-9521-4aed-9bfe-8f6687f99691, bindMount: true: failed to unmap device from map path. mapPath is empty
```
After a node power-off and restart, a pod using a block-mode volume gets stuck in Terminating.