kubernetes: CSI volume reconstruction does not work for ephemeral volumes
When a pod is marked as deleted while kubelet is down or being restarted, the newly started kubelet does not clean up the pod's CSI filesystem volumes.
The newly started kubelet tries to reconstruct the volume using CSI's ConstructVolumeSpec function. This part appears to work: the CSI volume plugin loads its JSON file.
But then VolumeManager checks whether the volume is still mounted in the /var/lib/kubelet/pods/9440e7e5-d454-4555-84b7-d72e43ec4b3a/volumes/kubernetes.io~csi/pvc-45640a32-4ba3-4a7d-ad4b-087281f1460d/mount directory.
There are two issues:
- CSI does not require volumes to be presented as mounts. They can be just directories with files in them. This will be the case for most in-line volumes.
- Even if the CSI driver used mount, kubelet mounts it into /var/lib/kubelet/pods/9440e7e5-d454-4555-84b7-d72e43ec4b3a/volumes/kubernetes.io~csi/pvc-45640a32-4ba3-4a7d-ad4b-087281f1460d/mount. Checking of /var/lib/kubelet/pods/9440e7e5-d454-4555-84b7-d72e43ec4b3a/volumes/kubernetes.io~csi/pvc-45640a32-4ba3-4a7d-ad4b-087281f1460d does not make sense. Kubelet checks the right directory given by GetPath().
About this issue
- State: closed
- Created 5 years ago
- Comments: 29 (23 by maintainers)
Commits related to this issue
- Disable 'distruptive' volume test without force option until #79980 is fixed — committed to mkimuram/kubernetes by mkimuram 5 years ago
- e2e: restore volume lifecycle checks for hostpath driver These tests were previously disabled to work around #61446 and #79980 https://github.com/kubernetes/kubernetes/commit/f1e1f3a416b70bafadf96151... — committed to dobsonj/kubernetes by dobsonj 2 years ago
- e2e: restore volume lifecycle checks for csi-hostpath driver These tests were previously disabled to work around #79980 https://github.com/kubernetes/kubernetes/commit/f1e1f3a416b70bafadf961518c330ce... — committed to dobsonj/kubernetes by dobsonj 2 years ago
- e2e: restore volume lifecycle checks for csi-hostpath driver These tests were previously disabled to work around #79980 https://github.com/kubernetes/kubernetes/commit/f1e1f3a416b70bafadf961518c330ce... — committed to muyangren2/kubernetes by dobsonj 2 years ago
There is ongoing work that changes IsNotMountPoint to utilize the openat2(2) syscall to detect mount points by using MountedFast. With openat2(2), bind mounts will be detected properly and quickly, but it requires kernel v5.6 or later. So we would also be able to utilize openat2(2) in IsLikelyNotMountPoint. Then the bind mount issue can be resolved at least for kernel v5.6+, and we will be able to focus on how to solve this issue for old kernels.

I added a couple of extra debug lines to the actual_state_of_the_world.go file, and can see that when DeletePodFromVolume tries to see if the volume exists, it doesn't find it in the asw.attachedVolumes struct, like so:
Probably we should search for the name of the attached volume differently for ephemeral volumes?
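
The lookup miss described above can be illustrated with a toy model in Go. This is only a sketch under assumptions: actualState, addPodToVolume and deletePodFromVolume are simplified stand-ins rather than kubelet's real types or signatures, and both volume names are made up. It assumes the volume was recorded under one name and is later looked up under another, which the thread suggests but does not confirm.

```go
package main

import "fmt"

// actualState is a toy stand-in for the actual state of the world:
// a map from volume name to the set of pod UIDs using that volume.
type actualState struct {
	attachedVolumes map[string]map[string]bool
}

func newActualState() *actualState {
	return &actualState{attachedVolumes: map[string]map[string]bool{}}
}

// addPodToVolume records that podUID uses the volume stored under volumeName.
func (a *actualState) addPodToVolume(volumeName, podUID string) {
	if a.attachedVolumes[volumeName] == nil {
		a.attachedVolumes[volumeName] = map[string]bool{}
	}
	a.attachedVolumes[volumeName][podUID] = true
}

// deletePodFromVolume removes podUID from the volume, or returns an error
// when no volume is stored under volumeName.
func (a *actualState) deletePodFromVolume(volumeName, podUID string) error {
	pods, ok := a.attachedVolumes[volumeName]
	if !ok {
		return fmt.Errorf("volume %q not found in attachedVolumes", volumeName)
	}
	delete(pods, podUID)
	return nil
}

func main() {
	asw := newActualState()

	// Hypothetical names: one as recorded during reconstruction,
	// one as computed later when the pod is cleaned up.
	recordedName := "name-recorded-at-reconstruction"
	lookupName := "name-computed-at-cleanup"

	asw.addPodToVolume(recordedName, "pod-1")

	if err := asw.deletePodFromVolume(lookupName, "pod-1"); err != nil {
		fmt.Println("cleanup skipped:", err)
	}
	if err := asw.deletePodFromVolume(recordedName, "pod-1"); err == nil {
		fmt.Println("cleanup succeeded with the recorded name")
	}
}
```

If the names really do diverge for ephemeral volumes, either the key used at reconstruction or the key used at lookup would have to change so that both sides agree.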