ceph-csi: Mounting of RBD volume fails with input/output error

Describe the bug

Mounting of a volume fails with input/output error:

Events:
  Type     Reason       Age   From               Message
  ----     ------       ----  ----               -------
  Normal   Scheduled    100s  default-scheduler  Successfully assigned streaming/plex-5fbbf6694f-nrtl4 to node05
  Warning  FailedMount  4s    kubelet, node05    MountVolume.MountDevice failed for volume "pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe" : stat /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount: input/output error
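
For anyone triaging the same symptom: the error can be confirmed on the node itself by statting the staging path from the FailedMount event. A minimal check, assuming shell access to the affected node (node05 in this case):

# run on node05; the path is taken from the FailedMount event above
sudo stat /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount
# a broken rbd-nbd/fuse mount typically answers with "Input/output error",
# while a healthy mount prints the normal stat output for the directory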

Environment details

  • Ceph-CSI 1.1.0
  • Kubernetes 1.15.2

Logs

csi-attacher:

I0823 06:14:10.629412       1 trivial_handler.go:49] Trivial sync[csi-1e0a87deba8645981bc6515847386bbf1de391387f9f7626c9f99411326c94f4] started
I0823 06:14:10.629064       1 trivial_handler.go:49] Trivial sync[csi-fce5695c9222907527b14bb6a5d7c4d8c866838bf7ea740c9f5f2c2d461b1c4f] started
I0823 06:14:10.629086       1 controller.go:205] Started PV processing "pvc-06155c3d-0b8c-44a2-94aa-5b009b87308b"
I0823 06:14:10.629203       1 trivial_handler.go:49] Trivial sync[csi-2ca56e9afe3497aa222c8f4beea21313bc5c51736c6219df6cd23a5e0a294782] started
I0823 06:16:13.852041       1 reflector.go:370] k8s.io/client-go/informers/factory.go:133: Watch close - *v1beta1.VolumeAttachment total 0 items received
I0823 06:19:33.865865       1 reflector.go:370] k8s.io/client-go/informers/factory.go:133: Watch close - *v1.PersistentVolume total 0 items received

csi-provisioner:

I0823 06:18:27.760597       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:18:32.775979       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:18:37.789708       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:18:42.803050       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:18:47.819521       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:18:52.838370       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:18:57.851769       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:02.870969       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:07.888328       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:12.904024       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:17.920981       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:21.333127       1 reflector.go:370] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:803: Watch close - *v1.PersistentVolume total 0 items received
I0823 06:19:22.941350       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:27.955348       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:32.976616       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:37.993064       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:43.010305       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:48.024349       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:53.039748       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:19:58.057264       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:03.070592       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:08.093751       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:13.108916       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:14.334928       1 reflector.go:370] sigs.k8s.io/sig-storage-lib-external-provisioner/controller/controller.go:800: Watch close - *v1.PersistentVolumeClaim total 0 items received
I0823 06:20:18.123697       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:23.137800       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:28.155980       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:33.169902       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:38.186149       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:42.425644       1 controller.go:1308] delete "pvc-d2cd7ce3-f9d7-4828-a85b-b65f00f64c10": started
I0823 06:20:42.440631       1 connection.go:180] GRPC call: /csi.v1.Controller/DeleteVolume
I0823 06:20:42.440684       1 connection.go:181] GRPC request: {"secrets":"***stripped***","volume_id":"0001-0024-9d9cc8f6-0843-46c3-8ca3-0309dff9978d-0000000000000003-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd"}
I0823 06:20:43.199552       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:43.405600       1 connection.go:183] GRPC response: {}
I0823 06:20:43.406349       1 connection.go:184] GRPC error: rpc error: code = Internal desc = rbd csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd is still being used
E0823 06:20:43.406388       1 controller.go:1331] delete "pvc-d2cd7ce3-f9d7-4828-a85b-b65f00f64c10": volume deletion failed: rpc error: code = Internal desc = rbd csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd is still being used
W0823 06:20:43.406473       1 controller.go:934] Retrying syncing volume "pvc-d2cd7ce3-f9d7-4828-a85b-b65f00f64c10", failure 145
E0823 06:20:43.406536       1 controller.go:952] error syncing volume "pvc-d2cd7ce3-f9d7-4828-a85b-b65f00f64c10": rpc error: code = Internal desc = rbd csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd is still being used
I0823 06:20:43.406726       1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolume", Namespace:"", Name:"pvc-d2cd7ce3-f9d7-4828-a85b-b65f00f64c10", UID:"b80891b4-d582-4f23-af20-447ec6342947", APIVersion:"v1", ResourceVersion:"2971666", FieldPath:""}): type: 'Warning' reason: 'VolumeFailedDelete' rpc error: code = Internal desc = rbd csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd is still being used
I0823 06:20:48.212936       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:53.226800       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:20:58.240116       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:03.253608       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:08.265918       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:13.280058       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:18.293271       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:23.307719       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:28.321203       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com
I0823 06:21:33.431265       1 leaderelection.go:258] successfully renewed lease storing/rbd-csi-ceph-com

csi-rbdplugin:

I0823 06:20:42.442046       1 utils.go:105] GRPC call: /csi.v1.Controller/DeleteVolume
I0823 06:20:42.442117       1 utils.go:106] GRPC request: {"secrets":"***stripped***","volume_id":"0001-0024-9d9cc8f6-0843-46c3-8ca3-0309dff9978d-0000000000000003-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd"}
I0823 06:20:43.290203       1 controllerserver.go:337] deleting image csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd
I0823 06:20:43.290269       1 rbd_util.go:140] rbd: status csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd using mon 192.168.2.100:6789, pool kubernetes
I0823 06:20:43.404993       1 rbd_util.go:159] rbd: watchers on csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd: 2019-08-23 06:20:43.351 7f179691bb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.kubernetes.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-08-23 06:20:43.355 7f179691bb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.kubernetes.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-08-23 06:20:43.355 7f179691bb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.kubernetes.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-08-23 06:20:43.363 7f179691bb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.kubernetes.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-08-23 06:20:43.363 7f179691bb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.kubernetes.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
2019-08-23 06:20:43.363 7f179691bb00 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.kubernetes.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
Watchers:
        watcher=192.168.2.155:0/2851647683 client.3215876 cookie=139784542502688
I0823 06:20:43.405066       1 rbd_util.go:176] rbd is still being used csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd
E0823 06:20:43.405121       1 controllerserver.go:339] failed to delete rbd image: kubernetes/csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd with error: rbd csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd is still being used
E0823 06:20:43.405149       1 utils.go:109] GRPC error: rpc error: code = Internal desc = rbd csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd is still being used

kubelet logs

Aug 23 06:42:02 node05 kubelet[924]: E0823 06:42:02.531869     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:04 node05 kubelet[924]: E0823 06:42:04.525745     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:06 node05 kubelet[924]: E0823 06:42:06.534544     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:08 node05 kubelet[924]: E0823 06:42:08.524713     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:10 node05 kubelet[924]: E0823 06:42:10.526565     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:12 node05 kubelet[924]: E0823 06:42:12.530299     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:14 node05 kubelet[924]: E0823 06:42:14.528843     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:16 node05 kubelet[924]: E0823 06:42:16.532023     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:18 node05 kubelet[924]: E0823 06:42:18.530003     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:20 node05 kubelet[924]: E0823 06:42:20.530837     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.260825     924 reconciler.go:177] operationExecutor.UnmountVolume started for volume "plex-token-fh9kp" (UniqueName: "kubernetes.io/secret/11dad472-e834-4114-be5f-91923ba01370-plex-token-fh9kp") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370")
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.260954     924 reconciler.go:177] operationExecutor.UnmountVolume started for volume "transcode-volume" (UniqueName: "kubernetes.io/empty-dir/11dad472-e834-4114-be5f-91923ba01370-transcode-volume") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370")
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.261032     924 reconciler.go:177] operationExecutor.UnmountVolume started for volume "movies-nfs-storage" (UniqueName: "kubernetes.io/nfs/11dad472-e834-4114-be5f-91923ba01370-plex-movies-nfs-storage") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370")
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.261113     924 reconciler.go:177] operationExecutor.UnmountVolume started for volume "tvshows-nfs-storage" (UniqueName: "kubernetes.io/nfs/11dad472-e834-4114-be5f-91923ba01370-plex-tvshows-nfs-storage") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370")
Aug 23 06:42:22 node05 kubelet[924]: W0823 06:42:22.261240     924 empty_dir.go:421] Warning: Failed to clear quota on /var/lib/kubelet/pods/11dad472-e834-4114-be5f-91923ba01370/volumes/kubernetes.io~empty-dir/transcode-volume: ClearQuota called, but quotas disabled
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.261408     924 operation_generator.go:860] UnmountVolume.TearDown succeeded for volume "kubernetes.io/empty-dir/11dad472-e834-4114-be5f-91923ba01370-transcode-volume" (OuterVolumeSpecName: "transcode-volume") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370"). InnerVolumeSpecName "transcode-volume". PluginName "kubernetes.io/empty-dir", VolumeGidValue ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.289135     924 operation_generator.go:860] UnmountVolume.TearDown succeeded for volume "kubernetes.io/secret/11dad472-e834-4114-be5f-91923ba01370-plex-token-fh9kp" (OuterVolumeSpecName: "plex-token-fh9kp") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370"). InnerVolumeSpecName "plex-token-fh9kp". PluginName "kubernetes.io/secret", VolumeGidValue ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.289918     924 operation_generator.go:860] UnmountVolume.TearDown succeeded for volume "kubernetes.io/nfs/11dad472-e834-4114-be5f-91923ba01370-plex-tvshows-nfs-storage" (OuterVolumeSpecName: "tvshows-nfs-storage") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370"). InnerVolumeSpecName "plex-tvshows-nfs-storage". PluginName "kubernetes.io/nfs", VolumeGidValue ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.329993     924 operation_generator.go:860] UnmountVolume.TearDown succeeded for volume "kubernetes.io/nfs/11dad472-e834-4114-be5f-91923ba01370-plex-movies-nfs-storage" (OuterVolumeSpecName: "movies-nfs-storage") pod "11dad472-e834-4114-be5f-91923ba01370" (UID: "11dad472-e834-4114-be5f-91923ba01370"). InnerVolumeSpecName "plex-movies-nfs-storage". PluginName "kubernetes.io/nfs", VolumeGidValue ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.375043     924 reconciler.go:203] operationExecutor.VerifyControllerAttachedVolume started for volume "transcode-volume" (UniqueName: "kubernetes.io/empty-dir/9b567d3f-3658-49dc-914f-bc49330885d5-transcode-volume") pod "plex-5fbbf6694f-4zxvb" (UID: "9b567d3f-3658-49dc-914f-bc49330885d5")
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.375167     924 reconciler.go:297] Volume detached for volume "plex-tvshows-nfs-storage" (UniqueName: "kubernetes.io/nfs/11dad472-e834-4114-be5f-91923ba01370-plex-tvshows-nfs-storage") on node "node05" DevicePath ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.375325     924 reconciler.go:297] Volume detached for volume "plex-token-fh9kp" (UniqueName: "kubernetes.io/secret/11dad472-e834-4114-be5f-91923ba01370-plex-token-fh9kp") on node "node05" DevicePath ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.375383     924 reconciler.go:297] Volume detached for volume "transcode-volume" (UniqueName: "kubernetes.io/empty-dir/11dad472-e834-4114-be5f-91923ba01370-transcode-volume") on node "node05" DevicePath ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.375421     924 reconciler.go:297] Volume detached for volume "plex-movies-nfs-storage" (UniqueName: "kubernetes.io/nfs/11dad472-e834-4114-be5f-91923ba01370-plex-movies-nfs-storage") on node "node05" DevicePath ""
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.476205     924 reconciler.go:203] operationExecutor.VerifyControllerAttachedVolume started for volume "plex-movies-nfs-storage" (UniqueName: "kubernetes.io/nfs/9b567d3f-3658-49dc-914f-bc49330885d5-plex-movies-nfs-storage") pod "plex-5fbbf6694f-4zxvb" (UID: "9b567d3f-3658-49dc-914f-bc49330885d5")
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.476332     924 reconciler.go:203] operationExecutor.VerifyControllerAttachedVolume started for volume "plex-tvshows-nfs-storage" (UniqueName: "kubernetes.io/nfs/9b567d3f-3658-49dc-914f-bc49330885d5-plex-tvshows-nfs-storage") pod "plex-5fbbf6694f-4zxvb" (UID: "9b567d3f-3658-49dc-914f-bc49330885d5")
Aug 23 06:42:22 node05 kubelet[924]: I0823 06:42:22.476425     924 reconciler.go:203] operationExecutor.VerifyControllerAttachedVolume started for volume "plex-token-fh9kp" (UniqueName: "kubernetes.io/secret/9b567d3f-3658-49dc-914f-bc49330885d5-plex-token-fh9kp") pod "plex-5fbbf6694f-4zxvb" (UID: "9b567d3f-3658-49dc-914f-bc49330885d5")
Aug 23 06:42:22 node05 kubelet[924]: E0823 06:42:22.530987     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:24 node05 kubelet[924]: E0823 06:42:24.535019     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:26 node05 kubelet[924]: E0823 06:42:26.532067     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:27 node05 kubelet[924]: I0823 06:42:27.005602     924 operation_generator.go:629] MountVolume.WaitForAttach entering for volume "pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe" (UniqueName: "kubernetes.io/csi/rbd.csi.ceph.com^0001-0024-9d9cc8f6-0843-46c3-8ca3-0309dff9978d-0000000000000003-28a66c73-bab2-11e9-903a-122b397f0010") pod "plex-5fbbf6694f-nrtl4" (UID: "11dad472-e834-4114-be5f-91923ba01370") DevicePath "csi-fce5695c9222907527b14bb6a5d7c4d8c866838bf7ea740c9f5f2c2d461b1c4f"
Aug 23 06:42:27 node05 kubelet[924]: I0823 06:42:27.011087     924 operation_generator.go:638] MountVolume.WaitForAttach succeeded for volume "pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe" (UniqueName: "kubernetes.io/csi/rbd.csi.ceph.com^0001-0024-9d9cc8f6-0843-46c3-8ca3-0309dff9978d-0000000000000003-28a66c73-bab2-11e9-903a-122b397f0010") pod "plex-5fbbf6694f-nrtl4" (UID: "11dad472-e834-4114-be5f-91923ba01370") DevicePath "csi-fce5695c9222907527b14bb6a5d7c4d8c866838bf7ea740c9f5f2c2d461b1c4f"
Aug 23 06:42:27 node05 kubelet[924]: E0823 06:42:27.011416     924 csi_mounter.go:422] kubernetes.io/csi: isDirMounted IsLikelyNotMountPoint test failed for dir [/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount]
Aug 23 06:42:27 node05 kubelet[924]: E0823 06:42:27.011480     924 csi_attacher.go:296] kubernetes.io/csi: attacher.MountDevice failed while checking mount status for dir [/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount]
Aug 23 06:42:27 node05 kubelet[924]: E0823 06:42:27.011617     924 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/csi/rbd.csi.ceph.com^0001-0024-9d9cc8f6-0843-46c3-8ca3-0309dff9978d-0000000000000003-28a66c73-bab2-11e9-903a-122b397f0010\"" failed. No retries permitted until 2019-08-23 06:44:29.011570614 +0000 UTC m=+1371867.633383435 (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume \"pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe\" (UniqueName: \"kubernetes.io/csi/rbd.csi.ceph.com^0001-0024-9d9cc8f6-0843-46c3-8ca3-0309dff9978d-0000000000000003-28a66c73-bab2-11e9-903a-122b397f0010\") pod \"plex-5fbbf6694f-nrtl4\" (UID: \"11dad472-e834-4114-be5f-91923ba01370\") : stat /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount: input/output error"
Aug 23 06:42:28 node05 kubelet[924]: E0823 06:42:28.534004     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:29 node05 kubelet[924]: I0823 06:42:29.115987     924 operation_generator.go:193] parsed scheme: ""
Aug 23 06:42:29 node05 kubelet[924]: I0823 06:42:29.116065     924 operation_generator.go:193] scheme "" not registered, fallback to default scheme
Aug 23 06:42:29 node05 kubelet[924]: I0823 06:42:29.116163     924 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock 0  <nil>}]
Aug 23 06:42:29 node05 kubelet[924]: I0823 06:42:29.116204     924 clientconn.go:796] ClientConn switching balancer to "pick_first"
Aug 23 06:42:29 node05 kubelet[924]: I0823 06:42:29.116342     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0011b2210, CONNECTING
Aug 23 06:42:29 node05 kubelet[924]: I0823 06:42:29.116736     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc0011b2210, READY
Aug 23 06:42:29 node05 kubelet[924]: E0823 06:42:29.119227     924 goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock" failed. No retries permitted until 2019-08-23 06:44:31.119170563 +0000 UTC m=+1371869.740985340 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket /var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock, err: rpc error: code = Unimplemented desc = unknown service pluginregistration.Registration"
Aug 23 06:42:30 node05 kubelet[924]: E0823 06:42:30.530197     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:32 node05 kubelet[924]: E0823 06:42:32.533030     924 kubelet_volumes.go:154] Orphaned pod "953ad598-9230-42a6-a571-943ec9f24dcb" found, but volume paths are still present on disk : There were a total of 6 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:34 node05 kubelet[924]: E0823 06:42:34.525588     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:35 node05 kubelet[924]: I0823 06:42:35.118675     924 operation_generator.go:193] parsed scheme: ""
Aug 23 06:42:35 node05 kubelet[924]: I0823 06:42:35.118738     924 operation_generator.go:193] scheme "" not registered, fallback to default scheme
Aug 23 06:42:35 node05 kubelet[924]: I0823 06:42:35.118816     924 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock 0  <nil>}]
Aug 23 06:42:35 node05 kubelet[924]: I0823 06:42:35.118847     924 clientconn.go:796] ClientConn switching balancer to "pick_first"
Aug 23 06:42:35 node05 kubelet[924]: I0823 06:42:35.118949     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, CONNECTING
Aug 23 06:42:35 node05 kubelet[924]: W0823 06:42:35.119143     924 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock: connect: connection refused". Reconnecting...
Aug 23 06:42:35 node05 kubelet[924]: I0823 06:42:35.119215     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, TRANSIENT_FAILURE
Aug 23 06:42:36 node05 kubelet[924]: I0823 06:42:36.119490     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, CONNECTING
Aug 23 06:42:36 node05 kubelet[924]: W0823 06:42:36.119537     924 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock: connect: connection refused". Reconnecting...
Aug 23 06:42:36 node05 kubelet[924]: I0823 06:42:36.119600     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, TRANSIENT_FAILURE
Aug 23 06:42:36 node05 kubelet[924]: E0823 06:42:36.531686     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:37 node05 kubelet[924]: I0823 06:42:37.703373     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, CONNECTING
Aug 23 06:42:37 node05 kubelet[924]: W0823 06:42:37.703443     924 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock: connect: connection refused". Reconnecting...
Aug 23 06:42:37 node05 kubelet[924]: I0823 06:42:37.703540     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, TRANSIENT_FAILURE
Aug 23 06:42:38 node05 kubelet[924]: E0823 06:42:38.531042     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:39 node05 kubelet[924]: I0823 06:42:39.788900     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, CONNECTING
Aug 23 06:42:39 node05 kubelet[924]: W0823 06:42:39.788983     924 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock: connect: connection refused". Reconnecting...
Aug 23 06:42:39 node05 kubelet[924]: I0823 06:42:39.789083     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, TRANSIENT_FAILURE
Aug 23 06:42:40 node05 kubelet[924]: E0823 06:42:40.525448     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:42 node05 kubelet[924]: E0823 06:42:42.529145     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:43 node05 kubelet[924]: W0823 06:42:43.825384     924 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock: connect: connection refused". Reconnecting...
Aug 23 06:42:43 node05 kubelet[924]: I0823 06:42:43.825419     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, CONNECTING
Aug 23 06:42:43 node05 kubelet[924]: I0823 06:42:43.825598     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00093b150, TRANSIENT_FAILURE
Aug 23 06:42:44 node05 kubelet[924]: E0823 06:42:44.531607     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:45 node05 kubelet[924]: E0823 06:42:45.119000     924 goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock" failed. No retries permitted until 2019-08-23 06:44:47.118949292 +0000 UTC m=+1371885.740762755 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- dial failed at socket /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock, err: failed to dial socket /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock, err: context deadline exceeded"
Aug 23 06:42:45 node05 kubelet[924]: W0823 06:42:45.119080     924 asm_amd64.s:1337] Failed to dial /var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock: context canceled; please retry.
Aug 23 06:42:46 node05 kubelet[924]: E0823 06:42:46.526835     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.880359     924 clientconn.go:440] parsed scheme: ""
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.880543     924 clientconn.go:440] scheme "" not registered, fallback to default scheme
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.880721     924 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock 0  <nil>}]
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.880778     924 clientconn.go:796] ClientConn switching balancer to "pick_first"
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.880956     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00107ed90, CONNECTING
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.881022     924 clientconn.go:1016] blockingPicker: the picked transport is not ready, loop back to repick
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.881452     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00107ed90, READY
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.889419     924 clientconn.go:440] parsed scheme: ""
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.889521     924 clientconn.go:440] scheme "" not registered, fallback to default scheme
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.889620     924 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock 0  <nil>}]
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.889660     924 clientconn.go:796] ClientConn switching balancer to "pick_first"
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.889794     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00107f0a0, CONNECTING
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.889831     924 clientconn.go:1016] blockingPicker: the picked transport is not ready, loop back to repick
Aug 23 06:42:47 node05 kubelet[924]: I0823 06:42:47.890232     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00107f0a0, READY
Aug 23 06:42:48 node05 kubelet[924]: E0823 06:42:48.529132     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:50 node05 kubelet[924]: E0823 06:42:50.528107     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:52 node05 kubelet[924]: E0823 06:42:52.531756     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:54 node05 kubelet[924]: E0823 06:42:54.529109     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:56 node05 kubelet[924]: E0823 06:42:56.528255     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:42:58 node05 kubelet[924]: E0823 06:42:58.534116     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:00 node05 kubelet[924]: E0823 06:43:00.525000     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:02 node05 kubelet[924]: E0823 06:43:02.514816     924 kubelet.go:1665] Unable to mount volumes for pod "plex-5fbbf6694f-nrtl4_streaming(11dad472-e834-4114-be5f-91923ba01370)": timeout expired waiting for volumes to attach or mount for pod "streaming"/"plex-5fbbf6694f-nrtl4". list of unmounted volumes=[plex-config-persistent-storage transcode-volume tvshows-nfs-storage movies-nfs-storage plex-token-fh9kp]. list of unattached volumes=[plex-config-persistent-storage transcode-volume tvshows-nfs-storage movies-nfs-storage plex-token-fh9kp]; skipping pod
Aug 23 06:43:02 node05 kubelet[924]: E0823 06:43:02.515003     924 pod_workers.go:190] Error syncing pod 11dad472-e834-4114-be5f-91923ba01370 ("plex-5fbbf6694f-nrtl4_streaming(11dad472-e834-4114-be5f-91923ba01370)"), skipping: timeout expired waiting for volumes to attach or mount for pod "streaming"/"plex-5fbbf6694f-nrtl4". list of unmounted volumes=[plex-config-persistent-storage transcode-volume tvshows-nfs-storage movies-nfs-storage plex-token-fh9kp]. list of unattached volumes=[plex-config-persistent-storage transcode-volume tvshows-nfs-storage movies-nfs-storage plex-token-fh9kp]
Aug 23 06:43:02 node05 kubelet[924]: E0823 06:43:02.531155     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.524292     924 clientconn.go:440] parsed scheme: ""
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.524951     924 clientconn.go:440] scheme "" not registered, fallback to default scheme
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.525023     924 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock 0  <nil>}]
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.525053     924 clientconn.go:796] ClientConn switching balancer to "pick_first"
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.525110     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00100e250, CONNECTING
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.525360     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00100e250, READY
Aug 23 06:43:04 node05 kubelet[924]: E0823 06:43:04.529875     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530321     924 clientconn.go:440] parsed scheme: ""
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530364     924 clientconn.go:440] scheme "" not registered, fallback to default scheme
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530419     924 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock 0  <nil>}]
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530445     924 clientconn.go:796] ClientConn switching balancer to "pick_first"
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530529     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00100e5f0, CONNECTING
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530562     924 clientconn.go:1016] blockingPicker: the picked transport is not ready, loop back to repick
Aug 23 06:43:04 node05 kubelet[924]: I0823 06:43:04.530748     924 balancer_conn_wrappers.go:131] pickfirstBalancer: HandleSubConnStateChange: 0xc00100e5f0, READY
Aug 23 06:43:06 node05 kubelet[924]: E0823 06:43:06.531037     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:08 node05 kubelet[924]: E0823 06:43:08.525101     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:10 node05 kubelet[924]: E0823 06:43:10.528091     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.
Aug 23 06:43:12 node05 kubelet[924]: E0823 06:43:12.529265     924 kubelet_volumes.go:154] Orphaned pod "11dad472-e834-4114-be5f-91923ba01370" found, but volume paths are still present on disk : There were a total of 7 errors similar to this. Turn up verbosity to see them.

Steps to reproduce

Unfortunately I do not know how to reproduce this. None of my other volumes show this behavior, so I hope the logs reveal something 😃 The volume is still in this state, however: when I delete the pod to get it rescheduled, mounting still fails, so I can reproduce it in my cluster over and over again.

I think there is an issue here:

Aug 23 06:42:27 node05 kubelet[924]: E0823 06:42:27.011416     924 csi_mounter.go:422] kubernetes.io/csi: isDirMounted IsLikelyNotMountPoint test failed for dir [/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount]
Aug 23 06:42:27 node05 kubelet[924]: E0823 06:42:27.011480     924 csi_attacher.go:296] kubernetes.io/csi: attacher.MountDevice failed while checking mount status for dir [/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe/globalmount]

The volume is then never released and the kubelet starts complaining about orphaned volumes. When the pod is rescheduled on the same node, it fails to mount, presumably because the previous volume is still attached (?).
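
One way to check that hypothesis is to look at the watchers on the image from the Ceph side, using the pool and image name that appear in the csi-rbdplugin log above (this assumes a client keyring with access to that pool):

# pool and image name taken from the csi-rbdplugin log above
rbd status kubernetes/csi-vol-edc5a3ec-c26b-11e9-9749-1ae57eb3c3fd
# a "Watchers:" entry with the node's IP suggests the image is still mapped on that node;
# on the node itself, rbd-nbd list-mapped (or rbd device list) shows the local mappings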

Actual results

The pod is stuck in ContainerCreating indefinitely because the volume cannot be mounted.

Expected behavior

The pod starts up with the volume mounted correctly.

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 18 (10 by maintainers)

Most upvoted comments

@Madhu-1 Thanks for the pointer! This issue is now "resolved" by using the kernel mounter; it has been running stable for more than a month now.
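
For context, a sketch of what "using the kernel mounter" means here: in the ceph-csi RBD StorageClass the mounter parameter selects rbd-nbd, and leaving it out falls back to the kernel rbd module. The values below are placeholders rather than this cluster's actual configuration, and required parameters such as the secrets are omitted:

# hypothetical StorageClass sketch; only the mounter parameter is the point here
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-krbd            # placeholder name
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: <cluster-id>       # placeholder
  pool: kubernetes
  # mounter: rbd-nbd            # leave this out (or remove it) to use the kernel rbd module
reclaimPolicy: Delete
EOF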

Should I leave this open, since there definitely is an issue with rbd-nbd?

@wilmardo the issue with fuse or nbd here is a known one and we already have trackers for it, so I would like to close this issue.

@vvavepacket I would suggest opening a new issue if you are facing any problems.

Closing this for now.

Found a workaround for anyone with the same problem (a condensed shell version follows the list):

  • Log in to the node where the pod is scheduled
  • Take note of the PVC path from the describe output (e.g. pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe)
  • Find out which block device is mounted on that path with lsblk
  • Unmount that block device: sudo umount /dev/nbd2
  • Unmount the block device once more: sudo umount /dev/nbd2
  • Check with lsblk that the device has no mountpoint anymore
  • Restart the kubelet with systemctl restart kubelet (otherwise the missing mount does not seem to be noticed)
  • Force a recreate of the pod: kubectl delete pod <podname>
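
A condensed shell version of the steps above (the device name /dev/nbd2 and the PVC name are the examples from this issue; substitute your own):

# run on the node where the pod is scheduled
PVC=pvc-9174936d-f9d0-493b-87f2-49f1fb5835fe   # from kubectl describe
lsblk -o NAME,MOUNTPOINT | grep "$PVC"         # find the nbd device mounted under the PVC's paths
sudo umount /dev/nbd2                          # the device found above
sudo umount /dev/nbd2                          # once more, as in the list above
lsblk -o NAME,MOUNTPOINT | grep "$PVC"         # should print nothing now
sudo systemctl restart kubelet                 # otherwise the missing mount does not seem to be noticed
kubectl delete pod <podname>                   # force the pod to be recreated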

This will fix the issue for some time. I will hopefully migrate to ext4 soon and update this issue to report whether that resolves it 😃

By the way, in the latest canary image we have fixed some issues related to this mounting failure.

I will try that later today; right now I am working on the Helm chart 😃