csi-driver: CSI Mount Error on Hetzner
Hi all,
I hope somebody has a hint for me, as I have already invested hours without finding a solution. In the beginning, everything with my KubeOne cluster on Hetzner worked as expected: auto-provisioning of block storage over the CSI interface worked exactly as it should. For a few days now, errors have been occurring across different namespaces at initial pod startup after provisioning via Helm.
The error message at pod startup:

```
MountVolume.MountDevice failed for volume "pvc-a0a20263-9462-4c3e-9ebb-be45b92da7f4" :
rpc error: code = Internal desc = failed to stage volume:
format of disk "/dev/disk/by-id/scsi-0HC_Volume_16943979" failed:
type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-a0a20263-9462-4c3e-9ebb-be45b92da7f4/globalmount")
options:("defaults") errcode:(exit status 1)
output:(mke2fs 1.45.7 (28-Jan-2021)
The file /dev/disk/by-id/scsi-0HC_Volume_16943979 does not exist and no size was specified.)

Unable to attach or mount volumes: unmounted volumes=[redis-pvc], unattached volumes=[redis-pvc default-token-g7mjq]: timed out waiting for the condition
```
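The mke2fs output shows that the device node for the volume never appeared on the host. A quick diagnostic sketch (not part of the original report) to confirm this on the node where the pod is scheduled:

```shell
# List the Hetzner volume device links on the affected node; the failing
# volume from the error above should show up here once it is attached.
ls -l /dev/disk/by-id/ | grep HC_Volume
```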
Parts of the deployment file:

```yaml
spec:
  volumes:
    - name: redis-pvc
      persistentVolumeClaim:
        claimName: redis-pvc-chirpstack-redis-0
    - name: default-token-g7mjq
      secret:
        secretName: default-token-g7mjq
        defaultMode: 420
  volumeMounts:
    - name: redis-pvc
      mountPath: /data
    - name: default-token-g7mjq
      readOnly: true
      mountPath: /var/run/secrets/kubernetes.io/serviceaccount
```
The PVC looks like this:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-pvc-chirpstack-redis-0
  namespace: chirpstack
  selfLink: >-
    /api/v1/namespaces/chirpstack/persistentvolumeclaims/redis-pvc-chirpstack-redis-0
  uid: 70520ae2-b99f-4d7f-a625-5e97d1748dd9
  resourceVersion: '2367972'
  creationTimestamp: '2022-02-15T17:44:04Z'
  labels:
    app: redis
    release: chirpstack
  annotations:
    pv.kubernetes.io/bind-completed: 'yes'
    pv.kubernetes.io/bound-by-controller: 'yes'
    volume.beta.kubernetes.io/storage-provisioner: csi.hetzner.cloud
    volume.kubernetes.io/selected-node: oc4-pool1-779fc8f494-whkcs
  finalizers:
    - kubernetes.io/pvc-protection
  managedFields:
    - manager: kube-scheduler
      operation: Update
      apiVersion: v1
      time: '2022-02-15T17:44:04Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:volume.kubernetes.io/selected-node: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2022-02-15T17:44:07Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            f:pv.kubernetes.io/bind-completed: {}
            f:pv.kubernetes.io/bound-by-controller: {}
            f:volume.beta.kubernetes.io/storage-provisioner: {}
          f:labels:
            .: {}
            f:app: {}
            f:release: {}
        f:spec:
          f:accessModes: {}
          f:resources:
            f:requests:
              .: {}
              f:storage: {}
          f:storageClassName: {}
          f:volumeMode: {}
          f:volumeName: {}
        f:status:
          f:accessModes: {}
          f:capacity:
            .: {}
            f:storage: {}
          f:phase: {}
status:
  phase: Bound
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500M
  volumeName: pvc-70520ae2-b99f-4d7f-a625-5e97d1748dd9
  storageClassName: hcloud-volumes
  volumeMode: Filesystem
```
The PV:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-70520ae2-b99f-4d7f-a625-5e97d1748dd9
  selfLink: /api/v1/persistentvolumes/pvc-70520ae2-b99f-4d7f-a625-5e97d1748dd9
  uid: cc6f8d63-a75f-4954-88d3-4ad5c68ffbc1
  resourceVersion: '2367987'
  creationTimestamp: '2022-02-15T17:44:07Z'
  annotations:
    pv.kubernetes.io/provisioned-by: csi.hetzner.cloud
  finalizers:
    - kubernetes.io/pv-protection
    - external-attacher/csi-hetzner-cloud
  managedFields:
    - manager: csi-provisioner
      operation: Update
      apiVersion: v1
      time: '2022-02-15T17:44:07Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:pv.kubernetes.io/provisioned-by: {}
        f:spec:
          f:accessModes: {}
          f:capacity:
            .: {}
            f:storage: {}
          f:claimRef:
            .: {}
            f:apiVersion: {}
            f:kind: {}
            f:name: {}
            f:namespace: {}
            f:resourceVersion: {}
            f:uid: {}
          f:csi:
            .: {}
            f:driver: {}
            f:fsType: {}
            f:volumeAttributes:
              .: {}
              f:storage.kubernetes.io/csiProvisionerIdentity: {}
            f:volumeHandle: {}
          f:nodeAffinity:
            .: {}
            f:required:
              .: {}
              f:nodeSelectorTerms: {}
          f:persistentVolumeReclaimPolicy: {}
          f:storageClassName: {}
          f:volumeMode: {}
    - manager: kube-controller-manager
      operation: Update
      apiVersion: v1
      time: '2022-02-15T17:44:07Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:phase: {}
    - manager: csi-attacher
      operation: Update
      apiVersion: v1
      time: '2022-02-15T17:44:08Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:finalizers:
            v:"external-attacher/csi-hetzner-cloud": {}
status:
  phase: Bound
spec:
  capacity:
    storage: 10Gi
  csi:
    driver: csi.hetzner.cloud
    volumeHandle: '16943978'
    fsType: ext4
    volumeAttributes:
      storage.kubernetes.io/csiProvisionerIdentity: 1644479434542-8081-csi.hetzner.cloud
  accessModes:
    - ReadWriteOnce
  claimRef:
    kind: PersistentVolumeClaim
    namespace: chirpstack
    name: redis-pvc-chirpstack-redis-0
    uid: 70520ae2-b99f-4d7f-a625-5e97d1748dd9
    apiVersion: v1
    resourceVersion: '2367918'
  persistentVolumeReclaimPolicy: Delete
  storageClassName: hcloud-volumes
  volumeMode: Filesystem
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: csi.hetzner.cloud/location
              operator: In
              values:
                - nbg1
```
These messages happen across different deployments with different applications/services.
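One way to see these failures across all namespaces at once (a sketch using the standard kubelet event reason `FailedMount`; `FailedAttachVolume` is the attach-side counterpart):

```shell
# Show mount failures cluster-wide; these events carry the same message
# as the pod startup error above.
kubectl get events --all-namespaces --field-selector reason=FailedMount
```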
Thanks a lot, Martin
Commits related to this issue
- fix(node): check for empty devicePath When the VolumeAttachment was created with v1.6.0 (or older) it has an empty publish context and we run into a cryptic error during mount. We should always check... — committed to hetznercloud/csi-driver by apricote 2 years ago
- fix(node): check for empty devicePath (#344) When the VolumeAttachment was created with v1.6.0 (or older) it has an empty publish context and we run into a cryptic error during mount. We should alw... — committed to hetznercloud/csi-driver by apricote 2 years ago
Maybe this will help somebody: KubeOne works as a solution, but I cannot forget this problem, because the “vitobotta/hetzner-k3s” project was my first step into the world of Kubernetes. So I investigated the problem a little bit more.
I tried node restarts, but that did not help. Then I manually removed the VolumeAttachment entry of the PVC, and this resolved the situation automatically. I suspect the problem lies in the code that checks whether a VolumeAttachment must be removed/reinitialized. I will try to debug the CSI driver a little, but that is probably beyond my knowledge at the moment.
So the manual workaround for me was:
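A minimal sketch of those steps, using the PV name from this thread (the pod name `chirpstack-redis-0` and the VolumeAttachment name are placeholders assumed here, not taken from the original):

```shell
# Find the VolumeAttachment that references the affected PV
kubectl get volumeattachments | grep pvc-70520ae2-b99f-4d7f-a625-5e97d1748dd9

# Delete the stale VolumeAttachment (placeholder name)
kubectl delete volumeattachment csi-<attachment-hash>

# Recreate the consuming pod right away so a fresh VolumeAttachment is
# created on the next attach (pod name assumed from the PVC naming scheme)
kubectl delete pod chirpstack-redis-0 -n chirpstack
```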
I spent some more time investigating the problems in this issue. I think this issue contains two different problems:
1. The original problem, where the volume was supposed to be attached according to the API, but `NodePublishVolume` fails. Unfortunately I could not find a cause for this problem (yet); it would be great if all affected users could upgrade to the latest version of the driver and then send me debug logs (see here for debug logs).
2. The second problem, where publishing fails because the `NodePublishVolume` call is missing a path to the device. As far as I could tell, this is what happened:
You upgraded from `v1.6.0` to `v2.0.0`+ (or `latest` at some point after 2022-02-15). We made a change (#264) in the way we get the Linux device path of the volume (which is used in `NodePublishVolume`). Prior to the linked PR, we retrieved the volume from the API in `NodeStageVolume` and used that path to mount the volume. In an effort to remove any API calls from the Node part, we changed this mechanism to instead pass the device path from `ControllerPublishVolume` to `NodePublishVolume` using `PublishContext`. This `PublishContext` is saved on the `VolumeAttachment` object in Kubernetes.
As we only started setting the `PublishContext` for VolumeAttachments created after `v2.0.0` (or `latest` at some point after 2022-02-15), the fields are missing for older VolumeAttachments. You can run the following command to find any lingering VolumeAttachments in your clusters that might still be affected by this:
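The command itself is not preserved above; a plausible reconstruction, assuming the driver publishes the device path under `status.attachmentMetadata.devicePath` of each VolumeAttachment:

```shell
# Print every VolumeAttachment with its PV and published device path;
# attachments created before the upgrade print <none> for DEVICEPATH.
kubectl get volumeattachments \
  -o custom-columns='NAME:.metadata.name,PV:.spec.source.persistentVolumeName,DEVICEPATH:.status.attachmentMetadata.devicePath'
```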
If you see any lines with `<none>` in the DEVICEPATH column, you need to recreate the VolumeAttachment; the workaround from @swarnat works for this.

Do you have any updates?
We are also facing this issue. It occurs any time a server has crashed after trying to add a new PVC. We then restart the server in the Hetzner Cloud Console. Once the server has started again, the new PVC is listed correctly in the Hetzner Cloud Console and in Kubernetes, but the volume does not exist in `/dev/disk/by-id/`.

The same issue came up for me today after upgrading the csi-driver from `1.6.0` to `2.1.0` according to the upgrade guide. @apricote’s command to find affected attachments was very helpful. It is important to note, though, that the corresponding pod(s) should be restarted (deleted) immediately afterwards, as PV access is lost directly after deleting the faulty VolumeAttachments, as described by @swarnat.
Thanks everyone for investigating this!
Every day an interesting project. Thanks for mentioning KubeOne. I had never heard of that tool and will check what we need to adjust to set up vanilla k8s ourselves.