csi-driver: Volume assigning step has failed due to an unknown error
Kubernetes version: 1.13.1
Ubuntu version: 18.04
One master node, two worker nodes (all CX21)
Using the provided example in the README
The volume gets created as I can see it in the Hetzner Cloud dashboard, but it isn’t attached to a server.
Name:               my-csi-app
Namespace:          default
Priority:           0
PriorityClassName:  <none>
Node:               worker2
Start Time:         Mon, 14 Jan 2019 13:39:28 +0100
Labels:             <none>
Annotations:        kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"my-csi-app","namespace":"default"},"spec":{"containers":[{"command":[...
Status:             Pending
IP:
Containers:
  my-frontend:
    Container ID:
    Image:          busybox
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Command:
      sleep
      1000000
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /data from my-csi-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9pq7v (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  my-csi-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  csi-pvc
    ReadOnly:   false
  default-token-9pq7v:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9pq7v
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason              Age                  From                     Message
  ----     ------              ----                 ----                     -------
  Warning  FailedScheduling    104s (x4 over 110s)  default-scheduler        pod has unbound immediate PersistentVolumeClaims (repeated 3 times)
  Normal   Scheduled           104s                 default-scheduler        Successfully assigned default/my-csi-app to worker2
  Warning  FailedAttachVolume  37s (x8 over 101s)   attachdetach-controller  AttachVolume.Attach failed for volume "pvc-6b5bd076-17f9-11e9-a20a-9600001463bf" : rpc error: code = Internal desc = failed to publish volume: Volume assigning step has failed due to an unknown error. (unknown_error)
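The FailedAttachVolume message above packs several pieces of information into one line: the volume name, the gRPC status code returned by the driver, the description, and a trailing code in parentheses. When comparing many such events it can help to split them apart. The following is only a minimal sketch: the regular expression is derived from this single message, and the names `ATTACH_ERROR`, `parse_attach_failure`, and `api_code` are made up for illustration. Other driver versions may format the message differently.

```python
import re

# Pattern inferred from the one FailedAttachVolume message in this report
# (an assumption, not a documented format).
ATTACH_ERROR = re.compile(
    r'AttachVolume\.Attach failed for volume "(?P<volume>[^"]+)"\s*:\s*'
    r'rpc error: code = (?P<grpc_code>\w+) desc = (?P<desc>.*?)'
    r'(?:\s*\((?P<api_code>\w+)\))?\s*$'
)


def parse_attach_failure(message):
    """Return the parsed fields of an attach-failure message, or None."""
    match = ATTACH_ERROR.search(message)
    return match.groupdict() if match else None


# The event message from the pod description above.
event = (
    'AttachVolume.Attach failed for volume '
    '"pvc-6b5bd076-17f9-11e9-a20a-9600001463bf" : '
    'rpc error: code = Internal desc = failed to publish volume: '
    'Volume assigning step has failed due to an unknown error. (unknown_error)'
)

info = parse_attach_failure(event)
print(info["volume"], info["grpc_code"], info["api_code"])
```

Here the gRPC code is `Internal` and the trailing code is `unknown_error`, which suggests the driver is passing on a failure it received rather than reporting one of its own, but the message alone does not say why the attach failed.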
About this issue
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 17 (2 by maintainers)
I had a look at your failed attach API call and it turned out it wasn’t a problem related to automount. It was another problem which was fixed by you rebooting your server.
@djboris9 I got the same answer from Hetzner. But sometimes just rebooting isn’t enough. Then the volumes have to be detached and re-attached in the web GUI.
@shyblower I can confirm this. The documentation says that you can attach up to 16 volumes to a server. However, it feels like you can only attach a volume to a server 16 times in total. The more volumes you have, and the more volumes the hcloud-csi-driver creates and destroys, the more likely you are to run into this problem.
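To tell these two readings apart (16 volumes attached at once versus 16 attach operations in total), one could tally both numbers per server. This is a rough sketch under loud assumptions: the `actions` list is a hypothetical, hand-built history of `(operation, server, volume)` tuples, not something the Hetzner API returns in this shape, and `attach_stats` is a name made up for the example.

```python
from collections import Counter

DOCUMENTED_LIMIT = 16  # "up to 16 volumes" per server, as quoted in the comment


def attach_stats(actions):
    """Return (currently_attached, total_attaches), both keyed by server.

    `actions` is a hypothetical, chronologically ordered list of
    (operation, server, volume) tuples; operation is "attach" or "detach".
    """
    attached = {}               # server -> set of volumes attached right now
    total_attaches = Counter()  # server -> number of attach operations so far
    for operation, server, volume in actions:
        volumes = attached.setdefault(server, set())
        if operation == "attach":
            volumes.add(volume)
            total_attaches[server] += 1
        elif operation == "detach":
            volumes.discard(volume)
    current = {server: len(vols) for server, vols in attached.items()}
    return current, total_attaches


# Hypothetical history: 20 volumes attached to and detached from worker2
# one after another, then one more volume attached.
history = []
for i in range(20):
    history.append(("attach", "worker2", f"pvc-{i}"))
    history.append(("detach", "worker2", f"pvc-{i}"))
history.append(("attach", "worker2", "pvc-final"))

current, total = attach_stats(history)
print(current["worker2"], total["worker2"])
```

In this made-up history the server holds a single volume, well under the documented limit, while the cumulative attach count has passed 16. If real failures lined up with the second number rather than the first, that would support the suspicion above, but nothing in this thread confirms it.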
Following helped me:
hcloud-csi-driver 1.6.0
I am getting this more and more often. It’s especially easy to trigger if you have a StatefulSet with 4+ members and podManagementPolicy = "Parallel". Is there any way to find out what’s causing this? It easily breaks deployments.
Mount failures also show up in the console:
Update: Rebooting the servers solves the issue. Support also told me to do that. It’s not a very nice solution, though.