longhorn: [BUG] GET error for volume attachment on node reboot
Describe the bug
After a reboot of a node in a 4-node cluster, a user is seeing the following:
Warning FailedMount 48s (x3 over 4m52s) kubelet MountVolume.WaitForAttach failed for volume "pvc-7d2e2124-4b0c-4d79-890a-fcee02a185a1" : volume pvc-7d2e2124-4b0c-4d79-890a-fcee02a185a1 has GET error for volume attachment csi-b21170ee9729a55ec3e64e6bd4ed0a11ac70ac2272e0e3b7bb3f6fdeac262172: volumeattachments.storage.k8s.io "csi-b21170ee9729a55ec3e64e6bd4ed0a11ac70ac2272e0e3b7bb3f6fdeac262172" not found
To recover, the user had to create the VolumeAttachment object manually for the Pod to mount its storage again.
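For reference, a minimal sketch of that manual recovery, assuming the Longhorn CSI driver name `driver.longhorn.io` and reusing the PV and VolumeAttachment names from the event above; the node name is a placeholder, and the VolumeAttachment name must match exactly the `csi-<hash>` name kubelet reports:

```shell
# Confirm the VolumeAttachment kubelet is waiting for is really gone.
kubectl get volumeattachment csi-b21170ee9729a55ec3e64e6bd4ed0a11ac70ac2272e0e3b7bb3f6fdeac262172

# Recreate it so MountVolume.WaitForAttach can proceed.
# NOTE: attacher and nodeName below are assumptions/placeholders; the object name
# must match the csi-<hash> reported in the FailedMount event.
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-b21170ee9729a55ec3e64e6bd4ed0a11ac70ac2272e0e3b7bb3f6fdeac262172
spec:
  attacher: driver.longhorn.io
  nodeName: <node-running-the-pod>
  source:
    persistentVolumeName: pvc-7d2e2124-4b0c-4d79-890a-fcee02a185a1
EOF
```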
To Reproduce
I have not been able to reproduce this yet, unfortunately.
Expected behavior
A pod can successfully mount its storage despite a node reboot in the cluster
Log or Support bundle
longhorn-support-bundle_a8118729-480f-4d38-9b91-26a755d2e0cc_2022-06-28T20-34-47Z.zip
Environment
- Longhorn version: 1.1.2
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): kubectl(kURL addon - https://kurl.sh/docs/add-ons/longhorn)
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: https://kurl.sh/docs/install-with-kurl/
- Number of management node in the cluster: 3
- Number of worker node in the cluster: 1
- Node config
- OS type and version: Red Hat Enterprise Linux Server 7.9 (Maipo)
- CPU per node: 8
- Memory per node: 64GB
- Disk type(e.g. SSD/NVMe): (Unsure, but can gather this info if needed)
- Network bandwidth between the nodes: (Unsure, but can gather this info if needed)
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): (Unsure, but can gather this info if needed)
- Number of Longhorn volumes in the cluster: 5
About this issue
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 22 (12 by maintainers)
@diamonwiggins @hedefalk @simonreddy2001 @Orhideous @rajivml I tried to reproduce the issue using Longhorn v1.3.2 and a StatefulSet with 2 replicas on a 2-node cluster. I rebooted the two nodes repeatedly, but still cannot reproduce the issue.
Could you please provide the reproducing steps? If you run into the issue again, could you provide a support bundle as well? Thanks.
Hi, I have the same issue:
MountVolume.WaitForAttach failed for volume "pvc-xx" : volume vol-xx has GET error for volume attachment csi-xx: volumeattachments.storage.k8s.io "csi-xx" is forbidden: User "system:node:ip-xx.compute.internal" cannot get resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: no relationship found between node 'ip-xx.compute.internal' and this object
As a workaround, we scale the StatefulSet replicas down to 0 and scale them back up so that the volume attachment flow gets triggered again.
Next action
Even though we are not able to reproduce the upstream issue, from code analysis, I do think that the race condition in the upstream issue COULD be the root cause of this ticket. The upstream issue is fixed in:
Therefore, I think the next step for this ticket would be:
WDYT @derekbit @innobead @ejweber ?
Workaround:
Additionally, from code analysis, I think the workaround may be to scale down the workload, wait for the workload to be fully terminated, and then scale the workload back up again. Kube-controller-manager should be able to recreate the VolumeAttachment for the new pod.
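A rough sketch of that workaround with kubectl; the namespace, workload name, label selector, and replica count are placeholders to adapt to the affected workload:

```shell
# Scale the affected workload down and wait until its pods are fully terminated.
kubectl -n <namespace> scale statefulset <name> --replicas=0
kubectl -n <namespace> get pods -l app=<name> -w

# Scale back up; kube-controller-manager should recreate the VolumeAttachment
# as part of the new attach flow.
kubectl -n <namespace> scale statefulset <name> --replicas=<original-replicas>
```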
Hi @PhanLe1010,
We are seeing it in both single-node and multi-node environments.
I will share an environment for your offline analysis via DM over Slack.
- Longhorn version: 1.3.1
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Helm
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE2
- Number of management node in the cluster: 3 nodes which act as both master + worker
- Number of worker node in the cluster: 3 nodes which act as both master + worker
- Node config
  - OS type and version: RHEL
  - CPU per node: 32
  - Memory per node: 128GB RAM
  - Disk type(e.g. SSD/NVMe): SSD
  - Network bandwidth between the nodes: Azure Provided
  - Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Azure Disks
- Number of Longhorn volumes in the cluster: Around 20
We see this issue often with Longhorn. Today we had another repro: after a node restart in a multi-node environment, the Alertmanager StatefulSet pods were not able to mount their PVCs even after 30-40 minutes. We see this issue with both Deployments and StatefulSets.
This is happening with Longhorn 1.3.1 as well; this repro is on Longhorn 1.3.1 itself.
Whenever this happens, we scale the workload replicas down to 0 and scale them back up so that the volume attachment flow gets triggered again, but this is not an acceptable solution for production workloads.
@PhanLe1010 The node went down for only a few minutes, maybe 5 minutes or so. However, it was a full 24 hours before the user manually created the VolumeAttachment objects.
Hi @diamonwiggins, ref:issues-2629 Did the 'FailedMount' happen for a long time before you created the VolumeAttachment object?