longhorn: [BUG] Unable to attach or mount volumes: unmounted volumes=[volv], unattached volumes=[volv kube-api-access-4tqrk]: timed out waiting for the condition (duplicated default IM-R)
Describe the bug
A pod using a newly provisioned Longhorn volume fails to start: volume attachment times out with FailedMount/FailedAttachVolume events (full events below).
To Reproduce
Steps to reproduce the behavior:
- Create a Longhorn-backed volume (PVC) and a pod that mounts it
- The pod fails to start with the following events:
Events:
```
Type     Reason              Age               From                     Message
----     ------              ----              ----                     -------
Normal   Scheduled           2m45s             default-scheduler        Successfully assigned default/volume-test to release-worker01
Warning  FailedMount         43s               kubelet                  Unable to attach or mount volumes: unmounted volumes=[volv], unattached volumes=[volv kube-api-access-4tqrk]: timed out waiting for the condition
Warning  FailedAttachVolume  5s (x8 over 73s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-de440081-ecb8-4ab2-b5da-9aea75b19003" : rpc error: code = DeadlineExceeded desc = volume pvc-de440081-ecb8-4ab2-b5da-9aea75b19003 failed to attach to node release-worker01
```
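The names in the events (volume-test, volv, longhorn-volv-pvc, 2Gi) match Longhorn's stock example manifests, so the reproduction is likely equivalent to the following sketch (an assumption on my part; the default `longhorn` StorageClass and the container image are mine, not confirmed by the reporter):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-volv-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: volume-test
    image: nginx:stable-alpine   # hypothetical image choice
    volumeMounts:
    - name: volv
      mountPath: /data
  volumes:
  - name: volv
    persistentVolumeClaim:
      claimName: longhorn-volv-pvc
```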
Expected behavior
The volume attaches and mounts successfully, and the pod starts.
Log or Support bundle
kubectl describe pod volume-test:
```
Type     Reason              Age               From                     Message
----     ------              ----              ----                     -------
Normal   Scheduled           2m45s             default-scheduler        Successfully assigned default/volume-test to release-worker01
Warning  FailedMount         43s               kubelet                  Unable to attach or mount volumes: unmounted volumes=[volv], unattached volumes=[volv kube-api-access-4tqrk]: timed out waiting for the condition
Warning  FailedAttachVolume  5s (x8 over 73s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-de440081-ecb8-4ab2-b5da-9aea75b19003" : rpc error: code = DeadlineExceeded desc = volume pvc-de440081-ecb8-4ab2-b5da-9aea75b19003 failed to attach to node release-worker01
```
---
```
daemonset.apps/longhorn-environment-check created
waiting for pods to become ready (0/3)
waiting for pods to become ready (0/3)
waiting for pods to become ready (0/3)
waiting for pods to become ready (0/3)
waiting for pods to become ready (0/3)
all pods ready (3/3)
MountPropagation is enabled!
cleaning up...
daemonset.apps "longhorn-environment-check" deleted
clean up complete
```
---
```
[root@release-master engine-binaries]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
sopei-log                                  4Gi        RWX            Retain           Bound    sopei-biz/sopei-log         sopei-log               8d
pvc-de440081-ecb8-4ab2-b5da-9aea75b19003   2Gi        RWO            Delete           Bound    default/longhorn-volv-pvc   longhorn                7m53s
[root@release-master engine-binaries]# kubectl get pvc
NAME                STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
longhorn-volv-pvc   Bound    pvc-de440081-ecb8-4ab2-b5da-9aea75b19003   2Gi        RWO            longhorn       7m58s
```

Environment
- Longhorn version:
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher Catalog App
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: K3s v1.21.7+k3s1
- Number of management nodes in the cluster:
- Number of worker node in the cluster: 3
- Node config
- OS type and version: CentOS 7.9
- CPU per node: 4
- Memory per node: 8 GiB
- Disk type(e.g. SSD/NVMe): NVMe
- Network bandwidth between the nodes:
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Tencent Cloud
- Number of Longhorn volumes in the cluster: 3
Additional context
longhorn-support-bundle
About this issue
- Original URL
- State: open
- Created 2 years ago
- Reactions: 1
- Comments: 43 (17 by maintainers)
The mkfs.xfs in the Longhorn CSI plugin is a newer version (mkfs.xfs version 5.3.0). A filesystem created by this version will not work on RHEL 7 by default.
There is a manual workaround for this, but the best solution would be to ask users to upgrade to CentOS 8. Another benefit of that approach is that we avoid the slowness in the older kernel version: https://github.com/longhorn/longhorn/issues/2640
Ref: https://github.com/ceph/ceph-csi/issues/966#issuecomment-620655796
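One shape the manual workaround can take is overriding the mkfs options so the newer xfsprogs does not enable features the RHEL/CentOS 7 (3.10) kernel cannot read. This is a sketch under two assumptions not confirmed in this thread: the `mkfsParams` StorageClass parameter only exists in newer Longhorn releases, and reflink is the relevant on-disk feature to disable:

```yaml
# Hypothetical StorageClass sketch: format XFS volumes without reflink so
# the filesystem stays mountable on an old RHEL/CentOS 7 kernel.
# `mkfsParams` is only honored by Longhorn releases that support it.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: longhorn-xfs-el7
provisioner: driver.longhorn.io
parameters:
  fsType: "xfs"
  mkfsParams: "-m reflink=0"
  numberOfReplicas: "3"
```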
I was on CentOS 7.9; I switched to Debian 11.4 and added the nfs-common package.
Node release-worker01 doesn’t have enough available space.
Please check /var/lib/longhorn/replicas on node release-worker01 (also written release-worker-01 in places).
After checking the support bundle, we see that:
```
2022-03-18T14:01:30.381993257+08:00 E0318 06:01:30.381849 1 replica_controller.go:201] fail to sync replica for longhorn-system/pvc-fd6f5ab7-0f18-4c4f-b099-8c7afd23ccba-r-4a7b6a1b: failed to get instance manager for instance pvc-fd6f5ab7-0f18-4c4f-b099-8c7afd23ccba-r-4a7b6a1b: can not find the only available instance manager for instance pvc-fd6f5ab7-0f18-4c4f-b099-8c7afd23ccba-r-4a7b6a1b, node release-worker01, instance manager image rancher/mirrored-longhornio-longhorn-instance-manager:v1_20211210, type replica
```
Workaround: delete one of the duplicated instance managers on release-worker01:
```
kubectl delete instancemanagers instance-manager-r-84356b81 -n longhorn-system
```
Ref: this is related to the issue https://github.com/longhorn/longhorn/issues/3000
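The duplicated-IM diagnosis above can be spot-checked with a small helper. This is a sketch, not from the thread: the custom-columns field paths (`.spec.nodeID`, `.spec.type`) assume the longhorn.io InstanceManager CRD schema, and the premise is that a healthy node runs exactly one replica-type instance manager.

```shell
# Sketch: list replica-type instance managers with their node, so a node
# carrying two IM-R objects (the duplication in this issue) stands out.
# Field paths assume the longhorn.io InstanceManager CRD schema.
list_replica_ims() {
  kubectl get instancemanagers.longhorn.io -n longhorn-system \
    -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeID,TYPE:.spec.type |
    awk 'NR == 1 || $3 == "replica"'
}

# Usage (against a live cluster):
#   list_replica_ims
#   kubectl delete instancemanagers <duplicated-im-r-name> -n longhorn-system
```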
Got into the same issue. I resolved it by deleting the associated VolumeAttachment and then restarting the pod.
I resolved my problem, thanks @PhanLe1010.
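The VolumeAttachment workaround described above, sketched as a helper. The function name is mine, and the jsonpath filter assumes kubectl's client-side jsonpath filter-expression support:

```shell
# Sketch of the workaround: find and delete the VolumeAttachment that
# pins the stuck PV, so the next attach attempt starts from scratch.
delete_stuck_attachment() {
  pv="$1"   # e.g. pvc-de440081-ecb8-4ab2-b5da-9aea75b19003 from this issue
  va=$(kubectl get volumeattachment \
        -o jsonpath="{.items[?(@.spec.source.persistentVolumeName=='${pv}')].metadata.name}")
  kubectl delete volumeattachment "$va"
}

# Usage (against a live cluster); recreate the pod afterwards so the
# attach is retried, e.g.:
#   delete_stuck_attachment pvc-de440081-ecb8-4ab2-b5da-9aea75b19003
#   kubectl delete pod volume-test && kubectl apply -f pod.yaml  # pod.yaml is hypothetical
```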