kubernetes: Volume stuck in attaching state when using multiple PersistentVolumeClaims
I’m using Kubernetes 1.4.5 with AWS EBS storage.
When I try to attach multiple volumes using PVCs, one of the volumes consistently gets stuck in the attaching state while the other attaches successfully.
Below are the definitions I used.
sonar-persistence.yml
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: sonarqube-data
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      awsElasticBlockStore:
        volumeID: aws://eu-west-1a/vol-XXXXXXX
        fsType: ext4
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: sonarqube-extensions
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      awsElasticBlockStore:
        volumeID: aws://eu-west-1a/vol-XXXXXX
        fsType: ext4
sonar-claim.yml
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: sonarqube-data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: sonarqube-extensions
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
sonar-deployment.yml
    ---
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: sonar
    spec:
      replicas: 1
      template:
        metadata:
          name: sonar
          labels:
            name: sonar
        spec:
          containers:
            - image: sonarqube:lts
              args:
                - -Dsonar.web.context=/sonar
              name: sonar
              env:
                - name: SONARQUBE_JDBC_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pwd
                      key: password
                - name: SONARQUBE_JDBC_URL
                  value: jdbc:postgresql://sonar:5432/sonar
              ports:
                - containerPort: 9000
                  name: sonar
              volumeMounts:
                - name: sonarqube-data
                  mountPath: /opt/sonarqube/data
                - name: sonarqube-extensions
                  mountPath: /opt/sonarqube/extensions
          volumes:
            - name: sonarqube-data
              persistentVolumeClaim:
                claimName: sonarqube-data
            - name: sonarqube-extensions
              persistentVolumeClaim:
                claimName: sonarqube-extensions
The data volume always appears to attach successfully, and perhaps coincidentally it is first in the list. I have tried this multiple times and the result is always the same.
The error messages are as follows:
Unable to mount volumes for pod "sonar-3504269494-tnzwo_default(2cc5292c-a5d4-11e6-bd99-0a82a8a86ebf)": timeout expired waiting for volumes to attach/mount for pod "sonar-3504269494-tnzwo"/"default". list of unattached/unmounted volumes=[sonarqube-data sonarqube-extensions]
Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "sonar-3504269494-tnzwo"/"default". list of unattached/unmounted volumes=[sonarqube-data sonarqube-extensions]
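While the pod is stuck, the failing attach can be inspected from the Kubernetes side; a minimal sketch, assuming kubectl access to this cluster and using the pod name from the error above:

    # Pod events show which volume the attach/mount is waiting on
    kubectl describe pod sonar-3504269494-tnzwo

    # Namespace events, newest last, often include the attach error reported by the controller
    kubectl get events --namespace default --sort-by='.lastTimestamp'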
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 21 (8 by maintainers)
We get this frequently with EBS PVC/PV volumes and see plenty of similar reports. It usually starts when recreating a Pod (“Recreate” strategy): the old Pod is torn down and the PV unmounted, then the PV is mounted for the new Pod (on the same or a different worker). A quick unmount/mount cycle seems to trigger the ‘stuck attaching’ issue, which AWS blames on reusing device names (or reusing them too quickly): https://aws.amazon.com/premiumsupport/knowledge-center/ebs-stuck-attaching/
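When it happens, the stuck attachment is visible on the AWS side as well; a rough sketch, assuming the AWS CLI is configured for the cluster’s account and region (eu-west-1 in the original post):

    # List EBS volumes whose attachment is sitting in the "attaching" state
    aws ec2 describe-volumes --region eu-west-1 \
      --filters Name=attachment.status,Values=attaching \
      --query 'Volumes[].{Volume:VolumeId,Instance:Attachments[0].InstanceId,Device:Attachments[0].Device}'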
A temporary fix is to tell AWS to force-detach the EBS volume and then wait; the new Pod will attach and recover within a few minutes. However, the next time you recreate that particular Pod you will almost certainly hit the same stuck state; once an instance+PV combo starts doing this it seems to happen nearly every time. The only long-term fix I have found/seen is to reboot the worker node or to delete and recreate the PVC/PV.
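Roughly, that workaround looks like this; a sketch only, assuming the AWS CLI points at the right account/region and that vol-XXXXXXX stands in for the stuck volume (be careful force-detaching a volume that is still mounted):

    # Force-detach the stuck EBS volume on the AWS side
    aws ec2 detach-volume --region eu-west-1 --volume-id vol-XXXXXXX --force

    # Wait until it reports "available"; the pending Pod usually attaches and
    # recovers by itself within a few minutes
    aws ec2 wait volume-available --region eu-west-1 --volume-ids vol-XXXXXXX
    kubectl get pods -w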
It is a major hassle and we’re looking to switch away from EBS to something more reliable for mounting, like EFS, NFS or GlusterFS.
I have wondered about scaling the deployment to 0 replicas first and waiting a while before redeploying. It is not an attractive option, though.
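Something along these lines, in case anyone wants to try it; a sketch assuming the sonar Deployment from the original post:

    # Scale down, give the EBS detach time to complete, then scale back up
    kubectl scale deployment sonar --replicas=0
    sleep 120   # arbitrary wait; long enough for the old attachment to clear
    kubectl scale deployment sonar --replicas=1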
@saad-ali no need to apologize; even before k8s the ‘stuck attaching’ state was a known EBS condition (hence the AWS FAQ). It just came up less often before k8s, because it was much less common to unmount and remount EBS volumes between instances every few minutes, which is exactly what happens on every k8s CD deployment 😃
Thanks for the tip about the upcoming patches by @justinsb and @jingxu97. We create clusters using CoreOS kube-aws; its latest release is 1.4.3 and master is at 1.4.6, I think. I might test a 1.4.6 cluster if I can.
I see AWS EFS or something similar as a more natural fit for smallish disk volumes on k8s anyway.
Unfortunately, EFS is taking its own sweet time to get to the southern hemisphere; Java-committee-process slow 😛