kubernetes: Volume stuck in attaching state when using multiple PersistentVolumeClaims
I’m using Kubernetes 1.4.5 with AWS EBS storage.
When I try to attach multiple volumes using PVCs, one of the volumes consistently gets stuck in the attaching state while the other attaches successfully.
Below are the definitions I used.
sonar-persistence.yml
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: sonarqube-data
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      awsElasticBlockStore:
        volumeID: aws://eu-west-1a/vol-XXXXXXX
        fsType: ext4
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: sonarqube-extensions
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      awsElasticBlockStore:
        volumeID: aws://eu-west-1a/vol-XXXXXX
        fsType: ext4
sonar-claim.yml
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: sonarqube-data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
    ---
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: sonarqube-extensions
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
sonar-deployment.yml
    ---
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: sonar
    spec:
      replicas: 1
      template:
        metadata:
          name: sonar
          labels:
            name: sonar
        spec:
          containers:
            - image: sonarqube:lts
              args:
                - -Dsonar.web.context=/sonar
              name: sonar
              env:
                - name: SONARQUBE_JDBC_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-pwd
                      key: password
                - name: SONARQUBE_JDBC_URL
                  value: jdbc:postgresql://sonar:5432/sonar
              ports:
                - containerPort: 9000
                  name: sonar
              volumeMounts:
                - name: sonarqube-data
                  mountPath: /opt/sonarqube/data
                - name: sonarqube-extensions
                  mountPath: /opt/sonarqube/extensions
          volumes:
            - name: sonarqube-data
              persistentVolumeClaim:
                claimName: sonarqube-data
            - name: sonarqube-extensions
              persistentVolumeClaim:
                claimName: sonarqube-extensions
The data volume always appears to attach successfully, and perhaps coincidentally it is first in the list. I have tried this multiple times and the result is always the same.
The error messages are as follows:
Unable to mount volumes for pod "sonar-3504269494-tnzwo_default(2cc5292c-a5d4-11e6-bd99-0a82a8a86ebf)": timeout expired waiting for volumes to attach/mount for pod "sonar-3504269494-tnzwo"/"default". list of unattached/unmounted volumes=[sonarqube-data sonarqube-extensions]
Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "sonar-3504269494-tnzwo"/"default". list of unattached/unmounted volumes=[sonarqube-data sonarqube-extensions]
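While the pod is stuck, the failing attach can be inspected from the Kubernetes side; a minimal sketch, assuming kubectl access to this cluster and using the pod name from the error above:

    # Pod events show which volume the attach/mount is waiting on
    kubectl describe pod sonar-3504269494-tnzwo

    # Namespace events, newest last, often include the attach error reported by the controller
    kubectl get events --namespace default --sort-by='.lastTimestamp'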
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Comments: 21 (8 by maintainers)
We get this frequently with EBS PVC/PV volumes and see plenty of similar reports. It usually starts when recreating a Pod (“Recreate” strategy): the old Pod is torn down and the PV unmounted, then the PV is mounted for the new Pod (on the same or a different worker). A quick unmount/mount cycle seems to trigger the ‘stuck attaching’ issue, which AWS blames on reusing device names (or reusing them too quickly): https://aws.amazon.com/premiumsupport/knowledge-center/ebs-stuck-attaching/
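When it happens, the stuck attachment is visible on the AWS side as well; a rough sketch, assuming the AWS CLI is configured for the cluster’s account and region (eu-west-1 in the original post):

    # List EBS volumes whose attachment is sitting in the "attaching" state
    aws ec2 describe-volumes --region eu-west-1 \
      --filters Name=attachment.status,Values=attaching \
      --query 'Volumes[].{Volume:VolumeId,Instance:Attachments[0].InstanceId,Device:Attachments[0].Device}'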
A temporary fix is to tell AWS to force-detach the EBS volume and then wait; the new Pod will attach and recover within a few minutes. However, the next time you recreate that particular Pod you will almost certainly hit the same stuck state; once an instance+PV combo starts doing this it seems to happen nearly every time. The only long-term fix I have found/seen is to reboot the worker node or to delete and recreate the PVC/PV.
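Roughly, that workaround looks like this; a sketch only, assuming the AWS CLI points at the right account/region and that vol-XXXXXXX stands in for the stuck volume (be careful force-detaching a volume that is still mounted):

    # Force-detach the stuck EBS volume on the AWS side
    aws ec2 detach-volume --region eu-west-1 --volume-id vol-XXXXXXX --force

    # Wait until it reports "available"; the pending Pod usually attaches and
    # recovers by itself within a few minutes
    aws ec2 wait volume-available --region eu-west-1 --volume-ids vol-XXXXXXX
    kubectl get pods -w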
It is a major hassle and we’re looking to switch away from EBS to something more reliable for mounting, like EFS, NFS or GlusterFS.
I have wondered about scaling the deployment to 0 replicas first and waiting a while before redeploying. It is not an attractive option, though.
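Something along these lines, in case anyone wants to try it; a sketch assuming the sonar Deployment from the original post:

    # Scale down, give the EBS detach time to complete, then scale back up
    kubectl scale deployment sonar --replicas=0
    sleep 120   # arbitrary wait; long enough for the old attachment to clear
    kubectl scale deployment sonar --replicas=1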
@saad-ali no need to apologize; even before k8s the ‘stuck attaching’ state was a known EBS condition (hence the AWS FAQ). It just came up less often before k8s, because it was much less common to unmount and remount EBS volumes between instances every few minutes, which is exactly what happens on every k8s CD deployment 😃
Thanks for the tip about the upcoming patches by @justinsb and @jingxu97. We create clusters using CoreOS kube-aws; its latest release is 1.4.3 and master is at 1.4.6, I think. I might test a 1.4.6 cluster if I can.
I see AWS EFS or something similar as a more natural fit for smallish disk volumes on k8s anyway.
Unfortunately, EFS is taking its own sweet time to get to the southern hemisphere; Java-committee-process slow 😛