kubernetes: Ceph RBD provisioner "timeout expired waiting for volumes to attach/mount for pod"

/kind bug

What happened:

The RBD StorageClass and PVCs work as expected (they are bound, the PVCs are provisioned, and I can see the created images in the Ceph pool). When I try to use the volumes in a StatefulSet, I get the following error:

    Warning  FailedMount            14s (x3 over 4m)  kubelet, node2     Unable to mount volumes for pod "mysql-0_default(2c4858f6-2dd1-11e8-9506-00505695ceef)": timeout expired waiting for volumes to attach/mount for pod "default"/"mysql-0". list of unattached/unmounted volumes=[db]

The error is a catch-all, so the relevant kubelet logs are included below.

What you expected to happen:

For the pod to mount its volume properly.

How to reproduce it (as minimally and precisely as possible): I use the following YAML files to deploy the resources.
StorageClass

kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: cephrbd
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/rbd
parameters:
  monitors: 172.29.1.138:6789, 172.29.1.139:6789, 172.29.1.140:6789
  adminId: admin
  adminSecretName: ceph-secret
  pool: kube
  fsType: ext4
  userId: kube
  userSecretName: ceph-user-secret
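
A side observation on the `monitors` value: the kubelet log further down prints the monitor list with two spaces between entries. A minimal Python sketch of how that rendering could arise, assuming the value is split on commas without trimming (this is an assumption about the plugin's behaviour, not something the issue confirms). The helper names are hypothetical. Whether the extra whitespace has any bearing on the mount failure is not established here.

```python
# Value copied from the StorageClass `monitors` parameter above.
monitors_param = "172.29.1.138:6789, 172.29.1.139:6789, 172.29.1.140:6789"


def split_untrimmed(value):
    """Split on commas only, keeping surrounding whitespace (assumed behaviour)."""
    return value.split(",")


def split_trimmed(value):
    """Split on commas and strip whitespace from every entry."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def render_like_log(entries):
    """Join entries with single spaces inside brackets, as the log line appears to do."""
    return "[" + " ".join(entries) + "]"


# Entries that keep their leading space produce the double spaces seen in the log.
print(render_like_log(split_untrimmed(monitors_param)))

# A whitespace-free form of the same parameter, for comparison.
print(",".join(split_trimmed(monitors_param)))
```

Writing the parameter without spaces after the commas avoids the question entirely.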

First PVC (the two others are similar)

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
 name: db-mysql-0
spec:
 accessModes:
   - ReadWriteOnce
 resources:
   requests:
     storage: 1Gi

StatefulSet

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mysql
  labels:
    name: mysql  
spec:
  serviceName: "mysql"
  replicas: 3
  template:
    metadata:
      labels:
        name: mysql
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mysql
        image: perconalab/percona-xtradb-cluster:5.6
        ports:
        - containerPort: 3306
          name: mysql
        env:
          - name: MYSQL_ROOT_PASSWORD
            value: k8spassword
          - name: DISCOVERY_SERVICE
            value: etcd:2379
          - name: XTRABACKUP_PASSWORD
            value: k8spassword
          - name: CLUSTER_NAME
            value: percona
        volumeMounts:
        - name: db
          mountPath: /var/lib/mysql 
  volumeClaimTemplates:
  - metadata:
      name: db
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
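
For reference, a StatefulSet looks up one claim per replica, named `<volumeClaimTemplate name>-<StatefulSet name>-<ordinal>`, which is why the pre-created PVC above is called `db-mysql-0`. A small Python sketch of that naming rule (the function name is hypothetical):

```python
def claim_names(template_name, statefulset_name, replicas):
    """Return the PVC names a StatefulSet expects, one per replica ordinal."""
    return [
        f"{template_name}-{statefulset_name}-{ordinal}"
        for ordinal in range(replicas)
    ]


# Values taken from the StatefulSet above: template "db", name "mysql", 3 replicas.
print(claim_names("db", "mysql", 3))
# ['db-mysql-0', 'db-mysql-1', 'db-mysql-2']
```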

Anything else we need to know?: Looking in the kubelet logs, I get the following lines:

Mar 22 14:01:29 node2 kubelet[2046]: I0322 14:01:29.115165    2046 reconciler.go:262] operationExecutor.MountVolume started for volume "pvc-52ea13b0-2dd9-11e8-9506-00505695ceef" (UniqueName: "kubernetes.io/rbd/[172.29.1.138:6789  172.29.1.139:6789  172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef") pod "mysql-0" (UID: "58b6eaae-2dd9-11e8-9506-00505695ceef")
Mar 22 14:01:29 node2 kubelet[2046]: I0322 14:01:29.115825    2046 operation_generator.go:446] MountVolume.WaitForAttach entering for volume "pvc-52ea13b0-2dd9-11e8-9506-00505695ceef" (UniqueName: "kubernetes.io/rbd/[172.29.1.138:6789  172.29.1.139:6789  172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef") pod "mysql-0" (UID: "58b6eaae-2dd9-11e8-9506-00505695ceef") DevicePath ""
Mar 22 14:01:29 node2 kubelet[2046]: E0322 14:01:29.402591    2046 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/[172.29.1.138:6789  172.29.1.139:6789  172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef\"" failed. No retries permitted until 2018-03-22 14:02:33.402150246 +0000 UTC m=+245312.690283744 (durationBeforeRetry 1m4s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-52ea13b0-2dd9-11e8-9506-00505695ceef\" (UniqueName: \"kubernetes.io/rbd/[172.29.1.138:6789  172.29.1.139:6789  172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef\") pod \"mysql-0\" (UID: \"58b6eaae-2dd9-11e8-9506-00505695ceef\") : error: exit status 1, rbd output: 2018-03-22 14:01:29.382939 7f0c5e8a1100 -1 did not load config file, using default settings.\nrbd: error opening image kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef: (1) Operation not permitted\n2018-03-22 14:01:29.398427 7f0c2ffff700 -1 librbd::image::OpenRequest: failed to retrieve image id: (1) Operation not permitted\n2018-03-22 14:01:29.398483 7f0c2f7fe700 -1 librbd::ImageState: 0x55825945e0c0 failed to open image: (1) Operation not permitted\n"
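
The useful part of that last line is the embedded `rbd` output, which is easy to miss inside the escaped kubelet message. A minimal Python sketch for pulling the image name and error code out of such output. The regular expression and function names are hypothetical and match only the message format quoted above.

```python
import errno
import re

# Fragment of the rbd output quoted in the kubelet log above.
rbd_output = (
    "2018-03-22 14:01:29.382939 7f0c5e8a1100 -1 did not load config file, "
    "using default settings.\n"
    "rbd: error opening image "
    "kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef: "
    "(1) Operation not permitted\n"
    "2018-03-22 14:01:29.398427 7f0c2ffff700 -1 librbd::image::OpenRequest: "
    "failed to retrieve image id: (1) Operation not permitted\n"
)

RBD_OPEN_ERROR = re.compile(
    r"^rbd: error opening image (?P<image>\S+): \((?P<code>\d+)\) (?P<reason>.+)$",
    re.MULTILINE,
)


def find_rbd_open_errors(text):
    """Return (image, code, reason) for each 'error opening image' line."""
    # Journal output carries newlines as a literal backslash-n; restore them first.
    text = text.replace("\\n", "\n")
    return [
        (m["image"], int(m["code"]), m["reason"])
        for m in RBD_OPEN_ERROR.finditer(text)
    ]


def errno_name(code):
    """Map a numeric code to its symbolic errno name, if the platform knows it."""
    return errno.errorcode.get(code, "unknown")


for image, code, reason in find_rbd_open_errors(rbd_output):
    print(f"{image}: {errno_name(code)} ({reason})")
```

Here the code is 1 (`EPERM`): the `rbd` client reached the cluster but was refused when opening the image, which is a different failure from a connection timeout.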

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3+coreos.0", GitCommit:"f588569ed1bd4a6c986205dd0d7b04da4ab1a3b6", GitTreeState:"clean", BuildDate:"2018-02-10T01:42:55Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
  • Kernel (e.g. uname -a):
Linux node2 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: Installed the cluster using Kubespray on vSphere virtual machines
  • Others:

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 7
  • Comments: 27 (3 by maintainers)

Most upvoted comments

I will check 1.9.2

I experienced a very similar issue on k8s version 1.10. However, it wasn’t isolated to StatefulSets: I was unable to mount any RBD persistent volumes. I downgraded to 1.9.2 and everything worked again.