kubernetes: Ceph RBD provisioner "timeout expired waiting for volumes to attach/mount for pod"
/kind bug
What happened:
The RBD StorageClass and PVCs work as expected (they are bound, the PVCs are provisioned, and I can see the created images in the Ceph pool). When I try to use the volumes in a StatefulSet, I get the following error:
Warning FailedMount 14s (x3 over 4m) kubelet, node2 Unable to mount volumes for pod "mysql-0_default(2c4858f6-2dd1-11e8-9506-00505695ceef)": timeout expired waiting for volumes to attach/mount for pod "default"/"mysql-0". list of unattached/unmounted volumes=[db]
This error is a catch-all, so the kubelet logs are included below.
What you expected to happen:
For the pod to mount its volume properly.
How to reproduce it (as minimally and precisely as possible):
I use the following YAML files to deploy the resources:
StorageClass
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: cephrbd
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/rbd
parameters:
  monitors: 172.29.1.138:6789, 172.29.1.139:6789, 172.29.1.140:6789
  adminId: admin
  adminSecretName: ceph-secret
  pool: kube
  fsType: ext4
  userId: kube
  userSecretName: ceph-user-secret
First PVC (the 2 others are similar)
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: db-mysql-0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
StatefulSet
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mysql
  labels:
    name: mysql
spec:
  serviceName: "mysql"
  replicas: 3
  template:
    metadata:
      labels:
        name: mysql
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mysql
        image: perconalab/percona-xtradb-cluster:5.6
        ports:
        - containerPort: 3306
          name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: k8spassword
        - name: DISCOVERY_SERVICE
          value: etcd:2379
        - name: XTRABACKUP_PASSWORD
          value: k8spassword
        - name: CLUSTER_NAME
          value: percona
        volumeMounts:
        - name: db
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: db
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
Anything else we need to know?: Looking in the kubelet logs, I get the following lines:
Mar 22 14:01:29 node2 kubelet[2046]: I0322 14:01:29.115165 2046 reconciler.go:262] operationExecutor.MountVolume started for volume "pvc-52ea13b0-2dd9-11e8-9506-00505695ceef" (UniqueName: "kubernetes.io/rbd/[172.29.1.138:6789 172.29.1.139:6789 172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef") pod "mysql-0" (UID: "58b6eaae-2dd9-11e8-9506-00505695ceef")
Mar 22 14:01:29 node2 kubelet[2046]: I0322 14:01:29.115825 2046 operation_generator.go:446] MountVolume.WaitForAttach entering for volume "pvc-52ea13b0-2dd9-11e8-9506-00505695ceef" (UniqueName: "kubernetes.io/rbd/[172.29.1.138:6789 172.29.1.139:6789 172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef") pod "mysql-0" (UID: "58b6eaae-2dd9-11e8-9506-00505695ceef") DevicePath ""
Mar 22 14:01:29 node2 kubelet[2046]: E0322 14:01:29.402591 2046 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/[172.29.1.138:6789 172.29.1.139:6789 172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef\"" failed. No retries permitted until 2018-03-22 14:02:33.402150246 +0000 UTC m=+245312.690283744 (durationBeforeRetry 1m4s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-52ea13b0-2dd9-11e8-9506-00505695ceef\" (UniqueName: \"kubernetes.io/rbd/[172.29.1.138:6789 172.29.1.139:6789 172.29.1.140:6789]:kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef\") pod \"mysql-0\" (UID: \"58b6eaae-2dd9-11e8-9506-00505695ceef\") : error: exit status 1, rbd output: 2018-03-22 14:01:29.382939 7f0c5e8a1100 -1 did not load config file, using default settings.\nrbd: error opening image kubernetes-dynamic-pvc-52ecfdb3-2dd9-11e8-86c8-00505695ceef: (1) Operation not permitted\n2018-03-22 14:01:29.398427 7f0c2ffff700 -1 librbd::image::OpenRequest: failed to retrieve image id: (1) Operation not permitted\n2018-03-22 14:01:29.398483 7f0c2f7fe700 -1 librbd::ImageState: 0x55825945e0c0 failed to open image: (1) Operation not permitted\n"
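The rbd output embedded in the last log line is hard to read because its newlines are escaped as literal \n. Below is a minimal sketch of a shell helper that pulls that payload out and prints one message per line. The name rbd_errors is hypothetical, and GNU sed is assumed (as shipped on Ubuntu).

```shell
#!/bin/sh
# Hypothetical helper (not part of kubelet or Ceph): reads kubelet log
# text on stdin and prints the embedded rbd output, one message per line.
# Assumes GNU sed, which expands "\n" in the replacement to a newline.
rbd_errors() {
    # 1. keep only log lines that carry an "rbd output:" payload
    # 2. drop everything up to and including "rbd output: "
    # 3. drop the trailing escaped newline and closing quote, if present
    # 4. turn the remaining literal "\n" escapes into real newlines
    grep 'rbd output:' \
        | sed -e 's/.*rbd output: //' \
              -e 's/\\n"\{0,1\}$//' \
              -e 's/\\n/\n/g'
}
```

Assuming the kubelet logs to the systemd journal under a unit named kubelet, it could be fed with `journalctl -u kubelet --no-pager | rbd_errors`. On the failing line above, this puts the `rbd: error opening image ...: (1) Operation not permitted` message on its own line.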
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3+coreos.0", GitCommit:"f588569ed1bd4a6c986205dd0d7b04da4ab1a3b6", GitTreeState:"clean", BuildDate:"2018-02-10T01:42:55Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
- Kernel (e.g. uname -a):
Linux node2 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
- Install tools: Installed the cluster using kubespray on vSphere virtual machines
- Others:
About this issue
- State: closed
- Created 6 years ago
- Reactions: 7
- Comments: 27 (3 by maintainers)
I will check 1.9.2
I experienced a very similar issue on k8s version 1.10. However, it wasn't isolated to StatefulSets: I was unable to mount any RBD persistent volumes. I downgraded to 1.9.2 and everything worked again.