rook: CSI plugin registration failing and pods stuck in ContainerCreating
- Bug Report
Deviation from expected behavior: I have these messages logged every 2 minutes by the kubelet service:
Nov 19 09:35:12 fpig-kubeletl022 kubelet[82977]: E1119 09:35:12.385237 82977 goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/csi.sock" failed. No retries permitted until 2019-11-19 09:37:14.385220494 -0600 CST m=+639.587100249 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket /var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/csi.sock, err: rpc error: code = Unimplemented desc = unknown service pluginregistration.Registration"
Nov 19 09:35:12 fpig-kubeletl022 kubelet[82977]: E1119 09:35:12.385343 82977 goroutinemap.go:150] Operation for "/var/lib/kubelet/plugins/rook-ceph.cephfs.csi.ceph.com/csi.sock" failed. No retries permitted until 2019-11-19 09:37:14.385321555 -0600 CST m=+639.587201325 (durationBeforeRetry 2m2s). Error: "RegisterPlugin error -- failed to get plugin info using RPC GetInfo at socket /var/lib/kubelet/plugins/rook-ceph.cephfs.csi.ceph.com/csi.sock, err: rpc error: code = Unimplemented desc = unknown service pluginregistration.Registration"
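For anyone triaging the first error: `unknown service pluginregistration.Registration` is gRPC's way of saying the kubelet called its plugin-registration RPC on a socket that doesn't serve it. The sockets in these messages are the CSI driver sockets (which serve the CSI Identity/Node services), while the registrar sidecar's registration socket normally lives under `plugins_registry/`. A minimal diagnostic sketch, assuming the default kubelet root-dir `/var/lib/kubelet` and the pod labels and container names from the Rook v1.1 example manifests:

```sh
# Diagnostic sketch; paths assume the default kubelet root-dir
# /var/lib/kubelet, and the label/container names are taken from the
# Rook v1.1 example manifests -- adjust for your deployment.

# Registration sockets the kubelet plugin watcher is meant to find:
ls -l /var/lib/kubelet/plugins_registry/

# CSI driver sockets from the error messages above (these serve the
# CSI Identity/Node services, not the kubelet Registration service):
ls -l /var/lib/kubelet/plugins/rook-ceph.rbd.csi.ceph.com/
ls -l /var/lib/kubelet/plugins/rook-ceph.cephfs.csi.ceph.com/

# Confirm the node plugin pods and their registrar sidecars are healthy:
kubectl -n rook-ceph get pod -l app=csi-rbdplugin -o wide
kubectl -n rook-ceph logs ds/csi-rbdplugin -c driver-registrar --tail=20
```

If `plugins_registry/` is empty while the driver sockets exist, the registrar never registered with the kubelet, which often points at the kubelet path configuration discussed later in this thread.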
Any pod that I attach a Rook block PVC to is stuck in ContainerCreating, and I see the following messages:
Nov 19 09:55:15 fpig-kubeletl042 kubelet[55855]: E1119 09:55:15.298963 55855 driver-call.go:274] mount command failed, status: Failure, reason: Rook: Mount volume failed: failed to attach volume replicapool/pvc-65b12ba9-eb87-496a-acd3-46144bd35144: failed to map image replicapool/pvc-65b12ba9-eb87-496a-acd3-46144bd35144 cluster rook-ceph. failed to map image replicapool/pvc-65b12ba9-eb87-496a-acd3-46144bd35144, output: , err: Failed to complete 'rbd': signal: interrupt.
Nov 19 09:55:15 fpig-kubeletl042 kubelet[55855]: W1119 09:55:15.298978 55855 driver-call.go:150] FlexVolume: driver call failed: executable: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/rook.io~rook-ceph/rook-ceph, args: [mount /var/lib/kubelet/pods/bbd911c1-04c3-4ddc-a2c0-ea694159d795/volumes/rook.io~rook-ceph/pvc-65b12ba9-eb87-496a-acd3-46144bd35144 {"clusterNamespace":"rook-ceph","dataBlockPool":"","image":"pvc-65b12ba9-eb87-496a-acd3-46144bd35144","kubernetes.io/fsType":"","kubernetes.io/pod.name":"test","kubernetes.io/pod.namespace":"default","kubernetes.io/pod.uid":"bbd911c1-04c3-4ddc-a2c0-ea694159d795","kubernetes.io/pvOrVolumeName":"pvc-65b12ba9-eb87-496a-acd3-46144bd35144","kubernetes.io/readwrite":"rw","kubernetes.io/serviceAccount.name":"default","pool":"replicapool","storageClass":"rook-block"}], error: exit status 1, output: "{\"status\":\"Failure\",\"message\":\"Rook: Mount volume failed: failed to attach volume replicapool/pvc-65b12ba9-eb87-496a-acd3-46144bd35144: failed to map image replicapool/pvc-65b12ba9-eb87-496a-acd3-46144bd35144 cl
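Note that this volume is going through the legacy FlexVolume driver (`rook.io~rook-ceph`), not CSI. A debugging sketch for the mount failure, under the assumption that `signal: interrupt` means the underlying `rbd map` hung until the kubelet killed it; the pool and image names are copied from the error output above:

```sh
# On the node where the pod is scheduled: the EL7 3.10 kernel needs
# the rbd module loaded before any image can be mapped.
lsmod | grep rbd || sudo modprobe rbd

# From the rook-ceph toolbox pod: confirm the image exists and has no
# stale watchers that could stall a new map (names copied from the
# error message above).
rbd -p replicapool info pvc-65b12ba9-eb87-496a-acd3-46144bd35144
rbd -p replicapool status pvc-65b12ba9-eb87-496a-acd3-46144bd35144
```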
Expected behavior: No error messages and pods successfully mounting and running rook block volumes.
Environment:
- OS (e.g. from /etc/os-release): RHEL 7.5
- Kernel (e.g. `uname -a`): Linux 3.10.0-862.3.2.el7.x86_64
- Cloud provider or hardware configuration: on-prem vanilla k8s
- Rook version (use `rook version` inside of a Rook Pod): rook: v1.1.2
- Storage backend version (e.g. for ceph do `ceph -v`): ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
- Kubernetes version (use `kubectl version`): v1.16.2
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): vanilla k8s
- Storage backend status (e.g. for Ceph use `ceph health` in the Rook Ceph toolbox):
[root@rook-ceph-tools /]# ceph health
HEALTH_WARN BlueFS spillover detected on 25 OSD(s)
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 21 (8 by maintainers)
I'm seeing the same issue.
Thanks a lot for the help @Madhu-1.
We managed to fix this by switching our StorageClass to use the CSI driver. We also had a networking issue: a NetworkPolicy allowed traffic into the rook namespace, but the rbdplugin pod uses host networking, so the policy didn't apply to it.
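For anyone else making the same switch, here is a sketch of a CSI-backed StorageClass adapted from the Rook v1.1 example manifests; the clusterID, pool, and secret names below are those examples' defaults (assumptions, not values taken from this cluster):

```sh
# CSI RBD StorageClass, adapted from Rook v1.1's
# cluster/examples/kubernetes/ceph/csi/rbd/storageclass.yaml;
# all names below are the example defaults -- adjust to your cluster.
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
EOF
```

Note that existing PVs keep whatever provisioner created them; only new PVCs created against the new class go through CSI.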
Everything looks fine. Can you paste the `root-dir` from your kubelet configuration?
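(For anyone following along, a sketch of how to pull that value. The kubelet `root-dir` defaults to `/var/lib/kubelet`; if a cluster uses a different path, the CSI driver pods mount the wrong host directories and registration can fail with the kind of `GetInfo` errors quoted above, which is why it's worth checking. `ROOK_CSI_KUBELET_DIR_PATH` is the operator setting Rook's docs describe for non-default kubelet paths.)

```sh
# Sketch for finding the kubelet root-dir (default /var/lib/kubelet);
# where it is set depends on how the kubelet is launched.
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -e '--root-dir'
grep -r -e '--root-dir' /etc/systemd/system/kubelet.service.d/ 2>/dev/null

# If it is non-default, the Rook operator needs to match it via the
# ROOK_CSI_KUBELET_DIR_PATH setting (see the Rook docs on custom
# kubelet directories):
kubectl -n rook-ceph get deploy rook-ceph-operator -o yaml \
  | grep -A1 ROOK_CSI_KUBELET_DIR_PATH
```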