rook: csi-rbdplugin missing after node failure
Is this a bug report or feature request?
- Bug Report. After a worker node failed, there are no csi-rbdplugin pods anymore. How can I recreate them? For details see this output:
```
NAME                                                  READY  STATUS     RESTARTS  AGE
rook-ceph-crashcollector-vsrvk8w001-58b685f754-4862x  1/1    Running    0         7d8h
rook-ceph-crashcollector-vsrvk8w002-56f7cf79c-dp5g7   1/1    Running    0         7d8h
rook-ceph-crashcollector-vsrvk8w003-b6bc5db68-lmdqp   1/1    Running    0         33h
rook-ceph-mgr-a-5bcf49455f-kbpc5                      1/1    Running    0         7d8h
rook-ceph-mon-a-cfd699798-czw7b                       1/1    Running    0         7d8h
rook-ceph-mon-b-6c765d57d7-wcm5h                      1/1    Running    0         33h
rook-ceph-mon-c-549dc995fc-b64xf                      1/1    Running    0         7d8h
rook-ceph-mon-d-canary-5867bbd5c7-rvt9b               0/1    Pending    0         8m53s
rook-ceph-operator-6d8fb9498b-r9czf                   1/1    Running    1         3d5h
rook-ceph-osd-0-9445b78c-7kbgv                        1/1    Running    0         7d8h
rook-ceph-osd-1-85d485d67c-ftchf                      1/1    Running    0         7d8h
rook-ceph-osd-2-b4d7b8995-2lctw                       1/1    Running    0         33h
rook-ceph-osd-prepare-vsrvk8w001-2w827                0/1    Completed  0         8h
rook-ceph-osd-prepare-vsrvk8w002-qsrlh                0/1    Completed  0         8h
rook-ceph-tools-685d84df94-sr2xn                      1/1    Running    0         8m7s
rook-discover-5nrtr                                   1/1    Running    0         7d8h
rook-discover-cvsx2                                   1/1    Running    1         7d8h
rook-discover-gqfhw                                   1/1    Running    0         7d8h
```
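For reference, one way to confirm that the CSI plugin pods are gone is to filter the pod listing by name. This is a sketch, assuming the cluster runs in the default `rook-ceph` namespace:

```shell
# List only the CSI plugin pods; an empty result confirms they are missing.
# The "rook-ceph" namespace is an assumption; substitute your own.
kubectl -n rook-ceph get pods | grep csi-rbdplugin
```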
Deviation from expected behavior: The csi-rbdplugin pods are missing.
Expected behavior: The CSI plugin pods (csi-rbdplugin) should be running.
How to reproduce it (minimal and precise): One worker node ran out of memory, so some pods were stopped.
File(s) to submit:
- Cluster CR (custom resource), typically called cluster.yaml, if necessary
- Operator's logs, if necessary
- Crashing pod(s) logs, if necessary

To get logs, use `kubectl -n <namespace> logs <pod name>`.
When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI. Read the GitHub documentation if you need help.
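For example, to capture the operator's logs for attachment, using the operator pod name from the listing above (the `rook-ceph` namespace is an assumption; substitute your own):

```shell
# Dump the operator's logs to a file so they can be attached to the issue.
# Pod name taken from the "kubectl get pods" output; yours will differ.
kubectl -n rook-ceph logs rook-ceph-operator-6d8fb9498b-r9czf > operator.log
```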
Environment:
- OS (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
- Kernel (e.g. `uname -a`): 4.19.0-6-amd64
- Cloud provider or hardware configuration:
- Rook version (use `rook version` inside of a Rook Pod): 1.2.1
- Storage backend version (e.g. for ceph do `ceph -v`): 14.2.5
- Kubernetes version (use `kubectl version`): 1.16.4
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
- Storage backend status (e.g. for Ceph use `ceph health` in the Rook Ceph toolbox): HEALTH_WARN 2 slow ops, oldest one blocked for 116618 sec, mon.a has slow ops; too few PGs per OSD (8 < min 30); 1/3 mons down, quorum a,c
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 16 (7 by maintainers)
If you restart the operator pod, the CSI pods should get recreated. Can you paste the output of `kubectl get cm -n <rook-namespace>`?
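The restart suggested above can be done by deleting the operator pod so its Deployment spawns a replacement, which in turn re-deploys the CSI driver. A sketch, assuming the default `rook-ceph` namespace and Rook's standard `app=rook-ceph-operator` label:

```shell
# Delete the operator pod; its Deployment recreates it, and the new
# operator should redeploy the missing csi-rbdplugin pods.
kubectl -n rook-ceph delete pod -l app=rook-ceph-operator

# Watch for the CSI pods to come back.
kubectl -n rook-ceph get pods -w | grep csi
```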