rook: OSD pods go into Init:CrashLoopBackOff status
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior:
After starting, the OSD pods end up in Init:CrashLoopBackOff status:
rook-ceph-osd-0-77ddf7ff5b-88gpt 0/1 Init:CrashLoopBackOff 25 2h
rook-ceph-osd-1-f6d4df8bf-2vjvx 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-10-6587569446-9c682 0/1 Init:CrashLoopBackOff 21 2h
rook-ceph-osd-11-7bcc9cc7b9-ff9hv 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-12-6f9f9fd689-8kt7v 0/1 Init:CrashLoopBackOff 21 2h
rook-ceph-osd-13-57bccd8c96-msjs5 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-14-6989688986-4mmwp 0/1 Init:CrashLoopBackOff 21 2h
rook-ceph-osd-15-df75b4cff-zmdv6 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-16-6ff9f6df47-6s7zt 0/1 Init:CrashLoopBackOff 24 2h
rook-ceph-osd-17-94cbd756-z9r9c 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-18-656dbd5df8-jsmsv 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-19-78dbdf68b5-lvtp5 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-2-685bb8c775-gswcc 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-20-d9495d544-4hqn6 0/1 Init:CrashLoopBackOff 25 2h
rook-ceph-osd-21-6cb6bbfd9-fmtj4 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-22-7cfdd8bb5c-7db94 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-23-55fd975d9d-stzgm 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-24-576ddd64b4-ctb9c 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-25-c777455bc-9c5gh 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-26-5d66d9bdc5-k5mgs 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-27-8496c586fb-slhdm 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-28-56d7897c86-ktnvr 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-29-6b999977-74rj4 0/1 Init:CrashLoopBackOff 25 2h
rook-ceph-osd-3-c4478bf6b-pgncq 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-30-7f59dccfbb-zplpt 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-31-bf4794779-vpjqs 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-32-7c9866d6ff-5j2mm 0/1 Init:CrashLoopBackOff 28 2h
rook-ceph-osd-33-76df8b568-t2zxv 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-34-6498d8c44b-8k7pj 0/1 Init:Error 27 2h
rook-ceph-osd-35-67fdd5f57-xr6rt 0/1 Init:Error 28 2h
rook-ceph-osd-36-96b85ffd7-rwjqd 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-37-7c4b54bbd6-sq7zt 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-38-77855ff7d-5tcmw 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-39-7fb44cfdf5-456q9 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-4-58fdbd448d-xvj7k 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-40-f796fddf6-xdbjj 0/1 Init:CrashLoopBackOff 27 2h
rook-ceph-osd-41-68bcf56657-h6mqb 0/1 Init:CrashLoopBackOff 27 1h
rook-ceph-osd-42-55b657ff45-p82r4 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-43-589c7f9546-kzhjh 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-44-86b8dfc67d-kwwfd 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-45-6646957877-lvz5t 0/1 Init:CrashLoopBackOff 24 1h
rook-ceph-osd-46-7b75b44df7-dmw85 0/1 Init:Error 27 1h
rook-ceph-osd-47-556d59df86-8mp2j 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-48-5f74fb8d68-8mt7h 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-49-6f45b9fdcc-2slzq 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-5-74455749df-vcrxm 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-50-7459ddc7d-529ks 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-51-6cdfc64645-ghxb4 1/1 Running 0 1h
rook-ceph-osd-52-db98f5745-2kpc2 0/1 Init:CrashLoopBackOff 26 1h
rook-ceph-osd-53-55cc8b89c6-rrw6f 0/1 Init:CrashLoopBackOff 25 1h
rook-ceph-osd-54-846fd97fd4-2pcmr 0/1 Init:CrashLoopBackOff 25 1h
rook-ceph-osd-55-667fc98c9-ptjz2 1/1 Running 0 1h
rook-ceph-osd-56-848684c7f9-zddxt 0/1 Init:CrashLoopBackOff 25 1h
rook-ceph-osd-57-854bdb68fb-ls76w 1/1 Running 0 1h
rook-ceph-osd-58-598c5f44c4-g6gq4 1/1 Running 0 1h
rook-ceph-osd-59-84fdfcc9f5-lbm4t 1/1 Running 0 1h
rook-ceph-osd-6-86556c6b4c-dqk2l 1/1 Running 0 1h
rook-ceph-osd-7-754c4cd94-q6s2r 1/1 Running 0 1h
rook-ceph-osd-8-c546c786d-wllls 0/1 Init:CrashLoopBackOff 25 1h
rook-ceph-osd-9-557994fd8f-gsp4p 1/1 Running 0 1h
Expected behavior:
All OSD pods should start and remain in the Running state.
How to reproduce it (minimal and precise):
- Install Ceph with Rook on OCP v3.11
- In this configuration there are 60 OSDs
Environment:
- OS (e.g. from /etc/os-release):
# cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.6 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.6:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.6"
- Kernel (e.g. uname -a):
3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- Cloud provider or hardware configuration:
- Rook version (use rook version inside of a Rook Pod):
# rook version
rook: v0.9.2
- Kubernetes version (use kubectl version):
# kubectl version
Client Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-02-11T04:22:37Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-02-11T04:22:37Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
OpenShift v3.11
- Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
# ceph -s
cluster:
  id:     a7fba0fe-7fe1-4dcf-9a0b-3136819967ef
  health: HEALTH_OK

services:
  mon: 3 daemons, quorum b,c,a
  mgr: a(active)
  osd: 60 osds: 60 up, 60 in

data:
  pools:   0 pools, 0 pgs
  objects: 0 objects, 0 B
  usage:   62 GiB used, 27 TiB / 27 TiB avail
  pgs:
The OSDs in the Ceph cluster are all up:
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 27.25800 root default
-5 6.81450 host c10-hxx-node
0 hdd 0.45430 osd.0 up 1.00000 1.00000
4 hdd 0.45430 osd.4 up 1.00000 1.00000
8 hdd 0.45430 osd.8 up 1.00000 1.00000
12 hdd 0.45430 osd.12 up 1.00000 1.00000
16 hdd 0.45430 osd.16 up 1.00000 1.00000
20 hdd 0.45430 osd.20 up 1.00000 1.00000
25 hdd 0.45430 osd.25 up 1.00000 1.00000
29 hdd 0.45430 osd.29 up 1.00000 1.00000
33 hdd 0.45430 osd.33 up 1.00000 1.00000
37 hdd 0.45430 osd.37 up 1.00000 1.00000
41 hdd 0.45430 osd.41 up 1.00000 1.00000
45 hdd 0.45430 osd.45 up 1.00000 1.00000
48 hdd 0.45430 osd.48 up 1.00000 1.00000
52 hdd 0.45430 osd.52 up 1.00000 1.00000
56 hdd 0.45430 osd.56 up 1.00000 1.00000
-3 6.81450 host c10-hxx-node
1 hdd 0.45430 osd.1 up 1.00000 1.00000
5 hdd 0.45430 osd.5 up 1.00000 1.00000
9 hdd 0.45430 osd.9 up 1.00000 1.00000
13 hdd 0.45430 osd.13 up 1.00000 1.00000
17 hdd 0.45430 osd.17 up 1.00000 1.00000
21 hdd 0.45430 osd.21 up 1.00000 1.00000
24 hdd 0.45430 osd.24 up 1.00000 1.00000
28 hdd 0.45430 osd.28 up 1.00000 1.00000
32 hdd 0.45430 osd.32 up 1.00000 1.00000
36 hdd 0.45430 osd.36 up 1.00000 1.00000
40 hdd 0.45430 osd.40 up 1.00000 1.00000
44 hdd 0.45430 osd.44 up 1.00000 1.00000
49 hdd 0.45430 osd.49 up 1.00000 1.00000
53 hdd 0.45430 osd.53 up 1.00000 1.00000
57 hdd 0.45430 osd.57 up 1.00000 1.00000
-7 6.81450 host c10-hxx-node
2 hdd 0.45430 osd.2 up 1.00000 1.00000
6 hdd 0.45430 osd.6 up 1.00000 1.00000
10 hdd 0.45430 osd.10 up 1.00000 1.00000
14 hdd 0.45430 osd.14 up 1.00000 1.00000
18 hdd 0.45430 osd.18 up 1.00000 1.00000
22 hdd 0.45430 osd.22 up 1.00000 1.00000
26 hdd 0.45430 osd.26 up 1.00000 1.00000
30 hdd 0.45430 osd.30 up 1.00000 1.00000
34 hdd 0.45430 osd.34 up 1.00000 1.00000
38 hdd 0.45430 osd.38 up 1.00000 1.00000
42 hdd 0.45430 osd.42 up 1.00000 1.00000
46 hdd 0.45430 osd.46 up 1.00000 1.00000
50 hdd 0.45430 osd.50 up 1.00000 1.00000
54 hdd 0.45430 osd.54 up 1.00000 1.00000
58 hdd 0.45430 osd.58 up 1.00000 1.00000
-9 6.81450 host c10-hxx-node
3 hdd 0.45430 osd.3 up 1.00000 1.00000
7 hdd 0.45430 osd.7 up 1.00000 1.00000
11 hdd 0.45430 osd.11 up 1.00000 1.00000
15 hdd 0.45430 osd.15 up 1.00000 1.00000
19 hdd 0.45430 osd.19 up 1.00000 1.00000
23 hdd 0.45430 osd.23 up 1.00000 1.00000
27 hdd 0.45430 osd.27 up 1.00000 1.00000
31 hdd 0.45430 osd.31 up 1.00000 1.00000
35 hdd 0.45430 osd.35 up 1.00000 1.00000
39 hdd 0.45430 osd.39 up 1.00000 1.00000
43 hdd 0.45430 osd.43 up 1.00000 1.00000
47 hdd 0.45430 osd.47 up 1.00000 1.00000
51 hdd 0.45430 osd.51 up 1.00000 1.00000
55 hdd 0.45430 osd.55 up 1.00000 1.00000
59 hdd 0.45430 osd.59 up 1.00000 1.00000
About this issue
- State: closed
- Created 5 years ago
- Comments: 17 (6 by maintainers)
Commits related to this issue
- ceph: Check before copying binaries in osd pods The OSD init container now checks destination path, and only performs the copying if the destination is not present. The OSD pods copy the rook and ti... — committed to kshlm/rook by kshlm 5 years ago
- ceph: Check before copying binaries in osd pods Before performing the copy, the OSD init container now checks destination path exists, and unlinks it if it does. The OSD pods copy the rook and tini ... — committed to kshlm/rook by kshlm 5 years ago
- ceph: Check before copying binaries in osd pods Before performing the copy, the OSD init container now checks destination path exists, and unlinks it if it does. The OSD pods copy the rook and tini ... — committed to kshlm/rook by kshlm 5 years ago
- ceph: Check before copying binaries in osd pods Before performing the copy, the OSD init container now checks destination path exists. If the destination exists, it skips the copy. The OSD pods copy... — committed to kshlm/rook by kshlm 5 years ago
- ceph: Check before copying binaries in osd pods Before performing the copy, the OSD init container now checks destination path exists. If the destination exists, it skips the copy. The OSD pods copy... — committed to kshlm/rook by kshlm 5 years ago
Agreed, as a start at least we could add a check to the copyBinary method so it doesn’t fail to copy if the file is already there from a previous run of the init container. However, from the error, I wonder if we will also get an error when trying to check for the existence of the file. @kshlm Could you investigate this?
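For illustration, below is a minimal Go sketch of the check-before-copy idea discussed above. The copyBinary function and the paths used here are simplified stand-ins, not Rook's actual implementation: it checks whether the destination already exists (for example, left over from a previous run of the init container), skips the copy in that case, and surfaces any error from the existence check itself instead of ignoring it. An alternative, as one iteration of the linked commits describes, would be to unlink the existing destination before copying.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyBinary is a simplified stand-in for the helper discussed above:
// it copies src into destDir, but first checks whether the destination
// already exists (e.g. from a previous run of the init container) and
// skips the copy in that case instead of failing.
func copyBinary(src, destDir string) error {
	dest := filepath.Join(destDir, filepath.Base(src))

	// If the destination is already present from an earlier attempt,
	// skip the copy rather than erroring out.
	if _, err := os.Stat(dest); err == nil {
		fmt.Printf("%s already exists, skipping copy\n", dest)
		return nil
	} else if !os.IsNotExist(err) {
		// The existence check itself failed for some other reason;
		// surface that error instead of hiding it.
		return fmt.Errorf("failed to check %s: %v", dest, err)
	}

	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.OpenFile(dest, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o755)
	if err != nil {
		return err
	}
	defer out.Close()

	_, err = io.Copy(out, in)
	return err
}

func main() {
	// Example call; the paths are illustrative, not taken from the
	// actual init container spec.
	if err := copyBinary("/usr/local/bin/rook", "/rook"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```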