rook: OSD pods go into Init:CrashLoopBackOff status

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: after starting, the OSD pods end up in Init:CrashLoopBackOff status:

rook-ceph-osd-0-77ddf7ff5b-88gpt                               0/1       Init:CrashLoopBackOff   25         2h
rook-ceph-osd-1-f6d4df8bf-2vjvx                                0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-10-6587569446-9c682                              0/1       Init:CrashLoopBackOff   21         2h
rook-ceph-osd-11-7bcc9cc7b9-ff9hv                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-12-6f9f9fd689-8kt7v                              0/1       Init:CrashLoopBackOff   21         2h
rook-ceph-osd-13-57bccd8c96-msjs5                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-14-6989688986-4mmwp                              0/1       Init:CrashLoopBackOff   21         2h
rook-ceph-osd-15-df75b4cff-zmdv6                               0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-16-6ff9f6df47-6s7zt                              0/1       Init:CrashLoopBackOff   24         2h
rook-ceph-osd-17-94cbd756-z9r9c                                0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-18-656dbd5df8-jsmsv                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-19-78dbdf68b5-lvtp5                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-2-685bb8c775-gswcc                               0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-20-d9495d544-4hqn6                               0/1       Init:CrashLoopBackOff   25         2h
rook-ceph-osd-21-6cb6bbfd9-fmtj4                               0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-22-7cfdd8bb5c-7db94                              0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-23-55fd975d9d-stzgm                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-24-576ddd64b4-ctb9c                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-25-c777455bc-9c5gh                               0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-26-5d66d9bdc5-k5mgs                              0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-27-8496c586fb-slhdm                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-28-56d7897c86-ktnvr                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-29-6b999977-74rj4                                0/1       Init:CrashLoopBackOff   25         2h
rook-ceph-osd-3-c4478bf6b-pgncq                                0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-30-7f59dccfbb-zplpt                              0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-31-bf4794779-vpjqs                               0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-32-7c9866d6ff-5j2mm                              0/1       Init:CrashLoopBackOff   28         2h
rook-ceph-osd-33-76df8b568-t2zxv                               0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-34-6498d8c44b-8k7pj                              0/1       Init:Error              27         2h
rook-ceph-osd-35-67fdd5f57-xr6rt                               0/1       Init:Error              28         2h
rook-ceph-osd-36-96b85ffd7-rwjqd                               0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-37-7c4b54bbd6-sq7zt                              0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-38-77855ff7d-5tcmw                               0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-39-7fb44cfdf5-456q9                              0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-4-58fdbd448d-xvj7k                               0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-40-f796fddf6-xdbjj                               0/1       Init:CrashLoopBackOff   27         2h
rook-ceph-osd-41-68bcf56657-h6mqb                              0/1       Init:CrashLoopBackOff   27         1h
rook-ceph-osd-42-55b657ff45-p82r4                              0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-43-589c7f9546-kzhjh                              0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-44-86b8dfc67d-kwwfd                              0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-45-6646957877-lvz5t                              0/1       Init:CrashLoopBackOff   24         1h
rook-ceph-osd-46-7b75b44df7-dmw85                              0/1       Init:Error              27         1h
rook-ceph-osd-47-556d59df86-8mp2j                              0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-48-5f74fb8d68-8mt7h                              0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-49-6f45b9fdcc-2slzq                              0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-5-74455749df-vcrxm                               0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-50-7459ddc7d-529ks                               0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-51-6cdfc64645-ghxb4                              1/1       Running                 0          1h
rook-ceph-osd-52-db98f5745-2kpc2                               0/1       Init:CrashLoopBackOff   26         1h
rook-ceph-osd-53-55cc8b89c6-rrw6f                              0/1       Init:CrashLoopBackOff   25         1h
rook-ceph-osd-54-846fd97fd4-2pcmr                              0/1       Init:CrashLoopBackOff   25         1h
rook-ceph-osd-55-667fc98c9-ptjz2                               1/1       Running                 0          1h
rook-ceph-osd-56-848684c7f9-zddxt                              0/1       Init:CrashLoopBackOff   25         1h
rook-ceph-osd-57-854bdb68fb-ls76w                              1/1       Running                 0          1h
rook-ceph-osd-58-598c5f44c4-g6gq4                              1/1       Running                 0          1h
rook-ceph-osd-59-84fdfcc9f5-lbm4t                              1/1       Running                 0          1h
rook-ceph-osd-6-86556c6b4c-dqk2l                               1/1       Running                 0          1h
rook-ceph-osd-7-754c4cd94-q6s2r                                1/1       Running                 0          1h
rook-ceph-osd-8-c546c786d-wllls                                0/1       Init:CrashLoopBackOff   25         1h
rook-ceph-osd-9-557994fd8f-gsp4p                               1/1       Running                 0          1h

Expected behavior: all OSD pods reach 1/1 Running.

How to reproduce it (minimal and precise):

  • Install Ceph with Rook on OCP v3.11
  • In this configuration there are 60 OSDs

Environment:

  • OS (e.g. from /etc/os-release):
# cat /etc/os-release 
NAME="Red Hat Enterprise Linux Server"
VERSION="7.6 (Maipo)"
ID="rhel"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="7.6"
PRETTY_NAME="Red Hat Enterprise Linux Server 7.6 (Maipo)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:7.6:GA:server"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 7"
REDHAT_BUGZILLA_PRODUCT_VERSION=7.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="7.6"
  • Kernel (e.g. uname -a):
3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration:
  • Rook version (use rook version inside of a Rook Pod):
# rook version
 rook: v0.9.2
  • Kubernetes version (use kubectl version):
# kubectl version
Client Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-02-11T04:22:37Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11+", GitVersion:"v1.11.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2019-02-11T04:22:37Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
OpenShift v3.11
# ceph -s
  cluster:
    id:     a7fba0fe-7fe1-4dcf-9a0b-3136819967ef
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum b,c,a
    mgr: a(active)
    osd: 60 osds: 60 up, 60 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   62 GiB used, 27 TiB / 27 TiB avail
    pgs:     
 

The OSDs in the Ceph cluster are up:

# ceph osd tree 
ID CLASS WEIGHT   TYPE NAME                                                    STATUS REWEIGHT PRI-AFF 
-1       27.25800 root default                                                                         
-5        6.81450     host c10-hxx-node                
 0   hdd  0.45430         osd.0                                                    up  1.00000 1.00000 
 4   hdd  0.45430         osd.4                                                    up  1.00000 1.00000 
 8   hdd  0.45430         osd.8                                                    up  1.00000 1.00000 
12   hdd  0.45430         osd.12                                                   up  1.00000 1.00000 
16   hdd  0.45430         osd.16                                                   up  1.00000 1.00000 
20   hdd  0.45430         osd.20                                                   up  1.00000 1.00000 
25   hdd  0.45430         osd.25                                                   up  1.00000 1.00000 
29   hdd  0.45430         osd.29                                                   up  1.00000 1.00000 
33   hdd  0.45430         osd.33                                                   up  1.00000 1.00000 
37   hdd  0.45430         osd.37                                                   up  1.00000 1.00000 
41   hdd  0.45430         osd.41                                                   up  1.00000 1.00000 
45   hdd  0.45430         osd.45                                                   up  1.00000 1.00000 
48   hdd  0.45430         osd.48                                                   up  1.00000 1.00000 
52   hdd  0.45430         osd.52                                                   up  1.00000 1.00000 
56   hdd  0.45430         osd.56                                                   up  1.00000 1.00000 
-3        6.81450     host c10-hxx-node                       
 1   hdd  0.45430         osd.1                                                    up  1.00000 1.00000 
 5   hdd  0.45430         osd.5                                                    up  1.00000 1.00000 
 9   hdd  0.45430         osd.9                                                    up  1.00000 1.00000 
13   hdd  0.45430         osd.13                                                   up  1.00000 1.00000 
17   hdd  0.45430         osd.17                                                   up  1.00000 1.00000 
21   hdd  0.45430         osd.21                                                   up  1.00000 1.00000 
24   hdd  0.45430         osd.24                                                   up  1.00000 1.00000 
28   hdd  0.45430         osd.28                                                   up  1.00000 1.00000 
32   hdd  0.45430         osd.32                                                   up  1.00000 1.00000 
36   hdd  0.45430         osd.36                                                   up  1.00000 1.00000 
40   hdd  0.45430         osd.40                                                   up  1.00000 1.00000 
44   hdd  0.45430         osd.44                                                   up  1.00000 1.00000 
49   hdd  0.45430         osd.49                                                   up  1.00000 1.00000 
53   hdd  0.45430         osd.53                                                   up  1.00000 1.00000 
57   hdd  0.45430         osd.57                                                   up  1.00000 1.00000 
-7        6.81450     host c10-hxx-node                        
 2   hdd  0.45430         osd.2                                                    up  1.00000 1.00000 
 6   hdd  0.45430         osd.6                                                    up  1.00000 1.00000 
10   hdd  0.45430         osd.10                                                   up  1.00000 1.00000 
14   hdd  0.45430         osd.14                                                   up  1.00000 1.00000 
18   hdd  0.45430         osd.18                                                   up  1.00000 1.00000 
22   hdd  0.45430         osd.22                                                   up  1.00000 1.00000 
26   hdd  0.45430         osd.26                                                   up  1.00000 1.00000 
30   hdd  0.45430         osd.30                                                   up  1.00000 1.00000 
34   hdd  0.45430         osd.34                                                   up  1.00000 1.00000 
38   hdd  0.45430         osd.38                                                   up  1.00000 1.00000 
42   hdd  0.45430         osd.42                                                   up  1.00000 1.00000 
46   hdd  0.45430         osd.46                                                   up  1.00000 1.00000 
50   hdd  0.45430         osd.50                                                   up  1.00000 1.00000 
54   hdd  0.45430         osd.54                                                   up  1.00000 1.00000 
58   hdd  0.45430         osd.58                                                   up  1.00000 1.00000 
-9        6.81450     host c10-hxx-node                         
 3   hdd  0.45430         osd.3                                                    up  1.00000 1.00000 
 7   hdd  0.45430         osd.7                                                    up  1.00000 1.00000 
11   hdd  0.45430         osd.11                                                   up  1.00000 1.00000 
15   hdd  0.45430         osd.15                                                   up  1.00000 1.00000 
19   hdd  0.45430         osd.19                                                   up  1.00000 1.00000 
23   hdd  0.45430         osd.23                                                   up  1.00000 1.00000 
27   hdd  0.45430         osd.27                                                   up  1.00000 1.00000 
31   hdd  0.45430         osd.31                                                   up  1.00000 1.00000 
35   hdd  0.45430         osd.35                                                   up  1.00000 1.00000 
39   hdd  0.45430         osd.39                                                   up  1.00000 1.00000 
43   hdd  0.45430         osd.43                                                   up  1.00000 1.00000 
47   hdd  0.45430         osd.47                                                   up  1.00000 1.00000 
51   hdd  0.45430         osd.51                                                   up  1.00000 1.00000 
55   hdd  0.45430         osd.55                                                   up  1.00000 1.00000 
59   hdd  0.45430         osd.59                                                   up  1.00000 1.00000 

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (6 by maintainers)

Most upvoted comments

Agreed. As a start, at least we could add a check to the copyBinary method so it doesn't fail to copy if the file is already there from a previous run of the init container. However, from the error, I wonder whether we will also get an error when trying to check for the existence of the file. @kshlm Could you investigate this?
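
For reference, a minimal sketch of the kind of check being discussed, using only the Go standard library. The function name, signature, and paths below are hypothetical and are not taken from Rook's actual copyBinary code; they just illustrate skipping the copy when the destination already exists from a previous init-container run, and surfacing any error from the existence check itself.

package main

import (
	"fmt"
	"io"
	"os"
)

// copyBinaryIfMissing is a hypothetical, simplified stand-in for the idea in
// the comment above: if the destination binary is already present (left
// behind by a previous run of the init container), skip the copy instead of
// failing. It is not the actual Rook copyBinary implementation.
func copyBinaryIfMissing(src, dst string) error {
	// If the destination already exists, treat the copy as done.
	if _, err := os.Stat(dst); err == nil {
		return nil
	} else if !os.IsNotExist(err) {
		// As noted in the comment, even the existence check itself might
		// fail; report that error rather than attempting the copy blindly.
		return fmt.Errorf("checking for existing binary %s: %v", dst, err)
	}

	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	// 0755 so the copied binary remains executable.
	out, err := os.OpenFile(dst, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0755)
	if err != nil {
		return err
	}
	defer out.Close()

	_, err = io.Copy(out, in)
	return err
}

func main() {
	// Hypothetical example paths; the real source and destination paths used
	// by the Rook init container are not shown in this issue.
	if err := copyBinaryIfMissing("/usr/local/bin/rook", "/rook/rook"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}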