rook: OSD pods fail randomly when creating 2 rook-ceph clusters on the same set of storage nodes
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: One or two OSD pods in the second cluster fail randomly.
Expected behavior: All OSD pods should be in "Running" state consistently.
How to reproduce it (minimal and precise):
1). Failure case: two clusters on the same set of storage nodes, one cluster with 1 device "vdd" and the other cluster with 1 device "vde". Each storage node has both vdd and vde attached. (A sketch of the two cluster storage specs is shown after the note below.)
Note: some successful cases for reference. Case 1: one rook-ceph cluster on the same set of storage nodes using both "vdd" and "vde". Case 2: two rook-ceph clusters on two different sets of storage nodes, each with one device "vdd". Case 3: two rook-ceph clusters on the same set of storage nodes, each with one device "vdd" and one directory (/data/rook).
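For clarity, here is a minimal sketch of what the two CephCluster specs in the failure case could look like. The metadata names, dataDirHostPath values, and the use of deviceFilter are assumptions for illustration; only the namespaces and the vdd/vde split across the same nodes are taken from the report above.

```yaml
# Hypothetical excerpts of the two CephCluster CRs in the failure case:
# both clusters use the same storage nodes, but each consumes a different device.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook        # assumed; must differ between clusters
  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^vdd$"               # first cluster takes vdd on every node
---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph2
  namespace: rook-ceph2
spec:
  dataDirHostPath: /var/lib/rook2       # assumed; must differ between clusters
  storage:
    useAllNodes: true
    useAllDevices: false
    deviceFilter: "^vde$"               # second cluster takes vde on every node
```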
2). The 2nd cluster will randomly have one or two OSD pods that fail to come up.
=======kubectl -n rook-ceph-system get pods -o wide ====
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE              NOMINATED NODE
rook-ceph-agent-6bh2q                 1/1     Running   0          29m   192.188.88.14   rook-control-01   <none>
rook-ceph-agent-7qbfz                 1/1     Running   0          29m   192.188.88.12   rook-worker-03    <none>
rook-ceph-agent-9dtj9                 1/1     Running   0          29m   192.188.88.8    rook-worker-02    <none>
rook-ceph-agent-q4mp8                 1/1     Running   0          29m   192.188.88.13   rook-control-03   <none>
rook-ceph-agent-rwgw6                 1/1     Running   0          29m   192.188.88.10   rook-control-02   <none>
rook-ceph-agent-t5tv9                 1/1     Running   0          29m   192.188.88.17   rook-worker-01    <none>
rook-ceph-operator-7545bffc9b-z66zd   1/1     Running   0          29m   192.168.1.188   rook-control-01   <none>
rook-discover-7229c                   1/1     Running   0          29m   192.168.1.198   rook-worker-03    <none>
rook-discover-gxvzl                   1/1     Running   0          29m   192.168.1.88    rook-control-03   <none>
rook-discover-l2bm9                   1/1     Running   0          29m   192.168.1.173   rook-worker-01    <none>
rook-discover-p82ft                   1/1     Running   0          29m   192.168.1.16    rook-control-02   <none>
rook-discover-qkfpr                   1/1     Running   0          29m   192.168.1.136   rook-control-01   <none>
rook-discover-vcr2m                   1/1     Running   0          29m   192.168.1.254   rook-worker-02    <none>

=======kubectl -n rook-ceph get pods -o wide ====
NAME                                             READY   STATUS      RESTARTS   AGE   IP              NODE              NOMINATED NODE
ceph-oam-6dfd5ffccc-bl9cs                        2/2     Running     0          28m   192.188.88.8    rook-worker-02    <none>
rook-ceph-mgr-a-f445598f9-hnhw4                  1/1     Running     0          28m   192.168.1.145   rook-control-01   <none>
rook-ceph-mon-a-5cb7f699c7-6lztx                 1/1     Running     0          28m   192.168.1.134   rook-control-01   <none>
rook-ceph-mon-b-56bd6c759c-xxx7q                 1/1     Running     0          28m   192.168.1.34    rook-control-02   <none>
rook-ceph-mon-c-6c6b779f86-bnh4r                 1/1     Running     0          28m   192.168.1.91    rook-control-03   <none>
rook-ceph-osd-0-5b6dd9f967-knvnz                 1/1     Running     0          27m   192.168.1.176   rook-control-01   <none>
rook-ceph-osd-1-7dd9f79bb7-4q87x                 1/1     Running     0          27m   192.168.1.13    rook-control-02   <none>
rook-ceph-osd-2-967dbf495-knllx                  1/1     Running     0          27m   192.168.1.98    rook-control-03   <none>
rook-ceph-osd-prepare-rook-control-01-jv65h      0/2     Completed   1          27m   192.168.1.148   rook-control-01   <none>
rook-ceph-osd-prepare-rook-control-02-fpxkm      0/2     Completed   1          27m   192.168.1.24    rook-control-02   <none>
rook-ceph-osd-prepare-rook-control-03-xwjvr      0/2     Completed   1          27m   192.168.1.95    rook-control-03   <none>
rook-ceph-rgw-rook-ceph-store-5f684dc766-4k7lt   1/1     Running     0          26m   192.168.1.196   rook-worker-02    <none>
rook-ceph-tools-6b8b666d47-q5lj2                 1/1     Running     0          28m   192.188.88.12   rook-worker-03    <none>

=======kubectl -n rook-ceph2 get pods -o wide ====
NAME                                          READY   STATUS             RESTARTS   AGE   IP              NODE              NOMINATED NODE
ceph-oam-6dfd5ffccc-67z74                     2/2     Running            0          28m   192.188.88.17   rook-worker-01    <none>
rook-ceph-mgr-a-6b8b9596db-4ms8x              1/1     Running            0          26m   192.168.1.9     rook-control-02   <none>
rook-ceph-mon-a-6669d55bc-v4h6d               1/1     Running            0          26m   192.168.1.132   rook-control-01   <none>
rook-ceph-mon-b-84fcc48df4-hkfvd              1/1     Running            0          26m   192.168.1.35    rook-control-02   <none>
rook-ceph-mon-c-5fbfd69c76-v68bv              1/1     Running            0          26m   192.168.1.97    rook-control-03   <none>
rook-ceph-osd-0-78f9487846-d7qf8              1/1     Running            0          25m   192.168.1.128   rook-control-01   <none>
rook-ceph-osd-1-5d7976f4d6-lzvw4              0/1     CrashLoopBackOff   9          25m   192.168.1.2     rook-control-02   <none>
rook-ceph-osd-2-5646cd948d-czslc              1/1     Running            0          25m   192.168.1.99    rook-control-03   <none>
rook-ceph-osd-prepare-rook-control-01-9g2d2   0/2     Completed          0          25m   192.168.1.155   rook-control-01   <none>
rook-ceph-osd-prepare-rook-control-02-zgfkh   0/2     Completed          0          25m   192.168.1.14    rook-control-02   <none>
rook-ceph-osd-prepare-rook-control-03-k4j5q   0/2     Completed          0          25m   192.168.1.100   rook-control-03   <none>
rook-ceph-tools-6b8b666d47-qthc2              1/1     Running            0          28m   192.188.88.12   rook-worker-03    <none>
3). The rook-ceph2 cluster has one OSD pod failure, while the first cluster (rook-ceph) is healthy with all 3 OSD pods up. The ceph status below (2 of 3 OSDs up) is from the rook-ceph2 cluster:
ceph -s
  cluster:
    id:     f6f4f98b-8ca2-4817-99d3-be22b0cded16
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum b,c,a
    mgr: a(active)
    osd: 3 osds: 2 up, 2 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   2.0 GiB used, 8.0 GiB / 10 GiB avail
    pgs:
ceph osd status
+----+-----------------+-------+-------+--------+---------+--------+---------+------------+
| id | host            | used  | avail | wr ops | wr data | rd ops | rd data | state      |
+----+-----------------+-------+-------+--------+---------+--------+---------+------------+
| 0  | rook-control-01 | 1026M | 4090M | 0      | 0       | 0      | 0       | exists,up  |
| 1  |                 | 0     | 0     | 0      | 0       | 0      | 0       | exists,new |
| 2  | rook-control-03 | 1026M | 4090M | 0      | 0       | 0      | 0       | exists,up  |
+----+-----------------+-------+-------+--------+---------+--------+---------+------------+
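osd.1 has no host and sits in the "exists,new" state, i.e. it never registered with this cluster's monitors. A possible cross-check (suggested here, not part of the original report; the tools pod names are taken from the listings above) is to compare each cluster's fsid against what the osd-prepare job logged on the affected node:

```sh
# Print each cluster's fsid from its own toolbox pod.
kubectl -n rook-ceph exec -it rook-ceph-tools-6b8b666d47-q5lj2 -- ceph fsid
kubectl -n rook-ceph2 exec -it rook-ceph-tools-6b8b666d47-qthc2 -- ceph fsid

# Inspect the osd-prepare log for rook-control-02 in the second cluster, which
# records the provisioning it performed on that node (exact contents depend on
# the Rook version).
kubectl -n rook-ceph2 logs rook-ceph-osd-prepare-rook-control-02-zgfkh --all-containers=true
```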
4). Log of the failing OSD pod
kubectl -n rook-ceph2 log rook-ceph-osd-1-5d7976f4d6-lzvw4
2019-02-21 15:51:16.059213 I | rookcmd: starting Rook v0.9.2 with arguments '/rook/rook ceph osd start -- --foreground --id 1 --osd-uuid 33615b87-041b-46ef-b0ad-2b8965916930 --conf /var/lib/rook/osd1/rook-ceph2.config --cluster ceph'
2019-02-21 15:51:16.059453 I | rookcmd: flag values: --help=false, --log-level=INFO, --osd-id=1, --osd-store-type=bluestore, --osd-uuid=33615b87-041b-46ef-b0ad-2b8965916930
2019-02-21 15:51:16.059464 I | cephmon: parsing mon endpoints:
2019-02-21 15:51:16.059469 W | cephmon: ignoring invalid monitor
2019-02-21 15:51:16.059835 I | exec: Running command: ceph-volume lvm activate --no-systemd --bluestore 1 33615b87-041b-46ef-b0ad-2b8965916930
2019-02-21 15:51:19.683866 I | Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-1
2019-02-21 15:51:19.683909 I | Running command: /usr/sbin/restorecon /var/lib/ceph/osd/ceph-1
2019-02-21 15:51:19.683914 I | Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-a99d9aa9-e356-4fd3-96c8-aade831755f0/osd-data-b5ae0fcd-e483-4169-aa77-a51df8579ad3 --path /var/lib/ceph/osd/ceph-1 --no-mon-config
2019-02-21 15:51:19.683920 I | Running command: /bin/ln -snf /dev/ceph-a99d9aa9-e356-4fd3-96c8-aade831755f0/osd-data-b5ae0fcd-e483-4169-aa77-a51df8579ad3 /var/lib/ceph/osd/ceph-1/block
2019-02-21 15:51:19.683923 I | Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block
2019-02-21 15:51:19.683926 I | Running command: /bin/chown -R ceph:ceph /dev/mapper/ceph--a99d9aa9--e356--4fd3--96c8--aade831755f0-osd--data--b5ae0fcd--e483--4169--aa77--a51df8579ad3
2019-02-21 15:51:19.683929 I | Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-1
2019-02-21 15:51:19.683932 I | --> ceph-volume lvm activate successful for osd ID: 1
2019-02-21 15:51:19.694440 I | exec: Running command: ceph-osd --foreground --id 1 --osd-uuid 33615b87-041b-46ef-b0ad-2b8965916930 --conf /var/lib/rook/osd1/rook-ceph2.config --cluster ceph
2019-02-21 15:51:19.805074 I | failed to fetch mon config (--no-mon-config to skip)
failed to start osd. Failed to complete '': exit status 1.
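The "parsing mon endpoints:" and "failed to fetch mon config" lines suggest the OSD was started with an empty monitor list. One way to confirm this (a suggested check, not from the original report) is to inspect the config file referenced in the log; inside the container it is /var/lib/rook/osd1/rook-ceph2.config, which should map to the same relative path under the second cluster's dataDirHostPath on the node:

```sh
# On rook-control-02: <dataDirHostPath> is whatever the rook-ceph2 CephCluster
# spec sets (assumed here; each cluster must use a distinct host path).
cat <dataDirHostPath>/osd1/rook-ceph2.config

# Check whether the mon-related settings (mon host / mon initial members) are
# empty, which would explain the "failed to fetch mon config" error.
grep -i mon <dataDirHostPath>/osd1/rook-ceph2.config
```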
5). Full logs are attached: rook-logs02-21.gz
Environment:
- OS (e.g. from /etc/os-release): rhel 7.5
- Kernel (e.g. `uname -a`): Linux rook-control-01 3.10.0-862.14.4.el7.x86_64 #1 SMP Fri Sep 21 09:07:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
- Cloud provider or hardware configuration: openstack
- Rook version (use `rook version` inside of a Rook Pod): v0.9.2
- Kubernetes version (use `kubectl version`): v1.12.3
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Tectonic
- Storage backend status (e.g. for Ceph use `ceph health` in the [Rook Ceph toolbox](https://rook.io/docs/Rook/master/toolbox.html)):
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 22 (9 by maintainers)
#3266 is merged, so this will be included with 1.0.2…
Yes, this is being backported and will be out in 1.0.2 hopefully in a couple days. I’ll reopen to track the backport.
Ok, I see the bug in the osd creation. The cluster fsid is returned by `ceph-volume list`, so rook needs to match it to the current cluster fsid instead of assuming it already belongs to the cluster. The `getCephVolumeOSDs` method should only return the OSDs that correspond to the local cluster ID. Until this issue is fixed, the only workaround for running multiple ceph clusters is to have the different clusters consume devices on different nodes.
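Below is a rough Go sketch of the filtering this comment describes; the `cvOSD` struct and helper name are hypothetical and not Rook's actual code. The point is that `getCephVolumeOSDs` should drop any OSD whose reported cluster fsid differs from the fsid of the cluster being orchestrated.

```go
package osd

// cvOSD is a hypothetical, simplified view of one OSD reported by ceph-volume
// on a node: its ID, its OSD fsid, and the fsid of the Ceph cluster that
// originally provisioned it.
type cvOSD struct {
	ID          int
	UUID        string
	ClusterFSID string
}

// filterOSDsByClusterFSID keeps only the OSDs that belong to the cluster
// currently being configured. Without this check, an OSD provisioned by a
// different Rook cluster on the same node can be picked up and started with
// the wrong mon configuration, matching the CrashLoopBackOff seen above.
func filterOSDsByClusterFSID(osds []cvOSD, clusterFSID string) []cvOSD {
	matched := make([]cvOSD, 0, len(osds))
	for _, osd := range osds {
		if osd.ClusterFSID == clusterFSID {
			matched = append(matched, osd)
		}
	}
	return matched
}
```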
When will this fix be available? Will this be part of 1.0.x?