rook: Cluster unavailable after node reboot, symlink already exists

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: I’m using Rook Ceph with specific devices, identified by IDs:

helm_cephrook_nodes_devices:
  - name: "vm-kube-slave-1"
    devices:
      - name: "/dev/disk/by-id/scsi-36000c29d381154d5114acf6c54b09ab5"
      [.......]
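
For reference, a /dev/disk/by-id path is itself just a stable symlink (maintained by udev) to whatever /dev/sdX name the kernel assigned on the current boot. The mapping can be checked with readlink; the device path below is the one from the config above, and the resolved letter may differ after each reboot:

  # Resolve the stable by-id alias to the kernel device name assigned on this boot
  readlink -f /dev/disk/by-id/scsi-36000c29d381154d5114acf6c54b09ab5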

Linux disk letters (sdX) can change across reboots, and that should not break the application. Currently, when the OSD starts, the activate init container detects the correct new disk, but a symlink to the old device is already present:

found device: /dev/sdg
+ DEVICE=/dev/sdg
+ [[ -z /dev/sdg ]]
+ ceph-volume raw activate --device /dev/sdg --no-systemd --no-tmpfs
Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3
Running command: /usr/bin/ceph-bluestore-tool prime-osd-dir --path /var/lib/ceph/osd/ceph-3 --no-mon-config --dev /dev/sdg
Running command: /usr/bin/chown -R ceph:ceph /dev/sdg
Running command: /usr/bin/ln -s /dev/sdg /var/lib/ceph/osd/ceph-3/block
 stderr: ln: failed to create symbolic link '/var/lib/ceph/osd/ceph-3/block': File exists
Traceback (most recent call last):
  File "/usr/sbin/ceph-volume", line 11, in <module>
    load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
    self.main(self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py", line 166, in main
    systemd=not self.args.no_systemd)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py", line 88, in activate
    systemd=systemd)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/activate.py", line 48, in activate_bluestore
    prepare_utils.link_block(meta['device'], osd_id)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 371, in link_block
    _link_device(block_device, 'block', osd_id)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 339, in _link_device
    process.run(command)
  File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 147, in run
    raise RuntimeError(msg)
RuntimeError: command returned non-zero exit status: 1
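
As a manual workaround, sketched here assuming access to the OSD data directory shown in the log (for example from the failing activate container), removing the stale link lets ceph-volume recreate it against the newly detected device:

  # OSD id, device and flags are taken from the activate log above; adapt to your cluster.
  rm /var/lib/ceph/osd/ceph-3/block          # drop the symlink that still points at the old sdX name
  ceph-volume raw activate --device /dev/sdg --no-systemd --no-tmpfs   # re-links the current device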


Expected behavior: Rook Ceph should detect the correct disk when the node reboots; even if the sdX letter changes, the symlink should be recreated.
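
One way the activation step could be made idempotent, sketched here with plain coreutils rather than the actual ceph-volume/Rook change, is to force-replace the link instead of failing when it already exists:

  # ln -sfn replaces an existing symlink instead of erroring with "File exists";
  # device and path are the ones from the log above.
  ln -sfn /dev/sdg /var/lib/ceph/osd/ceph-3/block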

How to reproduce it (minimal and precise):

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary

Logs to submit:

  • Operator’s logs, if necessary

  • Crashing pod(s) logs, if necessary

To get logs, use kubectl -n <namespace> logs <pod name>. When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI. Read the GitHub documentation if you need help.

Cluster Status to submit:

HEALTH_WARN 1 MDSs report slow metadata IOs; Reduced data availability: 21 pgs inactive; 570 slow ops, oldest one blocked for 125431 sec, daemons [osd.1,osd.2,osd.4] have slow ops.
sh-4.4$ ceph status
  cluster:
    id:     ecf8035e-5899-4327-9a70-b86daac1f642
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            Reduced data availability: 21 pgs inactive
            570 slow ops, oldest one blocked for 125447 sec, daemons [osd.1,osd.2,osd.4] have slow ops.
 
  services:
    mon: 1 daemons, quorum a (age 3d)
    mgr: a(active, since 114m)
    mds: 1/1 daemons up, 1 hot standby
    osd: 5 osds: 3 up (since 66m), 3 in (since 8h)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 49 pgs
    objects: 186 objects, 45 MiB
    usage:   347 MiB used, 150 GiB / 150 GiB avail
    pgs:     42.857% pgs unknown
             28 active+clean
             21 unknown

Environment:

  • OS (e.g. from /etc/os-release): NAME="Red Hat Enterprise Linux" VERSION="8.6 (Ootpa)"
  • Kernel (e.g. uname -a): Linux vm-kube-slave-6 4.18.0-372.19.1.el8_6.x86_64 #1 SMP Mon Jul 18 11:14:02 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
  • Cloud provider or hardware configuration:
  • Rook version (use rook version inside of a Rook Pod): 1.9.7
  • Storage backend version (e.g. for ceph do ceph -v): filesystem
  • Kubernetes version (use kubectl version): 1.23
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): RKE
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 18 (14 by maintainers)

Most upvoted comments

@travisn I’m testing #11567, which resolves this issue. There are several remaining tests; I’ll finish them today.

It is taking a long time because I have little spare time and there are many test cases.

Thanks a lot for the fix, I’m just waiting for the next release 😃

I’m still investigating this issue. This problem might be in ceph…