rook: Block storage is stale when not unmounted

A rook block storage volume that is not unmounted before the server is deleted becomes stuck and cannot be deleted.

Repro (a condensed command sketch follows the error output below):

1. Create a pod with rook-client. I used https://github.com/rook/rook/blob/master/demo/kubernetes/rook-client/
2. Follow the steps to create a block, mount it, and write some data to it.
3. Delete the pod.
4. Recreate the pod.
5. Run rook block ls and you will see the previous block. However, you cannot do anything with it: trying to mount it hangs, and trying to unmount it gives errors.

# rook block unmount --path /tmp/rook-volume
failed to get device from mount point /tmp/rook-volume: <nil>
# rook block unmount --device rbd0
2017-02-08 00:23:31.707900 I | umount /dev/rbd0: umount: /dev/rbd0: not mounted
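
For reference, a condensed sketch of the repro above. The rook block create/mount invocations, the image name, and the manifest name are assumptions based on the linked demo directory, not commands quoted in this issue:

$ kubectl create -f rook-client.yml            # 1. create the rook-client pod (manifest name assumed)
$ kubectl exec -it rook-client -- /bin/sh      # 2. inside the pod: create a block, mount it, write data
# rook block create --name test-block --size 1024
# rook block mount --name test-block --path /tmp/rook-volume
# dd if=/dev/urandom of=/tmp/rook-volume/data bs=1M count=10
$ kubectl delete pod rook-client               # 3. delete the pod without unmounting
$ kubectl create -f rook-client.yml            # 4. recreate it
# rook block ls                                # 5. the old block is listed, but mount hangs and unmount errors out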

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 26 (15 by maintainers)

Most upvoted comments

Thank you, will keep an eye on that ticket

No. remove_single_major is a stable and documented interface, not a backdoor 😉 If there is no in-flight I/O, you can do:

$ sudo rbd unmap -o force /dev/rbd0
# wait for it to time out or send SIGINT
$ sudo umount /mnt
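
If the unmap goes through, the mapping should disappear from the kernel's list; a quick way to confirm:

$ sudo rbd showmapped    # the /dev/rbd0 entry should no longer be listed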

If there is in-flight I/O, we expect the cluster to come back. Currently there is nothing to tell krbd to throw out dirty pages – that’s what rbd unmap -o full-force is going to do.

@bassam rbd unmap -o force is specified as "The driver will wait for running requests to complete and then unmap; requests sent to the driver after initiating the unmap will be failed." The plan is to add rbd unmap -o full-force, which will abort in-flight requests, requests waiting for the exclusive lock, etc. That should make it possible to get rid of the rbd device in any circumstances.

If the rbd device is not mounted, rbd unmap will time out after mount_timeout (default 60) seconds on kernels with https://patchwork.kernel.org/patch/6403791/ applied. On kernels that don't have that patch, you should be able to just SIGINT or SIGKILL rbd unmap after giving it some time.
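
A minimal sketch of that advice, assuming GNU coreutils timeout is available on the node; it simply bounds the unmap attempt and delivers SIGINT instead of letting it hang indefinitely:

$ sudo timeout -s INT 60 rbd unmap /dev/rbd0    # give it 60 seconds, then send SIGINT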

If the rbd device is mounted, there is no good way to get rid of it. Ceph clusters typically don't come and go away completely, so this is just not something we've designed for. If there is no in-flight I/O (likely stuck if the OSDs are gone), rbd unmap -o force should get you past -EBUSY to trying to unwatch, which again will time out after mount_timeout seconds. -o force is a fairly recent addition though (4.9 kernel).

There is no -o full-force or anything like that for the case where there is stuck I/O (it's on the TODO list). The osd_request_timeout map option is there, but it is undocumented because it times out any OSD request, which is dangerous if the cluster is severely degraded or just very slow.
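
For illustration only, a hedged sketch of passing that map option; the pool/image names are placeholders, and per the warning above it times out every OSD request on the mapping, so it is not a routine setting:

$ sudo rbd map -o osd_request_timeout=30 rook/test-block    # placeholder pool/image, 30-second timeout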

My advice would be to do the proper umount/unmap cleanup. There is no reason at all to play with potentially dangerous timeout settings or force options if you can reliably detect the “cluster is about to go” event. There is nothing sane the rbd driver can do about the mounted filesystem, so even if there was a good way to get rid of the device itself, you’d still need to unmount the filesystem afterwards to free up the data structures and get your pod into a normal state.
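
For completeness, the ordinary cleanup path the comment recommends, using the mount path and device name from the report above; running it (or the rook block unmount equivalent) before deleting the pod or tearing down the cluster avoids the stuck state entirely:

$ sudo umount /tmp/rook-volume    # release the filesystem first
$ sudo rbd unmap /dev/rbd0        # then unmap the device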