rook: Existing RBDs fail to be mounted after enabling network compression

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

After enabling network compression in the CephCluster object, Pods with RBD volumes fail to start and are stuck in ContainerCreating. The following event is logged multiple times:

MountVolume.MountDevice failed for volume "pvc-a4d3a97f-64dc-444a-922c-3bd18eb51a33" : rpc error: code = Aborted desc = an operation with the given Volume ID 0001-0009-rook-ceph-0000000000000002-9e75fc1c-e669-11ec-ab60-0a580a82001b already exists

After disabling network compression, the volumes mount again and everything works as expected.

Expected behavior:

Pod starts normally.

How to reproduce it (minimal and precise):

The cluster was updated from Ceph version 16 to Ceph version 17, and network compression was enabled in the CephCluster object.
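
For reference, a minimal sketch of the CephCluster setting involved (field names follow the Rook v1.9 CRD; the metadata name and namespace here are assumptions based on the default Rook deployment):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  network:
    connections:
      # Enabling msgr2 on-wire compression is the change that
      # preceded the RBD mount failures described above.
      compression:
        enabled: true
```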

Environment:

  • OS (e.g. from /etc/os-release): Fedora Linux CoreOS 35
  • Kernel (e.g. uname -a): 5.17.13-200.fc35.x86_64
  • Cloud provider or hardware configuration: 3 masters running as KVMs
  • Rook version (use rook version inside of a Rook Pod): v1.9.4
  • Storage backend version (e.g. for ceph do ceph -v): ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable)
  • Kubernetes version (use kubectl version): v1.23.5-rc.0.2071+3afdacbd018325-dirty
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): OKD
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_OK

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 20 (9 by maintainers)

Most upvoted comments

Most of the common debugging tips are documented here: https://rook.io/docs/rook/latest/Troubleshooting/ceph-csi-common-issues/

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.