rook: PVC mounts fail in v1.10.9 clusters when encryption is not enabled and the node kernel is older than 5.11

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

Expected behavior: I was previously able to mount PVCs backed by this new EC pool, but now mounting seems to fail for all of the PVCs I have.

How to reproduce it (minimal and precise):

Create the pool and storage class from the following examples, then create a PVC and a pod that mounts it (see the sketch below):

  • https://github.com/rook/rook/blob/master/deploy/examples/pool-ec.yaml
  • https://github.com/rook/rook/blob/master/deploy/examples/csi/rbd/storageclass-ec.yaml
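For completeness, a minimal sketch of those steps, assuming a local checkout of the rook repository (the paths match the examples linked above):

```
# apply the two example manifests from the repo root
kubectl create -f deploy/examples/pool-ec.yaml
kubectl create -f deploy/examples/csi/rbd/storageclass-ec.yaml

# then create a PVC that uses the storage class defined in storageclass-ec.yaml
# and a pod that mounts it; the failure shows up when the pod tries to mount
```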

Logs to submit:

Pod logs:

```
MountVolume.MountDevice failed for volume "pvc-9fda23c7-6ed9-4287-b04c-2cee7cd20a34" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.43.112.135:6789,10.43.80.29:6789,10.43.48.191:6789,10.43.136.249:6789,10.43.74.119:6789 --keyfile=***stripped*** map replicapool/csi-vol-c84112c7-943a-11ed-8dae-6edda849eb24 --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument
```
Operator’s logs:

```
2023-01-15 12:29:56.339869 E | ceph-block-pool-controller: failed to reconcile CephBlockPool "rook-ceph/ec-data-pool". failed to create pool "ec-data-pool".: failed to create pool "ec-data-pool".: failed to create pool "ec-data-pool": failed to create erasure code profile for pool "ec-data-pool": failed to look up default erasure code profile: failed to get erasure-code-profile for "default": exit status 1
2023-01-15 12:29:57.318449 E | ceph-block-pool-controller: failed to reconcile CephBlockPool "rook-ceph/replicated-metadata-pool". failed to create pool "replicated-metadata-pool".: failed to create pool "replicated-metadata-pool".: failed to create pool "replicated-metadata-pool": failed to create replicated crush rule "replicated-metadata-pool": failed to create crush rule replicated-metadata-pool: exit status 1
2023-01-15 12:29:58.530240 E | ceph-block-pool-controller: failed to reconcile CephBlockPool "rook-ceph/ec-data-pool". failed to create pool "ec-data-pool".: failed to create pool "ec-data-pool".: failed to create pool "ec-data-pool": failed to create erasure code profile for pool "ec-data-pool": failed to look up default erasure code profile: failed to get erasure-code-profile for "default": exit status 1
```


**Cluster Status to submit**:
```
  cluster:
    id:     cb794671-ea06-4e89-8661-ce00ba0134d5
    health: HEALTH_WARN
            clock skew detected on mon.bl
            mon bc is low on available space
            40 daemons have recently crashed
            2 mgr modules have recently crashed
 
  services:
    mon: 5 daemons, quorum ay,bc,bf,bg,bl (age 2h)
    mgr: b(active, since 50m), standbys: a
    mds: 2/2 daemons up, 2 hot standby
    osd: 6 osds: 6 up (since 35m), 6 in (since 17h); 13 remapped pgs
 
  data:
    volumes: 2/2 healthy
    pools:   11 pools, 88 pgs
    objects: 3.87M objects, 984 GiB
    usage:   2.8 TiB used, 2.5 TiB / 5.3 TiB avail
    pgs:     603009/11623113 objects misplaced (5.188%)
             75 active+clean
             11 active+remapped+backfilling
             2  active+clean+remapped
 
  io:
    client:   1.6 MiB/s rd, 2.7 MiB/s wr, 620 op/s rd, 172 op/s wr
    recovery: 848 KiB/s, 14 objects/s
 
  progress:
    Global Recovery Event (45m)
      [========================....] (remaining: 6m)
```

**Environment**:
* OS (e.g. from /etc/os-release): Ubuntu 21
* Kernel (e.g. `uname -a`): 20.04.1-Ubuntu
* Cloud provider or hardware configuration:
* Rook version (use `rook version` inside of a Rook Pod): rook: v1.10.8
* Storage backend version (e.g. for ceph do `ceph -v`): ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable)
* Kubernetes version (use `kubectl version`): 1.23
* Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): RKE 1
* Storage backend status (e.g. for Ceph use `ceph health` in the [Rook Ceph toolbox](https://rook.io/docs/rook/latest-release/Troubleshooting/ceph-toolbox/#interactive-toolbox)): HEALTH_WARN clock skew detected on mon.bl; mon bc is low on available space; 40 daemons have recently crashed; 2 mgr modules have recently crashed

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 1
  • Comments: 18 (5 by maintainers)

Most upvoted comments

@cayla, the above default rbd map option ms_mode=prefer-crc will be set only if you have encryption enabled, not for normal clusters.

@Madhu-1, I think the code at line 605 below is not checking the .Enabled setting, but only the existence of the c.Spec.Network.Connections.Encryption setting.

@travisn, is the intention to enable this by default and have users remove the c.Spec.Network.Connections.Encryption setting if it is not required?

I think earlier kernel versions do not support this option, hence the libceph: bad option at 'ms_mode=prefer-crc' error mentioned by @cayla.

https://github.com/rook/rook/blob/91beb549be3720a1278c3b5c67934f46555e1db5/pkg/operator/ceph/cluster/cluster.go#L605-L625

Update: upgrading the kernel to 5.11+ (5.15 in my case) resolved the RBD mounting issue described in this comment.

Ahha

It’s strange, I redeployed it on a different node and it didn’t throw any error. The two nodes had differing kernels:

Working: Linux worker2 5.11.0-49-generic #55-Ubuntu SMP Wed Jan 12 17:36:34 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Non-working: Linux worker1 5.4.0-136-generic #153-Ubuntu SMP Thu Nov 24 15:56:58 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
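For anyone comparing kernels across nodes, the wide node listing includes a KERNEL-VERSION column:

```
kubectl get nodes -o wide
```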

Thanks for confirming the theory I am working on right now.

I had the same issue as you after upgrading from 1.9.10 to 1.10.9.

I suspect it was triggered (intentionally or not) by https://github.com/rook/rook/pull/11523

I was digging around and saw this comment:

https://github.com/rook/rook/blob/ae408e354443ab8af9ade5768371e7ac82c1233c/deploy/examples/cluster.yaml#L78-L86

Even though encryption isn’t enabled by default, I think the newer kernel is now effectively a hard requirement.

I am installing linux-generic-hwe-20.04 (5.15.0-58-generic) on my nodes and rebooting to confirm it fixes the issue, but your comment gives me a lot of hope.
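For reference, the upgrade described above amounts to roughly the following on an Ubuntu 20.04 node (a sketch; adapt to however you manage and drain your nodes):

```
sudo apt-get update
sudo apt-get install linux-generic-hwe-20.04
sudo reboot
# after the reboot, confirm the running kernel is 5.11 or newer
uname -r
```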


FWIW, this was my debugging path:

New PVCs were failing to mount. Existing, mounted PVCs were fine.

New pods showed the following event:

```
MountVolume.MountDevice failed for volume "pvc-8c3d89f2-577d-41fd-a812-45bcbfa321de" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.108.70.247:6789,10.107.78.45:6789,10.103.81.145:6789 --keyfile=***stripped*** map replicapool/csi-vol-ff706135-94ec-11ed-aec5-5607db29eeaf --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument
```

This was echoed in the csi-rbdplugin logs:

```
Warning  FailedMount             13s (x6 over 30s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-8c3d89f2-577d-41fd-a812-45bcbfa321de" : rpc error: code = Internal desc = rbd: map failed with error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.103.81.145:6789,10.108.70.247:6789,10.107.78.45:6789 --keyfile=***stripped*** map replicapool/csi-vol-ff706135-94ec-11ed-aec5-5607db29eeaf --device-type krbd --options noudev], rbd error output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument
```

dmesg on the node showed:

```
libceph: bad option at 'ms_mode=prefer-crc'
```

Googling for that led me to the aforementioned merged PR.

I was trying to work out how this config was passed down and saw:

```
bash-4.4$ ceph config dump | grep crc
global                basic     ms_client_mode                         crc secure          *
global                basic     ms_cluster_mode                        crc secure          *
global                basic     ms_service_mode                        crc secure          *
global                advanced  rbd_default_map_options                ms_mode=prefer-crc  *
bash-4.4$
```

and finally made my way back to the cluster.yaml where I saw the comment about the kernel version and inspiration struck.

I am not a rook or ceph expert by any means, but writing it all out here for anyone else that stumbles on this.

@Rakshith-R Agreed that you’ve identified the issue. Due to that setting not being available in older kernels, we should not set that if encryption is not enabled. However, if encryption had been enabled and now is being disabled, rook should remove that setting from the mon store. So it seems that if encryption is not enabled, rook needs to query the mon store to see if it was set and needs to be disabled.
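Until rook removes the stale setting itself, a manual cleanup from the toolbox along these lines should work (a sketch only; confirm the option is actually set, and that nothing still relies on it, before removing it):

```
# confirm the option is present in the mon config store
ceph config dump | grep rbd_default_map_options
# remove it so that older kernels can map RBD volumes again
ceph config rm global rbd_default_map_options
```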

I noted it in an edit above, but in case it was missed, upgrading the kernel to 5.11+ completely resolved my issues. I have a fat and happy ceph cluster once again.

Great to hear. 😃 I will either be installing that package or upgrading the kernel (a bigger undertaking ATM). I am hoping someone can chime in about the Rook Operator errors regarding my EC pool that isn’t reconciling… 😄
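As a starting point for those EC pool reconcile errors in the operator log above, the erasure-code profiles can be inspected directly from the toolbox (a sketch; the operator error suggests the lookup of the "default" profile is what fails):

```
ceph osd erasure-code-profile ls
ceph osd erasure-code-profile get default
```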

Can confirm the mounting issue was resolved by installing the linux-generic-hwe-20.04 package… 😃

> I noted it in an edit above, but in case it was missed, upgrading the kernel to 5.11+ completely resolved my issues. I have a fat and happy ceph cluster once again.

I am not using EC, so yeah, it being multiple issues is a distinct possibility.