rook: CephFS PV provisioning fails with: error getting omap value cephfs-metadata: Operation not permitted
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: CSI CephFS provisioner fails to provision a Persistent Volume after upgrading from Rook Ceph Operator 1.1.2 to 1.1.7:
E1213 09:19:04.687605 1 controller.go:908] error syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf": failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-721687408 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
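For debugging, the failing rados call can be replayed by hand with the provisioner's own credentials from the rook-ceph-tools pod. The sketch below is illustrative only: the secret data key (adminKey) and the toolbox pod label are assumptions based on a standard Rook 1.1 deployment, not values captured from this cluster.

# Extract the CSI provisioner key referenced by the StorageClass secret
# (rook-csi-cephfs-provisioner); the data key name may differ by Rook version.
kubectl -n rook-ceph get secret rook-csi-cephfs-provisioner \
  -o jsonpath='{.data.adminKey}' | base64 -d > /tmp/provisioner.key

# Copy the key into the toolbox pod and replay the exact rados call from the log.
TOOLS_POD=$(kubectl -n rook-ceph get pod -l app=rook-ceph-tools \
  -o jsonpath='{.items[0].metadata.name}')
kubectl cp /tmp/provisioner.key rook-ceph/"$TOOLS_POD":/tmp/provisioner.key
kubectl -n rook-ceph exec "$TOOLS_POD" -- \
  rados -p cephfs-metadata --namespace=csi \
    --id csi-cephfs-provisioner --keyfile=/tmp/provisioner.key \
    getomapval csi.volumes.default \
    csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-out
# Expected to fail with "(1) Operation not permitted", matching the log above.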
Permissions look good:
client.csi-cephfs-node
key: <redacted>
caps: [mds] allow rw
caps: [mgr] allow rw
caps: [mon] allow r
caps: [osd] allow rw tag cephfs *=*
client.csi-cephfs-provisioner
key: <redacted>
caps: [mgr] allow rw
caps: [mon] allow r
caps: [osd] allow rw tag cephfs metadata=*
client.csi-rbd-node
key: <redacted>
caps: [mon] profile rbd
caps: [osd] profile rbd
client.csi-rbd-provisioner
key: <redacted>
caps: [mgr] allow rw
caps: [mon] profile rbd
caps: [osd] profile rbd
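Since the provisioner's OSD cap is "allow rw tag cephfs metadata=*", it only matches pools whose cephfs application metadata actually carries a metadata key (ceph fs new normally sets this key; merely enabling the application on a pool does not). As a diagnostic, the pool tags and the live caps can be inspected from the toolbox; the commands below are a sketch, with pool names taken from this cluster:

# Run inside the rook-ceph-tools pod (admin credentials).
# Show which applications, and which key/value metadata, each CephFS pool carries.
ceph osd pool application get cephfs-metadata
ceph osd pool application get cephfs-data0

# Full pool list, including application tags.
ceph osd pool ls detail

# The caps exactly as the cluster currently sees them.
ceph auth get client.csi-cephfs-provisioner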
We do not see this issue on clusters that were initially deployed with 1.1.7.
Expected behavior: Provisioner should have sufficient permissions to complete provisioning.
How to reproduce it (minimal and precise):
- Set up a cluster with Rook Ceph Operator 1.1.2
- Upgrade to 1.1.7
- Create a PVC
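A minimal claim that triggers the failure might look like the sketch below; the claim name and namespace are illustrative, while the size, access mode, and StorageClass mirror the CreateVolumeRequest seen in the provisioner logs:

# Illustrative reproducer: 10Gi ReadWriteMany PVC against the rook-cephfs StorageClass.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-test
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: rook-cephfs
  resources:
    requests:
      storage: 10Gi
EOF

# The claim stays Pending and accumulates ProvisioningFailed events.
kubectl -n default describe pvc cephfs-test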
File(s) to submit:
CSI provisioner logs:
I1213 09:19:01.273122 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:03.987819 1 controller.go:1196] provision "grafana/grafana" class "rook-cephfs": started
I1213 09:19:03.990154 1 controller.go:471] CreateVolumeRequest {Name:pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf CapacityRange:required_bytes:10737418240 VolumeCapabilities:[mount:<fs_type:"ext4" mount_flags:"_netdev" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:cephfs mounter:kernel pool:cephfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1213 09:19:03.990231 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"grafana", Name:"grafana", UID:"4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", APIVersion:"v1", ResourceVersion:"32714240", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "grafana/grafana"
I1213 09:19:03.992311 1 connection.go:180] GRPC call: /csi.v1.Controller/CreateVolume
I1213 09:19:03.992329 1 connection.go:181] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf","parameters":{"clusterID":"rook-ceph","fsName":"cephfs","mounter":"kernel","pool":"cephfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["_netdev"]}},"access_mode":{"mode":5}}]}
I1213 09:19:04.687152 1 connection.go:183] GRPC response: {}
I1213 09:19:04.687497 1 connection.go:184] GRPC error: rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-721687408 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:04.687561 1 controller.go:979] Final error received, removing PVC 4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf from claims in progress
W1213 09:19:04.687578 1 controller.go:886] Retrying syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", failure 5
E1213 09:19:04.687605 1 controller.go:908] error syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf": failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-721687408 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:04.687647 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"grafana", Name:"grafana", UID:"4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", APIVersion:"v1", ResourceVersion:"32714240", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-721687408 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:06.285990 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:11.292517 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:16.300329 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:20.687803 1 controller.go:1196] provision "grafana/grafana" class "rook-cephfs": started
I1213 09:19:20.689671 1 controller.go:471] CreateVolumeRequest {Name:pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf CapacityRange:required_bytes:10737418240 VolumeCapabilities:[mount:<fs_type:"ext4" mount_flags:"_netdev" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:cephfs mounter:kernel pool:cephfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1213 09:19:20.689844 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"grafana", Name:"grafana", UID:"4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", APIVersion:"v1", ResourceVersion:"32714240", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "grafana/grafana"
I1213 09:19:20.691369 1 connection.go:180] GRPC call: /csi.v1.Controller/CreateVolume
I1213 09:19:20.691400 1 connection.go:181] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf","parameters":{"clusterID":"rook-ceph","fsName":"cephfs","mounter":"kernel","pool":"cephfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["_netdev"]}},"access_mode":{"mode":5}}]}
I1213 09:19:21.305960 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:21.379717 1 connection.go:183] GRPC response: {}
I1213 09:19:21.380059 1 connection.go:184] GRPC error: rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-600923417 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:21.380095 1 controller.go:979] Final error received, removing PVC 4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf from claims in progress
W1213 09:19:21.380113 1 controller.go:886] Retrying syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", failure 6
E1213 09:19:21.380130 1 controller.go:908] error syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf": failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-600923417 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:21.380194 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"grafana", Name:"grafana", UID:"4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", APIVersion:"v1", ResourceVersion:"32714240", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-600923417 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:26.312842 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:31.318273 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:36.324250 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:41.331747 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:46.337321 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:51.342860 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1213 09:19:53.380377 1 controller.go:1196] provision "grafana/grafana" class "rook-cephfs": started
I1213 09:19:53.382758 1 controller.go:471] CreateVolumeRequest {Name:pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf CapacityRange:required_bytes:10737418240 VolumeCapabilities:[mount:<fs_type:"ext4" mount_flags:"_netdev" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > ] Parameters:map[clusterID:rook-ceph csi.storage.k8s.io/node-stage-secret-name:rook-csi-cephfs-node csi.storage.k8s.io/node-stage-secret-namespace:rook-ceph csi.storage.k8s.io/provisioner-secret-name:rook-csi-cephfs-provisioner csi.storage.k8s.io/provisioner-secret-namespace:rook-ceph fsName:cephfs mounter:kernel pool:cephfs-data0] Secrets:map[] VolumeContentSource:<nil> AccessibilityRequirements:<nil> XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I1213 09:19:53.382890 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"grafana", Name:"grafana", UID:"4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", APIVersion:"v1", ResourceVersion:"32714240", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "grafana/grafana"
I1213 09:19:53.385396 1 connection.go:180] GRPC call: /csi.v1.Controller/CreateVolume
I1213 09:19:53.385406 1 connection.go:181] GRPC request: {"capacity_range":{"required_bytes":10737418240},"name":"pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf","parameters":{"clusterID":"rook-ceph","fsName":"cephfs","mounter":"kernel","pool":"cephfs-data0"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["_netdev"]}},"access_mode":{"mode":5}}]}
I1213 09:19:54.092913 1 connection.go:183] GRPC response: {}
I1213 09:19:54.093316 1 connection.go:184] GRPC error: rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-070683766 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:54.093411 1 controller.go:979] Final error received, removing PVC 4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf from claims in progress
W1213 09:19:54.093450 1 controller.go:886] Retrying syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", failure 7
E1213 09:19:54.093486 1 controller.go:908] error syncing claim "4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf": failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-070683766 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:54.093529 1 event.go:209] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"grafana", Name:"grafana", UID:"4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf", APIVersion:"v1", ResourceVersion:"32714240", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "rook-cephfs": rpc error: code = Internal desc = error (an error (exit status 1) occurred while running rados args: [-m 10.203.145.41:6789,10.203.117.78:6789,10.203.9.176:6789 --id csi-cephfs-provisioner --keyfile=***stripped*** -c /etc/ceph/ceph.conf -p cephfs-metadata getomapval csi.volumes.default csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf /tmp/omap-get-070683766 --namespace=csi]) occurred, command output streams is ( error getting omap value cephfs-metadata/csi.volumes.default/csi.volume.pvc-4fd34480-8bb0-4b00-b5d9-ad9925b5d0bf: (1) Operation not permitted
)
I1213 09:19:56.349251 1 leaderelection.go:258] successfully renewed lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
Operator logs:
2019-12-13 08:34:17.517296 I | rookcmd: starting Rook v1.1.7 with arguments '/usr/local/bin/rook ceph operator'
2019-12-13 08:34:17.517371 I | rookcmd: flag values: --add_dir_header=false, --alsologtostderr=false, --csi-attacher-image=repo.starworld.local:50443/quay.io/k8scsi/csi-attacher:v1.2.0, --csi-ceph-image=repo.starworld.local:50443/quay.io/cephcsi/cephcsi:canary, --csi-cephfs-plugin-template-path=/etc/ceph-csi/cephfs/csi-cephfsplugin.yaml, --csi-cephfs-provisioner-dep-template-path=/etc/ceph-csi/cephfs/csi-cephfsplugin-provisioner-dep.yaml, --csi-cephfs-provisioner-sts-template-path=/etc/ceph-csi/cephfs/csi-cephfsplugin-provisioner-sts.yaml, --csi-driver-name-prefix=, --csi-enable-cephfs=true, --csi-enable-grpc-metrics=true, --csi-enable-rbd=true, --csi-kubelet-dir-path=/var/lib/kubelet, --csi-provisioner-image=repo.starworld.local:50443/quay.io/k8scsi/csi-provisioner:v1.3.0, --csi-rbd-plugin-template-path=/etc/ceph-csi/rbd/csi-rbdplugin.yaml, --csi-rbd-provisioner-dep-template-path=/etc/ceph-csi/rbd/csi-rbdplugin-provisioner-dep.yaml, --csi-rbd-provisioner-sts-template-path=/etc/ceph-csi/rbd/csi-rbdplugin-provisioner-sts.yaml, --csi-registrar-image=repo.starworld.local:50443/quay.io/k8scsi/csi-node-driver-registrar:v1.1.0, --csi-snapshotter-image=repo.starworld.local:50443/quay.io/k8scsi/csi-snapshotter:v1.2.0, --enable-discovery-daemon=true, --enable-flex-driver=false, --enable-machine-disruption-budget=false, --help=false, --log-flush-frequency=5s, --log-level=INFO, --log_backtrace_at=:0, --log_dir=, --log_file=, --log_file_max_size=1800, --logtostderr=true, --mon-healthcheck-interval=45s, --mon-out-timeout=10m0s, --operator-image=, --service-account=, --skip_headers=false, --skip_log_headers=false, --stderrthreshold=2, --v=0, --vmodule=
2019-12-13 08:34:17.517376 I | cephcmd: starting operator
2019-12-13 08:34:17.590893 I | op-discover: rook-discover daemonset already exists, updating ...
2019-12-13 08:34:17.597199 I | operator: rook-provisioner rook.io/block started using rook.io flex vendor dir
I1213 08:34:17.597521 6 leaderelection.go:217] attempting to acquire leader lease rook-ceph/rook.io-block...
2019-12-13 08:34:17.597987 I | operator: rook-provisioner ceph.rook.io/block started using ceph.rook.io flex vendor dir
2019-12-13 08:34:17.597995 I | operator: Watching all namespaces for cluster CRDs
2019-12-13 08:34:17.598000 I | op-cluster: start watching clusters in all namespaces
2019-12-13 08:34:17.598028 I | op-cluster: Enabling hotplug orchestration: ROOK_DISABLE_DEVICE_HOTPLUG=false
I1213 08:34:17.598112 6 leaderelection.go:217] attempting to acquire leader lease rook-ceph/ceph.rook.io-block...
2019-12-13 08:34:17.598150 I | operator: setting up the controller-runtime manager
2019-12-13 08:34:17.601477 I | op-cluster: starting cluster in namespace rook-ceph
2019-12-13 08:34:17.675156 I | operator: starting the controller-runtime manager
2019-12-13 08:34:18.177372 I | operator: successfully started Ceph CSI driver(s)
2019-12-13 08:34:24.177515 I | op-cluster: detecting the ceph image version for image repo.starworld.local:50443/ceph/ceph:v14.2.4-20191112...
2019-12-13 08:34:25.624344 I | op-cluster: Detected ceph image version: 14.2.4 nautilus
2019-12-13 08:34:25.630247 I | op-mon: parsing mon endpoints: a=10.203.145.41:6789,b=10.203.117.78:6789,c=10.203.9.176:6789
2019-12-13 08:34:25.630305 I | op-mon: loaded: maxMonID=2, mons=map[a:0xc000a945a0 b:0xc000a94640 c:0xc000a94740], mapping=&{Node:map[a:0xc000a92d80 b:0xc000a92e10 c:0xc000a92e40]}
2019-12-13 08:34:25.630638 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-12-13 08:34:25.630692 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-12-13 08:34:25.630763 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/215464797
2019-12-13 08:34:26.216036 I | op-cluster: CephCluster rook-ceph status: Creating.
2019-12-13 08:34:26.226783 I | op-mon: start running mons
2019-12-13 08:34:26.275531 I | op-mon: parsing mon endpoints: a=10.203.145.41:6789,b=10.203.117.78:6789,c=10.203.9.176:6789
2019-12-13 08:34:26.275575 I | op-mon: loaded: maxMonID=2, mons=map[a:0xc000332b20 b:0xc000332b60 c:0xc000332c40], mapping=&{Node:map[a:0xc000695da0 b:0xc000695dd0 c:0xc000695e00]}
2019-12-13 08:34:26.282207 I | op-mon: saved mon endpoints to config map map[data:a=10.203.145.41:6789,b=10.203.117.78:6789,c=10.203.9.176:6789 maxMonId:2 mapping:{"node":{"a":{"Name":"worker-04","Hostname":"worker-04","Address":"172.28.180.222"},"b":{"Name":"worker-02","Hostname":"worker-02","Address":"172.28.180.212"},"c":{"Name":"worker-03","Hostname":"worker-03","Address":"172.28.180.221"}}} csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.203.145.41:6789","10.203.117.78:6789","10.203.9.176:6789"]}]]
2019-12-13 08:34:26.287504 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-12-13 08:34:26.287595 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-12-13 08:34:27.216953 I | op-mon: targeting the mon count 3
2019-12-13 08:34:27.217159 I | exec: Running command: ceph config set global mon_allow_pool_delete true --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/606045464
2019-12-13 08:34:27.809235 I | exec: Running command: ceph config set global bluestore_warn_on_legacy_statfs false --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/691188119
2019-12-13 08:34:28.489728 I | exec: Running command: ceph config set global rbd_default_features 3 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/107169802
2019-12-13 08:34:29.108144 I | op-mon: checking for basic quorum with existing mons
2019-12-13 08:34:29.125850 I | op-mon: mon b endpoint are [v2:10.203.117.78:3300,v1:10.203.117.78:6789]
2019-12-13 08:34:29.140268 I | op-mon: mon c endpoint are [v2:10.203.9.176:3300,v1:10.203.9.176:6789]
2019-12-13 08:34:29.217000 I | op-mon: mon a endpoint are [v2:10.203.145.41:3300,v1:10.203.145.41:6789]
2019-12-13 08:34:29.616082 I | op-mon: saved mon endpoints to config map map[data:a=10.203.145.41:6789,b=10.203.117.78:6789,c=10.203.9.176:6789 maxMonId:2 mapping:{"node":{"a":{"Name":"worker-04","Hostname":"worker-04","Address":"172.28.180.222"},"b":{"Name":"worker-02","Hostname":"worker-02","Address":"172.28.180.212"},"c":{"Name":"worker-03","Hostname":"worker-03","Address":"172.28.180.221"}}} csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.203.145.41:6789","10.203.117.78:6789","10.203.9.176:6789"]}]]
2019-12-13 08:34:30.015855 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-12-13 08:34:30.015964 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-12-13 08:34:30.816503 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2019-12-13 08:34:30.816601 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2019-12-13 08:34:30.839212 I | op-mon: deployment for mon rook-ceph-mon-b already exists. updating if needed
2019-12-13 08:34:30.841725 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:30.841739 I | op-k8sutil: updating deployment rook-ceph-mon-b
I1213 08:34:32.722115 6 leaderelection.go:227] successfully acquired lease rook-ceph/ceph.rook.io-block
I1213 08:34:32.722208 6 controller.go:769] Starting provisioner controller ceph.rook.io/block_rook-ceph-operator-675898649d-5928x_59fe2b13-1d83-11ea-80b2-9a9e2a62d952!
I1213 08:34:32.722224 6 event.go:209] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"rook-ceph", Name:"ceph.rook.io-block", UID:"a41f5efe-a8fc-43e4-b0d1-d14b990cf90b", APIVersion:"v1", ResourceVersion:"32690993", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' rook-ceph-operator-675898649d-5928x_59fe2b13-1d83-11ea-80b2-9a9e2a62d952 became leader
I1213 08:34:32.822368 6 controller.go:818] Started provisioner controller ceph.rook.io/block_rook-ceph-operator-675898649d-5928x_59fe2b13-1d83-11ea-80b2-9a9e2a62d952!
2019-12-13 08:34:32.861789 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mon-b
2019-12-13 08:34:32.861803 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:32.861812 I | op-mon: waiting for mon quorum with [b c a]
2019-12-13 08:34:32.872859 I | op-mon: mons running: [b c a]
2019-12-13 08:34:32.872943 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/865440737
2019-12-13 08:34:33.583996 I | op-mon: Monitors in quorum: [a b c]
2019-12-13 08:34:33.589246 I | op-mon: deployment for mon rook-ceph-mon-c already exists. updating if needed
2019-12-13 08:34:33.591464 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:33.591479 I | op-k8sutil: updating deployment rook-ceph-mon-c
I1213 08:34:33.922180 6 leaderelection.go:227] successfully acquired lease rook-ceph/rook.io-block
I1213 08:34:33.922310 6 controller.go:769] Starting provisioner controller rook.io/block_rook-ceph-operator-675898649d-5928x_59fe1767-1d83-11ea-80b2-9a9e2a62d952!
I1213 08:34:33.922427 6 event.go:209] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"rook-ceph", Name:"rook.io-block", UID:"66ada700-8b9d-40b9-8f9d-1c5c32c4de12", APIVersion:"v1", ResourceVersion:"32691011", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' rook-ceph-operator-675898649d-5928x_59fe1767-1d83-11ea-80b2-9a9e2a62d952 became leader
I1213 08:34:34.022434 6 controller.go:818] Started provisioner controller rook.io/block_rook-ceph-operator-675898649d-5928x_59fe1767-1d83-11ea-80b2-9a9e2a62d952!
2019-12-13 08:34:35.607847 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mon-c
2019-12-13 08:34:35.607862 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:35.607871 I | op-mon: waiting for mon quorum with [b c a]
2019-12-13 08:34:35.621029 I | op-mon: mons running: [b c a]
2019-12-13 08:34:35.621129 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/380212172
2019-12-13 08:34:36.211550 I | op-mon: Monitors in quorum: [a b c]
2019-12-13 08:34:36.214174 I | op-mon: deployment for mon rook-ceph-mon-a already exists. updating if needed
2019-12-13 08:34:36.215894 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:36.215906 I | op-k8sutil: updating deployment rook-ceph-mon-a
2019-12-13 08:34:38.229611 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mon-a
2019-12-13 08:34:38.229626 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:38.229636 I | op-mon: waiting for mon quorum with [b c a]
2019-12-13 08:34:38.243769 I | op-mon: mons running: [b c a]
2019-12-13 08:34:38.243875 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/493002939
2019-12-13 08:34:38.876880 I | op-mon: Monitors in quorum: [a b c]
2019-12-13 08:34:38.876900 I | op-mon: mons created: 3
2019-12-13 08:34:38.876949 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/013067486
2019-12-13 08:34:39.500188 I | op-mon: waiting for mon quorum with [b c a]
2019-12-13 08:34:39.513907 I | op-mon: mons running: [b c a]
2019-12-13 08:34:39.513985 I | exec: Running command: ceph mon_status --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/456651429
2019-12-13 08:34:40.187756 I | op-mon: Monitors in quorum: [a b c]
2019-12-13 08:34:40.187840 I | exec: Running command: ceph auth get-or-create-key client.csi-rbd-provisioner mon profile rbd mgr allow rw osd profile rbd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/975617472
2019-12-13 08:34:40.803631 I | exec: Running command: ceph auth get-or-create-key client.csi-rbd-node mon profile rbd osd profile rbd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/501466143
2019-12-13 08:34:41.405047 I | exec: Running command: ceph auth get-or-create-key client.csi-cephfs-provisioner mon allow r mgr allow rw osd allow rw tag cephfs metadata=* --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/195441394
2019-12-13 08:34:42.018147 I | exec: Running command: ceph auth get-or-create-key client.csi-cephfs-node mon allow r mgr allow rw osd allow rw tag cephfs *=* mds allow rw --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/766385065
2019-12-13 08:34:42.696936 I | ceph-csi: created kubernetes csi secrets for cluster "rook-ceph"
2019-12-13 08:34:42.697019 I | exec: Running command: ceph version --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/906303732
2019-12-13 08:34:43.375789 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/128899011
2019-12-13 08:34:44.003765 I | exec: Running command: ceph mon enable-msgr2 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/757451846
2019-12-13 08:34:44.606006 I | cephclient: successfully enabled msgr2 protocol
2019-12-13 08:34:44.606063 I | op-mgr: start running mgr
2019-12-13 08:34:44.606135 I | exec: Running command: ceph auth get-or-create-key mgr.a mon allow * mds allow * osd allow * --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/682225901
2019-12-13 08:34:45.223907 I | exec: Running command: ceph config get mgr.a mgr/dashboard/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/153139048
2019-12-13 08:34:45.887406 I | exec: Running command: ceph config rm mgr.a mgr/dashboard/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/521793959
2019-12-13 08:34:46.517644 I | op-mgr: clearing http bind fix mod=dashboard ver=13.0.0 mimic changed=true err=<nil>
2019-12-13 08:34:46.517716 I | exec: Running command: ceph config get mgr.a mgr/dashboard/a/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/766510298
2019-12-13 08:34:47.191152 I | exec: Running command: ceph config rm mgr.a mgr/dashboard/a/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/697831025
2019-12-13 08:34:47.875750 I | op-mgr: clearing http bind fix mod=dashboard ver=13.0.0 mimic changed=true err=<nil>
2019-12-13 08:34:47.875819 I | exec: Running command: ceph config get mgr.a mgr/prometheus/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/475458332
2019-12-13 08:34:48.506772 I | exec: Running command: ceph config rm mgr.a mgr/prometheus/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/979679691
2019-12-13 08:34:49.191090 I | op-mgr: clearing http bind fix mod=prometheus ver=13.0.0 mimic changed=false err=<nil>
2019-12-13 08:34:49.191157 I | exec: Running command: ceph config get mgr.a mgr/prometheus/a/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/427556526
2019-12-13 08:34:49.817050 I | exec: Running command: ceph config rm mgr.a mgr/prometheus/a/server_addr --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/998930485
2019-12-13 08:34:50.499468 I | op-mgr: clearing http bind fix mod=prometheus ver=13.0.0 mimic changed=false err=<nil>
2019-12-13 08:34:50.507042 I | op-mgr: deployment for mgr rook-ceph-mgr-a already exists. updating if needed
2019-12-13 08:34:50.508729 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:50.508796 I | op-k8sutil: updating deployment rook-ceph-mgr-a
2019-12-13 08:34:52.518512 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mgr-a
2019-12-13 08:34:52.518528 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:34:52.518607 I | exec: Running command: ceph mgr module enable orchestrator_cli --force --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/730410000
2019-12-13 08:34:53.688486 I | exec: module 'orchestrator_cli' is already enabled (always-on)
2019-12-13 08:34:53.688802 I | exec: Running command: ceph mgr module enable rook --force --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/736740911
2019-12-13 08:34:54.703556 I | exec: module 'rook' is already enabled
2019-12-13 08:34:54.703686 I | exec: Running command: ceph orchestrator set backend rook --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/681129410
2019-12-13 08:34:55.383827 I | exec: Running command: ceph mgr module enable prometheus --force --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/082590777
2019-12-13 08:34:56.700432 I | exec: module 'prometheus' is already enabled
2019-12-13 08:34:56.700573 I | exec: Running command: ceph mgr module enable dashboard --force --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/113779780
2019-12-13 08:34:57.706786 I | exec: module 'dashboard' is already enabled
2019-12-13 08:35:02.708870 I | op-mgr: the dashboard secret was already generated
2019-12-13 08:35:02.708886 I | op-mgr: Running command: ceph dashboard set-login-credentials admin *******
2019-12-13 08:35:03.549849 I | op-mgr: restarting the mgr module
2019-12-13 08:35:03.549926 I | exec: Running command: ceph mgr module disable dashboard --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/216481302
2019-12-13 08:35:04.852116 I | exec: Running command: ceph mgr module enable dashboard --force --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/386915965
2019-12-13 08:35:05.824915 I | exec: Running command: ceph config get mgr.a mgr/dashboard/url_prefix --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/964510648
2019-12-13 08:35:06.413545 I | exec: Running command: ceph config rm mgr.a mgr/dashboard/url_prefix --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/454489527
2019-12-13 08:35:07.097038 I | exec: Running command: ceph config get mgr.a mgr/dashboard/ssl --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/244251050
2019-12-13 08:35:07.778902 I | exec: Running command: ceph config set mgr.a mgr/dashboard/ssl false --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/597200641
2019-12-13 08:35:08.410415 I | exec: Running command: ceph config get mgr.a mgr/dashboard/server_port --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/784108652
2019-12-13 08:35:09.092528 I | exec: Running command: ceph config set mgr.a mgr/dashboard/server_port 7000 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/295098075
2019-12-13 08:35:09.730057 I | op-mgr: dashboard service already exists
2019-12-13 08:35:09.744063 I | op-mgr: mgr metrics service already exists
2019-12-13 08:35:09.744081 I | op-osd: start running osds in namespace rook-ceph
2019-12-13 08:35:09.744085 I | op-osd: start provisioning the osds on pvcs, if needed
2019-12-13 08:35:09.744088 I | op-osd: no volume sources defined to configure OSDs on PVCs.
2019-12-13 08:35:09.744090 I | op-osd: start provisioning the osds on nodes, if needed
2019-12-13 08:35:09.747985 I | op-osd: 4 of the 4 storage nodes are valid
2019-12-13 08:35:09.793433 I | op-osd: osd provision job started for node worker-01
2019-12-13 08:35:09.804045 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-worker-02 to start a new one
2019-12-13 08:35:09.827080 I | op-k8sutil: batch job rook-ceph-osd-prepare-worker-02 still exists
2019-12-13 08:35:11.828849 I | op-k8sutil: batch job rook-ceph-osd-prepare-worker-02 deleted
2019-12-13 08:35:11.834188 I | op-osd: osd provision job started for node worker-02
2019-12-13 08:35:11.850301 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-worker-03 to start a new one
2019-12-13 08:35:11.863033 I | op-k8sutil: batch job rook-ceph-osd-prepare-worker-03 still exists
2019-12-13 08:35:13.864982 I | op-k8sutil: batch job rook-ceph-osd-prepare-worker-03 deleted
2019-12-13 08:35:13.870782 I | op-osd: osd provision job started for node worker-03
2019-12-13 08:35:13.892812 I | op-k8sutil: Removing previous job rook-ceph-osd-prepare-worker-04 to start a new one
2019-12-13 08:35:13.913622 I | op-k8sutil: batch job rook-ceph-osd-prepare-worker-04 still exists
2019-12-13 08:35:15.918785 I | op-k8sutil: batch job rook-ceph-osd-prepare-worker-04 deleted
2019-12-13 08:35:15.925485 I | op-osd: osd provision job started for node worker-04
2019-12-13 08:35:15.925583 I | op-osd: start osds after provisioning is completed, if needed
2019-12-13 08:35:15.929040 I | op-osd: osd orchestration status for node worker-01 is completed
2019-12-13 08:35:15.929052 I | op-osd: starting 1 osd daemons on node worker-01
2019-12-13 08:35:15.941258 I | op-osd: started deployment for osd 0 (dir=false, type=bluestore)
2019-12-13 08:35:15.979434 I | op-osd: osd orchestration status for node worker-02 is completed
2019-12-13 08:35:15.979491 I | op-osd: starting 1 osd daemons on node worker-02
2019-12-13 08:35:16.000361 I | op-osd: deployment for osd 2 already exists. updating if needed
2019-12-13 08:35:16.007373 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:16.007388 I | op-k8sutil: updating deployment rook-ceph-osd-2
2019-12-13 08:35:18.028927 I | op-k8sutil: finished waiting for updated deployment rook-ceph-osd-2
2019-12-13 08:35:18.028942 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:18.028947 I | op-osd: started deployment for osd 2 (dir=false, type=bluestore)
2019-12-13 08:35:18.034151 I | op-osd: osd orchestration status for node worker-03 is orchestrating
2019-12-13 08:35:18.034176 I | op-osd: osd orchestration status for node worker-04 is starting
2019-12-13 08:35:18.034180 I | op-osd: 2/4 node(s) completed osd provisioning, resource version 32691459
2019-12-13 08:35:18.036064 I | op-osd: osd orchestration status for node worker-03 is completed
2019-12-13 08:35:18.036077 I | op-osd: starting 1 osd daemons on node worker-03
2019-12-13 08:35:18.042173 I | op-osd: deployment for osd 1 already exists. updating if needed
2019-12-13 08:35:18.044150 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:18.044163 I | op-k8sutil: updating deployment rook-ceph-osd-1
2019-12-13 08:35:20.054536 I | op-k8sutil: finished waiting for updated deployment rook-ceph-osd-1
2019-12-13 08:35:20.054551 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:20.054556 I | op-osd: started deployment for osd 1 (dir=false, type=bluestore)
2019-12-13 08:35:20.059133 I | op-osd: osd orchestration status for node worker-04 is computingDiff
2019-12-13 08:35:20.059319 I | op-osd: osd orchestration status for node worker-04 is orchestrating
2019-12-13 08:35:20.059580 I | op-osd: osd orchestration status for node worker-04 is completed
2019-12-13 08:35:20.059591 I | op-osd: starting 1 osd daemons on node worker-04
2019-12-13 08:35:20.066453 I | op-osd: deployment for osd 3 already exists. updating if needed
2019-12-13 08:35:20.068400 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:20.068412 I | op-k8sutil: updating deployment rook-ceph-osd-3
2019-12-13 08:35:22.082614 I | op-k8sutil: finished waiting for updated deployment rook-ceph-osd-3
2019-12-13 08:35:22.082629 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:22.082634 I | op-osd: started deployment for osd 3 (dir=false, type=bluestore)
2019-12-13 08:35:22.087181 I | op-osd: 4/4 node(s) completed osd provisioning
2019-12-13 08:35:22.087241 I | op-osd: checking if any nodes were removed
2019-12-13 08:35:22.094995 I | op-osd: processing 0 removed nodes
2019-12-13 08:35:22.095008 I | op-osd: done processing removed nodes
2019-12-13 08:35:22.095086 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/741919870
2019-12-13 08:35:22.711588 I | exec: Running command: ceph osd require-osd-release nautilus --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/022996421
2019-12-13 08:35:23.311502 I | cephclient: successfully disallowed pre-nautilus osds and enabled all new nautilus-only functionality
2019-12-13 08:35:23.311522 I | op-osd: completed running osds in namespace rook-ceph
2019-12-13 08:35:23.311530 I | rbd-mirror: configure rbd-mirroring with 0 workers
2019-12-13 08:35:23.314262 I | rbd-mirror: no extra daemons to remove
2019-12-13 08:35:23.314276 I | op-cluster: Done creating rook instance in namespace rook-ceph
2019-12-13 08:35:23.314286 I | op-cluster: CephCluster rook-ceph status: Created.
2019-12-13 08:35:23.322329 I | op-pool: start watching clusters in all namespaces
2019-12-13 08:35:23.322344 I | op-object: start watching object store resources in namespace rook-ceph
2019-12-13 08:35:23.322350 I | op-object: start watching object store user resources in namespace rook-ceph
2019-12-13 08:35:23.322355 I | op-bucket-prov: Ceph Bucket Provisioner launched
2019-12-13 08:35:23.323060 I | op-file: start watching filesystem resource in namespace rook-ceph
2019-12-13 08:35:23.323075 I | op-nfs: start watching ceph nfs resource in namespace rook-ceph
2019-12-13 08:35:23.323086 I | op-cluster: ceph status check interval is 60s
I1213 08:35:23.323125 6 manager.go:98] objectbucket.io/provisioner-manager "level"=0 "msg"="starting provisioner" "name"="ceph.rook.io/bucket"
2019-12-13 08:35:23.324698 I | exec: Running command: ceph osd crush dump --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/871214655
2019-12-13 08:35:23.374360 I | op-cluster: finalizer already set on cluster rook-ceph
2019-12-13 08:35:23.374449 I | exec: Running command: ceph osd crush dump --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/743263890
2019-12-13 08:35:25.192808 I | op-pool: creating pool rbdpool in namespace rook-ceph
2019-12-13 08:35:25.192930 I | exec: Running command: ceph osd crush rule create-replicated rbdpool default host --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/039459017
2019-12-13 08:35:25.277739 I | exec: Running command: ceph osd crush dump --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/433819028
2019-12-13 08:35:26.392782 I | exec: Running command: ceph fs get cephfs --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/664911331
2019-12-13 08:35:27.008687 I | op-file: filesystem cephfs already exists
2019-12-13 08:35:27.008861 I | exec: Running command: ceph fs get cephfs --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/223099878
2019-12-13 08:35:27.184527 I | exec: Running command: ceph osd pool create rbdpool 0 replicated rbdpool --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/250019341
2019-12-13 08:35:27.990553 I | exec: Running command: ceph fs set cephfs max_mds 1 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/349404168
2019-12-13 08:35:29.130319 I | exec: Running command: ceph fs get cephfs --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/249192391
2019-12-13 08:35:29.175470 I | exec: pool 'rbdpool' created
2019-12-13 08:35:29.175661 I | exec: Running command: ceph osd pool set rbdpool size 2 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/885956218
2019-12-13 08:35:30.187313 I | exec: Running command: ceph fs set cephfs allow_standby_replay true --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/086719889
2019-12-13 08:35:31.166537 I | exec: set pool 3 size to 2
2019-12-13 08:35:31.167012 I | exec: Running command: ceph osd pool application enable rbdpool rbd --yes-i-really-mean-it --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/089825212
2019-12-13 08:35:31.174564 I | op-file: start running mdses for filesystem cephfs
2019-12-13 08:35:31.180518 I | op-mds: deployment for mds rook-ceph-mds-cephfs-a already exists. updating if needed
2019-12-13 08:35:31.182249 I | exec: Running command: ceph auth get-or-create-key mds.cephfs-a osd allow * mds allow mon allow profile mds --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/923400171
2019-12-13 08:35:32.299664 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/917187662
2019-12-13 08:35:33.003234 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:33.003250 I | op-k8sutil: updating deployment rook-ceph-mds-cephfs-a
2019-12-13 08:35:33.121768 I | exec: enabled application 'rbd' on pool 'rbdpool'
2019-12-13 08:35:33.121907 I | cephclient: creating replicated pool rbdpool succeeded, buf:
2019-12-13 08:35:33.121930 I | op-pool: created pool rbdpool
2019-12-13 08:35:41.028785 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mds-cephfs-a
2019-12-13 08:35:41.028798 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:41.033643 I | op-mds: deployment for mds rook-ceph-mds-cephfs-b already exists. updating if needed
2019-12-13 08:35:41.035465 I | exec: Running command: ceph auth get-or-create-key mds.cephfs-b osd allow * mds allow mon allow profile mds --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/321893205
2019-12-13 08:35:41.599590 I | exec: Running command: ceph versions --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/504814768
2019-12-13 08:35:42.281204 I | op-mon: this is not an upgrade, not performing upgrade checks
2019-12-13 08:35:42.281219 I | op-k8sutil: updating deployment rook-ceph-mds-cephfs-b
2019-12-13 08:35:54.311353 I | op-k8sutil: finished waiting for updated deployment rook-ceph-mds-cephfs-b
2019-12-13 08:35:54.311368 I | op-mon: this is not an upgrade, not performing upgrade checks
Environment:
- OS (e.g. from /etc/os-release): Ubuntu 18.04.3 LTS
- Kernel (e.g. uname -a): Linux n0200 5.0.0-36-generic #39~18.04.1-Ubuntu SMP Tue Nov 12 11:09:50 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
- Cloud provider or hardware configuration: vSphere
- Rook version (use rook version inside of a Rook Pod): 1.1.7
- Storage backend version (e.g. for ceph do ceph -v): 14.2.4
- Kubernetes version (use kubectl version): 1.15.4
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): kubeadm
- Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_WARN 1 pools have too few placement groups; too few PGs per OSD (16 < min 30)
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 36 (14 by maintainers)
@joscollin: Applying the suggested workaround helped, and a CephFS Persistent Volume can now be provisioned. Thanks for your help!
I’m wondering if Rook lacks some checks when upgrading from older versions?
"Rook should not be setting the application on these pools; that is done for you by fs new. But it's really better if Rook lets fs new do all this for you." @leseb, I'd take the above statements from Patrick's suggestion and make the appropriate changes on the Rook side.
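For context on what fs new does here: creating a filesystem with ceph fs new tags both pools with the cephfs application and sets the data/metadata keys that caps such as "allow rw tag cephfs metadata=*" match on, whereas only enabling the application leaves those keys unset. The lines below illustrate that mechanism with this cluster's pool and filesystem names; they are not presented as the exact workaround applied in this issue.

# What "ceph fs new cephfs cephfs-metadata cephfs-data0" effectively ensures,
# expressed with the explicit application commands (illustrative):
#   ceph osd pool application set cephfs-metadata cephfs metadata cephfs
#   ceph osd pool application set cephfs-data0 cephfs data cephfs
# By contrast, "ceph osd pool application enable <pool> cephfs" alone adds the
# application without the data/metadata keys, so "tag cephfs metadata=*" caps
# no longer match the pool.

# Inspect the current tagging of the metadata pool under the cephfs application:
ceph osd pool application get cephfs-metadata cephfs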
@joscollin Can you summarize what is needed on the Rook side for this? I thought Rook was already calling fs new instead of setting the application name with the separate command.
@rajatsingh25aug You need to read the comments in this ticket again from the beginning, at least from this comment onwards. Stephan's and Markus's issues are now fixed using the workaround, but you still need to do the actual code fix on the Rook side, as discussed earlier with @leseb, so that this issue doesn't come up again.
Currently I am having the same issue with 1.2.2 when I try to create a PVC:
As far as I can see right now, this helped, but I need more testing tomorrow:
@stephan2012 Could you please provide more information on @rajatsingh25aug's query? Thanks.