rook: error provisioning bucket: Provision: can't create ceph user: error creating ceph user

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: bucket creation fails because of a user creation failure. The failure seems related to the username being empty. Changing from:

radosgw-admin user create --uid ceph-user-ZDWhL1M1 --display-name ceph-user-ZDWhL1M1 --rgw-realm=arch-cloud --rgw-zonegroup=arch-cloud --cluster=kube-system --conf=/var/lib/rook/kube-system/kube-system.config --name= --keyring=/var/lib/rook/kube-system/.keyring

to

radosgw-admin user create --uid ceph-user-ZDWhL1M1 --display-name ceph-user-ZDWhL1M1 --rgw-realm=arch-cloud --rgw-zonegroup=arch-cloud --cluster=kube-system --conf=/var/lib/rook/kube-system/kube-system.config --name=client.admin --keyring=/var/lib/rook/kube-system/client.admin.keyring

From the operator pod, I am able to create the user manually.

Looking at pkg/operator/ceph/object/admin.go and pkg/daemon/ceph/client/command.go in v1.3.3, my Go skills don’t let me understand how a context could be created with an empty value of RunAsUser, despite it being a const set to ‘client.admin’.
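For illustration only, here is a minimal Go sketch of how an unpopulated RunAsUser value would end up as the bare `--name=` flag seen in the logs. The type and function names below are hypothetical stand-ins, not Rook’s actual API:

```go
package main

import (
	"fmt"
	"strings"
)

// CephUserContext is a hypothetical stand-in for the context that carries
// the Ceph client name used when invoking radosgw-admin.
type CephUserContext struct {
	RunAsUser   string // expected to default to "client.admin"
	ConfigPath  string
	KeyringPath string
}

// adminFlags sketches how the connection flags might be assembled: if
// RunAsUser is never populated, the result is an empty "--name=" flag.
func adminFlags(ctx CephUserContext) []string {
	return []string{
		fmt.Sprintf("--conf=%s", ctx.ConfigPath),
		fmt.Sprintf("--name=%s", ctx.RunAsUser),
		fmt.Sprintf("--keyring=%s", ctx.KeyringPath),
	}
}

func main() {
	// A context built without RunAsUser reproduces the broken command line.
	broken := CephUserContext{
		ConfigPath:  "/var/lib/rook/kube-system/kube-system.config",
		KeyringPath: "/var/lib/rook/kube-system/.keyring",
	}
	fmt.Println("radosgw-admin user create", strings.Join(adminFlags(broken), " "))
}
```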

Expected behavior: the Ceph user and the bucket should be created successfully.

How to reproduce it (minimal and precise): In a similar setup to https://github.com/rook/rook/issues/5439 but:

  • upgrading to v1.3.3
  • moving to kube-system as I originally wanted
  • fixing that bug by creating the rook-ceph-config secret manually
  • setting ROOK_LOG_LEVEL to DEBUG, to display the commands being run (see the example below).
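One way to enable that, assuming the operator runs as the usual rook-ceph-operator Deployment in kube-system (consistent with the pod listing below), is:

$ kubectl -n kube-system set env deployment/rook-ceph-operator ROOK_LOG_LEVEL=DEBUG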
$ kubectl -n kube-system get pods -l app=rook-ceph-operator -o yaml | grep rook/ceph
      image: rook/ceph:v1.3.3
      image: docker.io/rook/ceph:v1.3.3
      imageID: docker.io/rook/ceph@sha256:09e53d535bb711c5f433a3a05ea327087193227b14089319f20848f3be1fc70c
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: arch-cloud
  namespace: kube-system
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  preservePoolsOnDelete: true
  gateway:
    type: s3
    sslCertificateRef:
    port: 80
    securePort:
    instances: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: s3
provisioner: kube-system.ceph.rook.io/bucket
reclaimPolicy: Delete
parameters:
  objectStoreName: arch-cloud
  objectStoreNamespace: kube-system
  region: us-east-1
I0510 15:47:46.221125       7 controller.go:203]  "msg"="reconciling claim" "key"="gitlab/gitlab-artifacts"                                 
I0510 15:47:46.221168       7 helpers.go:80]  "msg"="getting claim for key" "key"="gitlab/gitlab-artifacts"                                 
I0510 15:47:46.226033       7 helpers.go:181]  "msg"="getting ObjectBucketClaim's StorageClass" "key"="gitlab/gitlab-artifacts"             
I0510 15:47:46.229996       7 helpers.go:186]  "msg"="got StorageClass" "key"="gitlab/gitlab-artifacts" "name"="s3"                         
I0510 15:47:46.230048       7 helpers.go:63]  "msg"="checking OBC for OB name, this indicates provisioning is complete" "key"="gitlab/gitlab-artifacts" "gitlab-artifacts"=null
I0510 15:47:46.230078       7 resourcehandlers.go:284]  "msg"="updating status:" "key"="gitlab/gitlab-artifacts" "new status"="Pending" "obc"="gitlab/gitlab-artifacts" "old status"="Pending"
I0510 15:47:46.237383       7 controller.go:265]  "msg"="syncing obc creation" "key"="gitlab/gitlab-artifacts"                              
I0510 15:47:46.237415       7 controller.go:544]  "msg"="getting OBC to set metadata fields" "key"="gitlab/gitlab-artifacts"                
I0510 15:47:46.241193       7 controller.go:553]  "msg"="updating OBC metadata" "key"="gitlab/gitlab-artifacts"                             
I0510 15:47:46.241231       7 resourcehandlers.go:275]  "msg"="updating" "key"="gitlab/gitlab-artifacts" "obc"="gitlab/gitlab-artifacts"    
I0510 15:47:46.248048       7 helpers.go:80]  "msg"="getting claim for key" "key"="gitlab/gitlab-artifacts"                                 
I0510 15:47:46.258666       7 controller.go:337]  "msg"="provisioning" "key"="gitlab/gitlab-artifacts" "bucket"="gitlab-artifacts"          
2020-05-10 15:47:46.258691 I | op-bucket-prov: initializing and setting CreateOrGrant services                                              
2020-05-10 15:47:46.258709 I | op-bucket-prov: getting storage class "s3"                                                                   
2020-05-10 15:47:46.335650 I | op-bucket-prov: Provision: creating bucket "gitlab-artifacts" for OBC "gitlab-artifacts"                     
2020-05-10 15:47:46.335712 I | ceph-object-controller: Getting user: ceph-user-ZDWhL1M1                                                     
2020-05-10 15:47:46.335736 D | exec: Running command: radosgw-admin user info --uid ceph-user-ZDWhL1M1 --rgw-realm=arch-cloud --rgw-zonegroup=arch-cloud --cluster=kube-system --conf=/var/lib/rook/kube-system/kube-system.config --name= --keyring=/var/lib/rook/kube-system/.keyring
2020-05-10 15:47:46.352256 I | op-bucket-prov: creating Ceph user "ceph-user-ZDWhL1M1"                                                      
2020-05-10 15:47:46.352286 D | ceph-object-controller: Creating user: ceph-user-ZDWhL1M1                                                    
2020-05-10 15:47:46.352310 D | exec: Running command: radosgw-admin user create --uid ceph-user-ZDWhL1M1 --display-name ceph-user-ZDWhL1M1 --rgw-realm=arch-cloud --rgw-zonegroup=arch-cloud --cluster=kube-system --conf=/var/lib/rook/kube-system/kube-system.config --name= --keyring=/var/lib/rook/kube-system/.keyring
E0510 15:47:46.365745       7 controller.go:190] error syncing 'gitlab/gitlab-artifacts': error provisioning bucket: Provision: can't create ceph user: error creating ceph user "ceph-user-ZDWhL1M1": failed to create user: failed to run radosgw-admin: exit status 1: failed to create user: failed to run radosgw-admin: exit status 1, requeuing

Most upvoted comments

@leseb Tested: after reinstalling my cluster, adding a spare partition on the node, and running a Rook-managed Ceph cluster, the user can be created, and I get my secret and ConfigMap in the namespace. This bug really seems tied to the external cluster feature.

I run a Kubernetes cluster (without any Rook/Ceph inside it) consuming an external Ceph cluster. Its purpose is to be accessed only by Kubernetes, so I would like to use all the management features offered by Rook. Does that answer the question?

Yes, thanks.

EDIT: I should also note that the hosts are the same, but since I run my own custom Kubernetes installer, I don’t want to completely break my Ceph cluster when I screw up the Kubernetes cluster. So logically these are two distinct clusters (both managed by Ansible).

No worries, no need to break anything.

I’m still trying to figure out why the username gets lost.