rook: op-cluster: Failed to configure external ceph cluster

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior: Yes, the operator fails to configure the external Ceph cluster.

Expected behavior: The operator succeeds in configuring the external Ceph cluster.

How to reproduce it (minimal and precise):

Tested using minikube 1.11:

kubectl apply -f rook/cluster/examples/kubernetes/ceph/common.yaml
kubectl apply -f rook/cluster/examples/kubernetes/ceph/operator.yaml
export ROOK_EXTERNAL_ADMIN_SECRET=...
export ROOK_EXTERNAL_CEPH_MON_DATA=...
export ROOK_EXTERNAL_FSID=...
export NAMESPACE=rook-ceph
bash import-external-cluster.sh
kubectl apply -f cluster-external.yaml (see below)
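
For reference, the exported values above can typically be gathered on the external cluster like this (a sketch; the mon endpoint is a placeholder, and the exact ROOK_EXTERNAL_CEPH_MON_DATA format should be checked against import-external-cluster.sh):

# run against the external Ceph cluster
export ROOK_EXTERNAL_FSID=$(ceph fsid)
export ROOK_EXTERNAL_ADMIN_SECRET=$(ceph auth get-key client.admin)
# comma-separated name=endpoint pairs; placeholder address shown
export ROOK_EXTERNAL_CEPH_MON_DATA=a=10.0.0.1:6789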

File(s) to submit: None

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  external:
    enable: true
  crashCollector:
    disable: true
  • Operator’s logs, if necessary
op-cluster: failed to configure external ceph cluster. failed to create csi kubernetes secrets: failed to create kubernetes csi secret: failed to create kubernetes secret map["adminID":"csi-cephfs-provisioner" "adminKey":"AQAk5vheLS6aABAAM5660HBPVNb1GT/2FqGmaw=="] for cluster "rook-ceph": failed to update secret for rook-csi-cephfs-provisioner: Secret "rook-csi-cephfs-provisioner" is invalid: type: Invalid value: "kubernetes.io/rook": field is immutable

The same error occurs for the other CSI secrets (rook-csi-ceph-rbd-provisioner, …). The secrets already exist in the namespace; since a Secret's type field is immutable in Kubernetes, the operator's attempt to update them with type "kubernetes.io/rook" fails.
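
To see the type conflict directly, the existing secret's type can be inspected (a sketch; namespace assumed to be rook-ceph):

kubectl -n rook-ceph get secret rook-csi-cephfs-provisioner -o jsonpath='{.type}'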

  • Crashing pod(s) logs, if necessary

Environment: minikube 1.11, also tested on Rancher and k3s.

  • OS (e.g. from /etc/os-release):

  • Kernel (e.g. uname -a): Linux minikube 4.19.107 #1 SMP Thu May 28 15:07:17 PDT 2020 x86_64 GNU/Linux

  • Cloud provider or hardware configuration: N/A

  • Rook version (use rook version inside of a Rook Pod): rook: v1.3.6 go: go1.13.8

  • Storage backend version (e.g. for ceph do ceph -v): ceph version 14.2.9 (bed944f8c45b9c98485e99b70e11bbcec6f6659a) nautilus (stable)

  • Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:43:34Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): N/A

  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_WARN OSD count 2 < osd_pool_default_size 3 (local test cluster, not recommended size)


Most upvoted comments

I might have found a solution, @cypher0n3.

You need to edit the rook-ceph-mon secret and remove the empty fields (such as the username) using:

kubectl edit secrets rook-ceph-mon -o yaml

and also edit the import script so that it does not create the rook-csi-cephfs-node, rook-csi-cephfs-provisioner, rook-csi-rbd-node, and rook-csi-rbd-provisioner secrets.

Mine connected after doing that.
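
If the import script has already been run, deleting the four secrets has the same effect, since the operator recreates them (a sketch, assuming the rook-ceph namespace; this is exactly what the workaround further down does):

kubectl -n rook-ceph delete secrets rook-csi-cephfs-node rook-csi-cephfs-provisioner rook-csi-rbd-node rook-csi-rbd-provisioner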

@leolzbing521 I have had more luck with https://github.com/ceph/ceph-csi, which is obviously not the right solution if you want Rook and the features it offers, but for our use case it was sufficient.

However, I would like to see this issue fixed too 😃.

I see, but I really want to see this issue fixed.

Ran into this again with Rook 1.4.

2021-01-06 12:44:00.791503 E | cephclient: ceph username is empty

This time I edited the rook-ceph-mon secret and removed the ceph-username field, but only after applying cluster-external.yaml.

Then I got this error:

2021-01-07 10:20:47.882355 E | cephclient: ceph secret is empty

Workaround

kubectl apply -f common-external.yaml
bash import-external-cluster.sh
alias kre="kubectl -n rook-ceph-external"
kre get secrets
kre delete secrets rook-csi-cephfs-node rook-csi-cephfs-provisioner rook-csi-rbd-node rook-csi-rbd-provisioner
kre get secrets
kre edit secret rook-ceph-mon

Remove the empty fields ceph-secret and ceph-username (do this before applying cluster-external.yaml):

apiVersion: v1
data:
  admin-secret: aMXc9PQ==
  ceph-secret: ""
  ceph-username: ""
  cluster-name: LWV4dGVybmFs
  fsid: tYmEyMzNhYTFkYjU1
  mon-secret: dA==
kind: Secret
metadata:
  creationTimestamp: "2021-01-08T05:24:47Z"
  name: rook-ceph-mon
  namespace: rook-ceph-external
  resourceVersion: "15376111"
  selfLink: /api/v1/namespaces/rook-ceph-external/secrets/rook-ceph-mon
  uid: a007b689-a8db-4fd6-a325-1eb61d204e12
type: Opaque
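
Equivalently, the two empty keys can be removed non-interactively with a JSON patch (a sketch using the key names shown above):

kubectl -n rook-ceph-external patch secret rook-ceph-mon --type=json \
  -p='[{"op": "remove", "path": "/data/ceph-secret"}, {"op": "remove", "path": "/data/ceph-username"}]'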

Apply cluster-external.yaml:

kubectl create -f cluster-external.yaml
kre get cephcluster

root@green-1:~/rook/cluster/examples/kubernetes/ceph# kre get cephcluster
NAME                 DATADIRHOSTPATH   MONCOUNT   AGE    PHASE       MESSAGE                          HEALTH
rook-ceph-external                                8m2s   Connected   Cluster connected successfully   HEALTH_WARN
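
To confirm that the operator recreated the CSI secrets deleted earlier (a quick check using the kre alias from above):

kre get secrets | grep rook-csi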

I ran into the same or some similar issue and found this bug thanks to a hint in the slack channel.

@MicDeDuiwel's suggestion resolved my issue; at least the cluster is now connected. Testing of RBD/CephFS/RadosGW is still pending.

What would be the best way to fix the issue? Mention this issue in the documentation (probably not)? Introduce a new variant of the bash script, import-external-cluster-management.sh, without the rbd and cephfs secrets, and describe that in the documentation? Or should the operator be fixed to handle empty keys in the monitor secret (I don't speak Go, so I don't know if I would be up to that task)? Or something else?
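
For the script-side option, here is a minimal sketch of the "skip empty keys" idea in shell; the variable and key names are assumptions for illustration and do not reflect the actual import script:

# only add a key to the rook-ceph-mon secret when its value is non-empty
args=(--from-literal=fsid="$ROOK_EXTERNAL_FSID" --from-literal=admin-secret="$ROOK_EXTERNAL_ADMIN_SECRET")
[ -n "$ROOK_EXTERNAL_USERNAME" ] && args+=(--from-literal=ceph-username="$ROOK_EXTERNAL_USERNAME")
[ -n "$ROOK_EXTERNAL_CEPH_SECRET" ] && args+=(--from-literal=ceph-secret="$ROOK_EXTERNAL_CEPH_SECRET")
kubectl -n "$NAMESPACE" create secret generic rook-ceph-mon "${args[@]}"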

I ran into the same issue, and following the steps mentioned here made it work for me.

Thanks @MicDeDuiwel! That worked for me as well.

Likewise: deleting rook-csi-cephfs-node, rook-csi-cephfs-provisioner, rook-csi-rbd-node, and rook-csi-rbd-provisioner worked for me too. The operator recreated them and it works.