rook: External Ceph Cluster stuck in "Connecting"
- Bug Report
Expected behavior:
CephCluster state is “CONNECTED”
Actual behavior:
CephCluster state is “CONNECTING”
How to reproduce it (minimal and precise):
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl create -f common-external.yaml
kubectl create -f cluster-external.yaml
bash cluster/examples/kubernetes/ceph/import-external-cluster.sh
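For context, the import script expects the external cluster’s connection details to be exported as environment variables before it is run; the exact variable names depend on the Rook version, but in the v1.1 era they were along these lines (all values below are placeholders):
export NAMESPACE=rook-ceph                # namespace where the external CephCluster CR lives
export ROOK_EXTERNAL_FSID=<fsid of the external cluster>
export ROOK_EXTERNAL_CEPH_MON_DATA=a=10.0.0.1:6789,b=10.0.0.2:6789,c=10.0.0.3:6789
export ROOK_EXTERNAL_ADMIN_SECRET=<client.admin key>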
File(s) to submit (operator log excerpt):
2019-09-24 02:09:00.811884 W | op-cluster: waiting for the connection info of the external cluster. retrying in 1m0s.
2019-09-24 02:10:00.816375 W | op-cluster: waiting for the connection info of the external cluster. retrying in 1m0s.
2019-09-24 02:11:00.819592 W | op-cluster: waiting for the connection info of the external cluster. retrying in 1m0s.
2019-09-24 02:12:00.823023 W | op-cluster: waiting for the connection info of the external cluster. retrying in 1m0s.
Environment:
- OS: RancherOS
- Kernel (e.g. uname -a): 4.14.85-rancher
- Cloud provider or hardware configuration: vSphere Virtual Machines
- Rook version (use rook version inside of a Rook Pod): v1.1.1
- Storage backend version (e.g. for ceph do ceph -v): ceph version 12.2.12-48.el7cp (26388d73d88602005946d4381cc5796d42904858) luminous (stable)
- Kubernetes version (use kubectl version): v1.14.6
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Rancher managed
- Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox): HEALTH_OK
Other:
I have an external Ceph cluster and would like to dynamically provision RBD volumes for Pod PVCs through a StorageClass with Rook, using Ceph CSI.
I think I have missed a step somewhere, as I have been unable to get Rook to connect to an external Ceph cluster. I’m not using any internal Ceph cluster, so I followed the documentation for external clusters, which I interpreted as:
- set the environment variables and run the shell script
- verify that a ConfigMap and Secret have been created (see the check after this list)
- inject common.yaml
- inject operator.yaml
- inject common-external.yaml
- modify cluster-external.yaml, changing the namespace to “rook-ceph”
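For the verification step above, a quick check would be something like the following, assuming the import script creates the mon Secret and endpoints ConfigMap in the rook-ceph namespace (resource names may differ between Rook versions):
kubectl -n rook-ceph get secret rook-ceph-mon
kubectl -n rook-ceph get configmap rook-ceph-mon-endpoints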
At this stage the Pods were in a CrashLoopBackOff, and I had to perform the following.
In the operator YAML I modified:
# RancherOS requires a different directory
- name: FLEXVOLUME_DIR_PATH
value: "/var/lib/kubelet/volumeplugins"
# RancherOS requires a specific value for the Kubelet
- name: ROOK_CSI_KUBELET_DIR_PATH
value: "/opt/rke/var/lib/kubelet"
I also had to modify the DaemonSet and change the following flag:
- --containerized=false
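As a sanity check on the kubelet path override (not from the Rook docs, just a suggestion): on RKE the kubelet runs as a Docker container named kubelet, so its host bind mounts show where /var/lib/kubelet really lives on a RancherOS node:
# run on a node; prints "host path -> container path" for each kubelet mount
docker inspect kubelet --format '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{"\n"}}{{end}}'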
With this in place, I have the CephCluster state stuck in “CONNECTING”.
I can see the following error when I look at the csi-rbdplugin container logs within the operator pod:
E0924 02:44:39.175178 1 utils.go:123] ID: 25 GRPC error: rpc error: code = InvalidArgument desc = failed to fetch monitor list using clusterID (rook-ceph): missing configuration for cluster ID (rook-ceph)
So where should the Ceph CSI RBD plugin be reading this information from?
If I look at the documentation, it seems to indicate that a ConfigMap is required with the clusterID and monitors. Is the Rook operator supposed to create this from the information provided in the Secret and ConfigMap created by the bash script?
The samples given within the Ceph CSI repo indicate there is another ConfigMap.
I’ve definitely missed something here, but I can’t figure it out at this point.
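For reference, the ConfigMap in question shows up later in this thread as rook-ceph-csi-config. A rough sketch of what a populated one would look like is below; the monitor addresses are placeholders and the exact layout may vary by Rook / Ceph CSI version:
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-csi-config
  namespace: rook-ceph
data:
  csi-cluster-config-json: |-
    [{"clusterID":"rook-ceph","monitors":["10.0.0.1:6789","10.0.0.2:6789","10.0.0.3:6789"]}]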
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 27 (11 by maintainers)
Getting a similar issue here.
Procedure followed:
Results:
I have got the same error.
Environment:
I just want to use Rook to connect to my external Ceph cluster, so there is no internal Ceph cluster.
How to reproduce it (minimal and precise):
cluster-external.yaml:
storageclass.yaml (I changed the pool name to ‘rbd’, which is the pool name of my external Ceph pool):
The csi-cluster-config-json in cm/rook-ceph-csi-config is empty.
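To confirm that, the ConfigMap contents can be dumped directly (assuming the rook-ceph namespace):
kubectl -n rook-ceph get configmap rook-ceph-csi-config -o yaml
kubectl -n rook-ceph get configmap rook-ceph-csi-config -o jsonpath='{.data.csi-cluster-config-json}'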
Part of the error log:
Found some errors in the ceph-operator log:
Does this mean that Rook 1.1.1 doesn’t support my external Ceph Luminous cluster?
Because RHCOS nodes don’t support a StorageClass with the “kubernetes.io/rbd” provisioner, Rook is my only solution.
I don’t know whether the failure is caused by a misconfiguration or because Rook doesn’t support Ceph 12.2.12.
I should also add that I am able to use a StorageClass with the “kubernetes.io/rbd” provisioner successfully, so it would seem to be some configuration I am missing on the Rook / Ceph CSI side.
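For comparison, a CSI-based StorageClass for this setup would look roughly like the sketch below, based on the Rook v1.1 example; the clusterID must match an entry in rook-ceph-csi-config, and the secret names and provisioner prefix depend on the operator namespace:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # must match a clusterID entry in the rook-ceph-csi-config ConfigMap
  clusterID: rook-ceph
  # pool on the external cluster, e.g. "rbd" as used above
  pool: rbd
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete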