rook: Ceph Mirroring: Cannot read conf file, Permission Denied
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior:
Executing the following command does not produce an error at the CLI:
[root@ip-172-20-36-253 ceph]# rbd mirror pool peer add replicapool client.admin@remote f9f5a20a-038c-48ae-96e1-5090d2cf276b
However, kubectl logs -f -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1 shows:
debug 2019-08-06 23:12:04.894 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:12:34.898 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:12:34.926 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:13:04.930 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:13:04.962 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:13:34.966 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:13:34.994 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:14:05.002 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:14:05.026 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
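A quick way to confirm whether the peer was actually registered on the local cluster, despite the log errors, is to query the pool's mirroring info from the toolbox. A sketch (the pool name matches the reproduction steps below; the exact output shape may vary by Ceph release):
$ rbd mirror pool info replicapool
# should list the mirroring mode and the configured peer (uuid, cluster "remote", client.admin)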
The remote files have the same permissions as the ceph.conf and keyring files:
[root@ip-172-20-36-253 ceph]# ls -la
total 28
drwxr-xr-x 1 root root 4096 Aug 6 23:02 .
drwxr-xr-x 1 root root 4096 Aug 6 21:58 ..
-rw-r--r-- 1 root root 121 Aug 6 21:58 ceph.conf
-rw-r--r-- 1 root root 62 Aug 6 21:58 keyring
-rw-r--r-- 1 root root 92 Jul 17 16:07 rbdmap
-rw-r--r-- 1 root root 62 Aug 6 23:01 remote.client.admin.keyring
-rw-r--r-- 1 root root 299 Aug 6 23:02 remote.conf
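Since the rbd-mirror daemon runs as ceph (uid:gid 167:167, per the startup log further down) and in a different pod than the toolbox, it is also worth verifying that the remote files exist and are readable inside the rbd-mirror pod itself, not only in the toolbox pod where they were created. A sketch, reusing the pod name from the logs:
$ kubectl exec -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1 -- ls -la /etc/ceph
$ kubectl exec -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1 -- cat /etc/ceph/remote.conf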
Expected behavior:
The mirroring peer is added without permission errors in the log.
How to reproduce it (minimal and precise):
The behavior has been reproduced in AWS on K8S-1.13 and OCP-v4.0.
- Use either kops or openshift-install to deploy two 1-master, 3-node clusters
- Create and attach block volumes, 1 per node
- kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/common.yaml -f github.com/rook/rook/cluster/examples/kubernetes/ceph/operator.yaml
- kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/cluster.yaml (enable the rbd worker(s) in cluster.yaml prior to creation)
- kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/pool.yaml
- kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/toolbox.yaml
- exec into the toolbox pod on the minion cluster and cat the key from /etc/ceph/keyring
- exec into the toolbox pod on the master cluster and create /etc/ceph/remote.conf and /etc/ceph/remote.client.admin.keyring
- edit remote.conf to add the minion cluster's mons' URLs as mon hosts
- edit remote.client.admin.keyring to add the key retrieved from the minion cluster
- enable mirroring (mode: pool) on both ceph clusters (a consolidated sketch of these toolbox steps follows the list)
- on the master cluster, exec rbd mirror pool peer add replicapool client.admin@remote
- get the logs of the rbd worker on the master cluster (the error takes a few seconds to begin appearing)
kubectl logs -f -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1
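For reference, a consolidated sketch of the toolbox-side steps above, run inside the master cluster's toolbox pod; the mon endpoint and key are placeholders to substitute with the minion cluster's actual values:
$ cat > /etc/ceph/remote.conf <<EOF
[global]
mon host = <minion-mon-endpoint>:6789
EOF
$ cat > /etc/ceph/remote.client.admin.keyring <<EOF
[client.admin]
    key = <key copied from the minion cluster's /etc/ceph/keyring>
EOF
# enable pool-mode mirroring on both clusters, then add the peer on the master
$ rbd mirror pool enable replicapool pool
$ rbd mirror pool peer add replicapool client.admin@remote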
File(s) to submit:
- cluster.yaml
#################################################################################################################
# Define the settings for the rook-ceph cluster with common settings for a production cluster.
# All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required
# in this example. See the documentation for more details on storage settings available.
# For example, to create the cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v12 is luminous, v13 is mimic, and v14 is nautilus.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v14 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    image: ceph/ceph:v14.2.2-20190722
    # Whether to allow unsupported versions of Ceph. Currently luminous, mimic and nautilus are supported, with the recommendation to upgrade to nautilus.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. Must be specified.
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # set the amount of mons to be started
  mon:
    count: 3
    allowMultiplePerNode: false
  # enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
    # serve the dashboard at the given port.
    # port: 8443
    # serve the dashboard using SSL
    # ssl: true
  # enable prometheus alerting for cluster
  monitoring:
    # requires Prometheus to be pre-installed
    enabled: false
    # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
    # Recommended:
    # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
    # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
    # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
    rulesNamespace: rook-ceph
  network:
    # toggle to use hostNetwork
    hostNetwork: false
  rbdMirroring:
    # The number of daemons that will perform the rbd mirroring.
    # rbd mirroring must be configured with "rbd mirror" from the rook toolbox.
    workers: 3
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
  # tolerate taints with a key of 'storage-node'.
  # placement:
  #   all:
  #     nodeAffinity:
  #       requiredDuringSchedulingIgnoredDuringExecution:
  #         nodeSelectorTerms:
  #         - matchExpressions:
  #           - key: role
  #             operator: In
  #             values:
  #             - storage-node
  #     podAffinity:
  #     podAntiAffinity:
  #     tolerations:
  #     - key: storage-node
  #       operator: Exists
  # The above placement information can also be specified for mon, osd, and mgr components
  #   mon:
  #   osd:
  #   mgr:
  annotations:
  # all:
  # mon:
  # osd:
  # If no mgr annotations are set, prometheus scrape annotations will be set by default.
  # mgr:
  resources:
  # The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
  # mgr:
  #   limits:
  #     cpu: "500m"
  #     memory: "1024Mi"
  #   requests:
  #     cpu: "500m"
  #     memory: "1024Mi"
  # The above example requests/limits can also be added to the mon and osd components
  # mon:
  # osd:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: true
    deviceFilter: xvd*
    location:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
      # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
      # journalSizeMB: "1024" # uncomment if the disks are 20 GB or smaller
      # osdsPerDevice: "1" # this value can be overridden at the node or device level
      # encryptedDevice: "true" # the default value for this option is "false"
    # Cluster level list of directories to use for filestore-based OSD storage. If uncommented, this example would create an OSD under the dataDirHostPath.
    # directories:
    # - path: /var/lib/rook
    # Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
    # nodes below will be used as storage resources. Each node's 'name' field should match their 'kubernetes.io/hostname' label.
    # nodes:
    # - name: "172.17.4.101"
    #   directories: # specific directories to use for storage can be specified for each node
    #   - path: "/rook/storage-dir"
    #   resources:
    #     limits:
    #       cpu: "500m"
    #       memory: "1024Mi"
    #     requests:
    #       cpu: "500m"
    #       memory: "1024Mi"
    # - name: "172.17.4.201"
    #   devices: # specific devices to use for storage can be specified for each node
    #   - name: "sdb"
    #   - name: "nvme01" # multiple osds can be created on high performance devices
    #     config:
    #       osdsPerDevice: "5"
    #   config: # configuration can be specified at the node level which overrides the cluster level config
    #     storeType: filestore
    # - name: "172.17.4.301"
    #   deviceFilter: "^sd."
- rbd-mirror pod logs
$ kubectl logs -f -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1
debug 2019-08-06 21:58:33.744 7f92c5274fc0 0 set uid:gid to 167:167 (ceph:ceph)
debug 2019-08-06 21:58:33.744 7f92c5274fc0 0 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process rbd-mirror, pid 1
debug 2019-08-06 21:58:33.776 7f92c5274fc0 1 mgrc service_daemon_register rbd-mirror.14206 metadata {arch=x86_64,ceph_release=nautilus,ceph_version=ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable),ceph_version_short=14.2.2,container_hostna
me=rook-ceph-rbd-mirror-a-7b465b89cc-zsj22,container_image=ceph/ceph:v14.2.2-20190722,cpu=Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz,distro=centos,distro_description=CentOS Linux 7 (Core),distro_version=7,hostname=ip-172-20-36-253.ec2.internal,id=a,instance_id=14206,kernel
_description=#1 SMP Debian 4.9.168-1+deb9u3 (2019-06-16),kernel_version=4.9.0-9-amd64,mem_swap_kb=0,mem_total_kb=4049096,os=Linux,pod_name=rook-ceph-rbd-mirror-a-7b465b89cc-zsj22,pod_namespace=rook-ceph}
debug 2019-08-06 23:03:04.313 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:03:34.317 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:03:34.341 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:04:04.345 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:04:04.377 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:04:34.381 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:04:34.409 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:05:04.413 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:05:04.441 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:05:34.445 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:05:34.473 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:06:04.477 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:06:04.505 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:06:34.509 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:06:34.537 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
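For completeness, the replication state can also be inspected from the toolbox once a peer is configured; with the permission error above, the daemon/peer health would be expected to show as unhealthy. A sketch:
$ rbd mirror pool status replicapool --verbose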
Environment:
- Cloud provider or hardware configuration: AWS automated deployment, tested on both OCP4.0 and K8s-1.13
- Rook version (use
rook versioninside of a Rook Pod): rook: v1.0.0-335.g1f8bd1f - Storage backend version (e.g. for ceph do
ceph -v): ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable) - Kubernetes version (use
kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T16:54:35Z", GoVersion:"go1.12.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): K8s and OCP, hosted on AWS instances
- Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
[root@ip-172-20-43-121 /]# ceph health
HEALTH_OK
Additionally, I can confirm that the remote mons are reachable from the master cluster’s mons:
[root@ip-172-20-36-253 ceph]# curl -v a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com:6789
* About to connect() to a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com port 6789 (#0)
* Trying 34.194.127.186...
* Connected to a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com (34.194.127.186) port 6789 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com:6789
> Accept: */*
>
ceph v027dDz
@travisn @shaekhhasanshoron
It’s been a little bit, but when I did test this out I documented my steps as a reference point. Below is the manual approach I took (not using bootstrap). Of course, all of this depends on network connectivity between your clusters, so make sure your mon pods can communicate with each other over port 6789.
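The same kind of check shown earlier in this issue works for that: from a pod in one cluster, hit the other cluster's mon endpoint and look for the ceph banner. A sketch with a placeholder endpoint:
$ curl -v <remote-mon-endpoint>:6789
# a reachable mon answers with a banner beginning with "ceph v...", as in the curl output above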