rook: Ceph Mirroring: Cannot read conf file, Permission Denied

Is this a bug report or feature request?

  • Bug Report

Deviation from expected behavior:

Executing the following does not produce an error at the CLI; the command returns the new peer UUID:

[root@ip-172-20-36-253 ceph]# rbd mirror pool peer add replicapool client.admin@remote
f9f5a20a-038c-48ae-96e1-5090d2cf276b

However, kubectl logs -f -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1 shows

debug 2019-08-06 23:12:04.894 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:12:34.898 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:12:34.926 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:13:04.930 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:13:04.962 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:13:34.966 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:13:34.994 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:14:05.002 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:14:05.026 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied

The remote files have the same permissions as the ceph.conf and keyring files:

[root@ip-172-20-36-253 ceph]# ls -la
total 28
drwxr-xr-x 1 root root 4096 Aug  6 23:02 .
drwxr-xr-x 1 root root 4096 Aug  6 21:58 ..
-rw-r--r-- 1 root root  121 Aug  6 21:58 ceph.conf
-rw-r--r-- 1 root root   62 Aug  6 21:58 keyring
-rw-r--r-- 1 root root   92 Jul 17 16:07 rbdmap
-rw-r--r-- 1 root root   62 Aug  6 23:01 remote.client.admin.keyring
-rw-r--r-- 1 root root  299 Aug  6 23:02 remote.conf
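
For completeness, the same directory can also be checked from the rbd-mirror daemon pod itself (it is the process reporting the error), and the peer registration can be inspected from the toolbox. A rough sketch, using the pod and pool names from above, output omitted:

# does the rbd-mirror daemon pod (not just the toolbox) see the remote conf and keyring?
kubectl -n rook-ceph exec rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1 -- ls -la /etc/ceph

# from the toolbox: which client, mon host, and key is the peer registered with?
rbd mirror pool info replicapool --all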

Expected behavior:

The mirroring peer is added without permissions errors in the log.

How to reproduce it (minimal and precise):

The behavior has been reproduced in AWS on K8S-1.13 and OCP-v4.0.

  1. Use either kops or openshift-install to deploy two 1-master, 3-node clusters
  2. Create and attach block volumes, 1 per node
  3. kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/common.yaml -f github.com/rook/rook/cluster/examples/kubernetes/ceph/operator.yaml
  4. kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/cluster.yaml # enable the rbd mirror worker(s) under rbdMirroring prior to creation
  5. kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/pool.yaml
  6. kubectl create -f github.com/rook/rook/cluster/examples/kubernetes/ceph/toolbox.yaml
  7. exec into the toolbox pod on the minion cluster and cat the key from /etc/ceph/keyring
  8. exec into the toolbox pod on the master cluster and create /etc/ceph/remote.conf and /etc/ceph/remote.client.admin.keyring
  9. edit remote.conf and add the minion cluster's mon URLs as the mon hosts
  10. edit remote.client.admin.keyring to add the key retrieved from the minion cluster (a sketch of both files follows this list)
  11. enable mirroring with mode: pool on both Ceph clusters
  12. on the master cluster, exec rbd mirror pool peer add replicapool client.admin@remote
  13. get the logs of the rbd worker on the master cluster (the error takes a few seconds to begin appearing): kubectl logs -f -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1
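
For reference, a minimal sketch of what steps 8 through 12 amount to inside the master cluster's toolbox; the mon endpoint and key below are placeholders, not values taken from this cluster:

# /etc/ceph/remote.conf: point the "remote" cluster name at the minion cluster's mons
cat <<EOF > /etc/ceph/remote.conf
[global]
mon host = <minion-mon-endpoint>:6789
EOF

# /etc/ceph/remote.client.admin.keyring: the key read from the minion toolbox in step 7
cat <<EOF > /etc/ceph/remote.client.admin.keyring
[client.admin]
    key = <key-from-minion-cluster>
EOF

# enable pool-mode mirroring and register the peer
rbd mirror pool enable replicapool pool
rbd mirror pool peer add replicapool client.admin@remote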

File(s) to submit:

cluster.yaml

#################################################################################################################
# Define the settings for the rook-ceph cluster with common settings for a production cluster.
# All nodes with available raw devices will be used for the Ceph cluster. At least three nodes are required
# in this example. See the documentation for more details on storage settings available.

# For example, to create the cluster:
#   kubectl create -f common.yaml
#   kubectl create -f operator.yaml
#   kubectl create -f cluster.yaml
#################################################################################################################

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # The container image used to launch the Ceph daemon pods (mon, mgr, osd, mds, rgw).
    # v12 is luminous, v13 is mimic, and v14 is nautilus.
    # RECOMMENDATION: In production, use a specific version tag instead of the general v14 flag, which pulls the latest release and could result in different
    # versions running within the cluster. See tags available at https://hub.docker.com/r/ceph/ceph/tags/.
    image: ceph/ceph:v14.2.2-20190722
    # Whether to allow unsupported versions of Ceph. Currently luminous, mimic and nautilus are supported, with the recommendation to upgrade to nautilus.
    # Do not set to true in production.
    allowUnsupported: false
  # The path on the host where configuration files will be persisted. Must be specified.
  # Important: if you reinstall the cluster, make sure you delete this directory from each host or else the mons will fail to start on the new cluster.
  # In Minikube, the '/data' directory is configured to persist across reboots. Use "/data/rook" in Minikube environment.
  dataDirHostPath: /var/lib/rook
  # set the amount of mons to be started
  mon:
    count: 3
    allowMultiplePerNode: false
  # enable the ceph dashboard for viewing cluster status
  dashboard:
    enabled: true
    # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
    # urlPrefix: /ceph-dashboard
    # serve the dashboard at the given port.
    # port: 8443
    # serve the dashboard using SSL
    # ssl: true
  # enable prometheus alerting for cluster
  monitoring:
    # requires Prometheus to be pre-installed
    enabled: false
    # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
    # Recommended:
    # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
    # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
    # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
    rulesNamespace: rook-ceph
  network:
    # toggle to use hostNetwork
    hostNetwork: false
  rbdMirroring:
    # The number of daemons that will perform the rbd mirroring.
    # rbd mirroring must be configured with "rbd mirror" from the rook toolbox.
    workers: 3
  # To control where various services will be scheduled by kubernetes, use the placement configuration sections below.
  # The example under 'all' would have all services scheduled on kubernetes nodes labeled with 'role=storage-node' and
  # tolerate taints with a key of 'storage-node'.
#  placement:
#    all:
#      nodeAffinity:
#        requiredDuringSchedulingIgnoredDuringExecution:
#          nodeSelectorTerms:
#          - matchExpressions:
#            - key: role
#              operator: In
#              values:
#              - storage-node
#      podAffinity:
#      podAntiAffinity:
#      tolerations:
#      - key: storage-node
#        operator: Exists
# The above placement information can also be specified for mon, osd, and mgr components
#    mon:
#    osd:
#    mgr:
  annotations:
#    all:
#    mon:
#    osd:
# If no mgr annotations are set, prometheus scrape annotations will be set by default.
#   mgr:
  resources:
# The requests and limits set here, allow the mgr pod to use half of one CPU core and 1 gigabyte of memory
#    mgr:
#      limits:
#        cpu: "500m"
#        memory: "1024Mi"
#      requests:
#        cpu: "500m"
#        memory: "1024Mi"
# The above example requests/limits can also be added to the mon and osd components
#    mon:
#    osd:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: true
    deviceFilter: xvd*
    location:
    config:
      # The default and recommended storeType is dynamically set to bluestore for devices and filestore for directories.
      # Set the storeType explicitly only if it is required not to use the default.
      # storeType: bluestore
      # metadataDevice: "md0" # specify a non-rotational storage so ceph-volume will use it as block db device of bluestore.
      # databaseSizeMB: "1024" # uncomment if the disks are smaller than 100 GB
      # journalSizeMB: "1024"  # uncomment if the disks are 20 GB or smaller
      # osdsPerDevice: "1" # this value can be overridden at the node or device level
      # encryptedDevice: "true" # the default value for this option is "false"
# Cluster level list of directories to use for filestore-based OSD storage. If uncommented, this example would create an OSD under the dataDirHostPath.
    #directories:
    #- path: /var/lib/rook
# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named
# nodes below will be used as storage resources.  Each node's 'name' field should match their 'kubernetes.io/hostname' label.
#    nodes:
#    - name: "172.17.4.101"
#      directories: # specific directories to use for storage can be specified for each node
#      - path: "/rook/storage-dir"
#      resources:
#        limits:
#          cpu: "500m"
#          memory: "1024Mi"
#        requests:
#          cpu: "500m"
#          memory: "1024Mi"
#    - name: "172.17.4.201"
#      devices: # specific devices to use for storage can be specified for each node
#      - name: "sdb"
#      - name: "nvme01" # multiple osds can be created on high performance devices
#        config:
#          osdsPerDevice: "5"
#      config: # configuration can be specified at the node level which overrides the cluster level config
#        storeType: filestore
#    - name: "172.17.4.301"
#      deviceFilter: "^sd."
  • rbd-mirror pod logs
$ kubectl logs -f -n rook-ceph rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1
debug 2019-08-06 21:58:33.744 7f92c5274fc0  0 set uid:gid to 167:167 (ceph:ceph)
debug 2019-08-06 21:58:33.744 7f92c5274fc0  0 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process rbd-mirror, pid 1
debug 2019-08-06 21:58:33.776 7f92c5274fc0  1 mgrc service_daemon_register rbd-mirror.14206 metadata {arch=x86_64,ceph_release=nautilus,ceph_version=ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable),ceph_version_short=14.2.2,container_hostname=rook-ceph-rbd-mirror-a-7b465b89cc-zsj22,container_image=ceph/ceph:v14.2.2-20190722,cpu=Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz,distro=centos,distro_description=CentOS Linux 7 (Core),distro_version=7,hostname=ip-172-20-36-253.ec2.internal,id=a,instance_id=14206,kernel_description=#1 SMP Debian 4.9.168-1+deb9u3 (2019-06-16),kernel_version=4.9.0-9-amd64,mem_swap_kb=0,mem_total_kb=4049096,os=Linux,pod_name=rook-ceph-rbd-mirror-a-7b465b89cc-zsj22,pod_namespace=rook-ceph}
debug 2019-08-06 23:03:04.313 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:03:34.317 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:03:34.341 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:04:04.345 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:04:04.377 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:04:34.381 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:04:34.409 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:05:04.413 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:05:04.441 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:05:34.445 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:05:34.473 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:06:04.477 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:06:04.505 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied
debug 2019-08-06 23:06:34.509 7f92c5274fc0 -1 rbd::mirror::Mirror: 0x5630957d9d40 update_pool_replayers: restarting failed pool replayer for uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin
debug 2019-08-06 23:06:34.537 7f92c5274fc0 -1 rbd::mirror::PoolReplayer: 0x563095896190 init_rados: could not read ceph conf for remote peer uuid: f9f5a20a-038c-48ae-96e1-5090d2cf276b cluster: remote client: client.admin: (13) Permission denied

Environment:

  • Cloud provider or hardware configuration: AWS automated deployment, tested on both OCP 4.0 and K8s 1.13
  • Rook version (use rook version inside of a Rook Pod): rook: v1.0.0-335.g1f8bd1f
  • Storage backend version (e.g. for ceph do ceph -v): ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable)
  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T16:54:35Z", GoVersion:"go1.12.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): K8s and OCP, hosted on AWS instances
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
[root@ip-172-20-43-121 /]# ceph health
HEALTH_OK

Additionally, I can confirm that the remote mons are reachable from the master cluster’s mons:

[root@ip-172-20-36-253 ceph]# curl -v a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com:6789
* About to connect() to a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com port 6789 (#0)
*   Trying 34.194.127.186...
* Connected to a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com (34.194.127.186) port 6789 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com:6789
> Accept: */*
>
ceph v027dDz
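
The same reachability check can also be run from inside the rbd-mirror pod, since that daemon is the one failing to read the remote conf. A rough sketch, assuming curl is present in the ceph image, with the pod name and mon endpoint from above:

kubectl -n rook-ceph exec rook-ceph-rbd-mirror-a-7b465b89cc-zsj22 --context=c1 -- \
  curl -v --max-time 5 a0f5d548fb89c11e9adc30ebfca9798d-708516057.us-east-1.elb.amazonaws.com:6789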


Most upvoted comments

@travisn @shaekhhasanshoron

It’s been a little while, but when I tested this out I documented my steps as a reference point. Below is the manual approach I used (not using bootstrap):

Of course, all of this depends on network connectivity between your clusters, so make sure your mon pods can communicate with each other over port 6789 (one way to list the mon endpoints on each side is sketched just below).
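
A rough sketch for listing the mon service endpoints, assuming the default rook-ceph-mon service labels; the c1/c2 contexts are illustrative:

# list mon services with their cluster IPs and ports on each cluster
kubectl -n rook-ceph get svc -l app=rook-ceph-mon -o wide --context=c1
kubectl -n rook-ceph get svc -l app=rook-ceph-mon -o wide --context=c2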

MANUAL METHOD:

From the toolbox on Cluster 1 and Cluster 2 - create a mirror client user that will have access to the local ceph mirror daemon.

-- Cluster 1
ceph auth get-or-create client.rbd-mirror-peer mon 'profile rbd' osd 'profile rbd'
[client.rbd-mirror-peer]
	key = AQB274xd+f1tKRAAdefmahPVEh+Fkb5e8OcKDw==

-- Cluster 2
ceph auth get-or-create client.rbd-mirror-peer mon 'profile rbd' osd 'profile rbd'
[client.rbd-mirror-peer]
	key = AQBm74xdZLi9MBAA9CYTtOgF5Roz+4p5jlJ7dQ==




From cluster 1, create the remote key file based on the remote key from above (i.e. on cluster 1 use the key from cluster 2, and vice versa)

cat <<EOF > remote-key-file
AQBm74xdZLi9MBAA9CYTtOgF5Roz+4p5jlJ7dQ==
EOF


From the cluster 1 toolbox pod, run the peer add command, passing the remote mons and the remote key file as options

$ rbd --cluster ceph mirror pool peer add testpool client.rbd-mirror-peer@remote --remote-mon-host 172.21.55.79,172.21.57.75 --remote-key-file remote-key-file

$ rbd --cluster ceph mirror pool info testpool --all

Mode: pool
Peers: 
  UUID                                 NAME   CLIENT                 MON_HOST                   KEY
  9496b620-db69-4d82-8306-3065b070611b remote client.rbd-mirror-peer 172.23.44.230,172.23.50.32 AQBm74xdZLi9MBAA9CYTtOgF5Roz+4p5jlJ7dQ==

 



From cluster 2 create your mirror pool peer as well

$ rbd --cluster ceph mirror pool peer add testpool client.rbd-mirror-peer@remote --remote-mon-host 172.20.32.61,172.20.47.202 --remote-key-file remote-key-file

$ rbd --cluster ceph mirror pool info testpool --all

Mode: pool
Peers: 
  UUID                                 NAME   CLIENT                 MON_HOST                   KEY
  9496b620-db69-4d82-8306-3065b070611b remote client.rbd-mirror-peer 172.23.44.230,172.23.50.32 AQBm74xdZLi9MBAA9CYTtOgF5Roz+4p5jlJ7dQ==

Create and Enable image journaling

--- on both clusters switch to image mode
rbd --cluster ceph mirror pool enable testpool image

--- on cluster 1 create the image with the journaling and lock feature enabled
rbd create testc1 --size 10 --pool testpool --image-feature exclusive-lock,journaling

--- on cluster 1 enable the image for mirroring
rbd mirror image enable testpool/testc1

--- on cluster 2, see if our image has arrived
rbd ls testpool
testc1
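
Once the peer and the image are in place, mirroring health can be checked on either side. A rough sketch using the pool and image names from above (output will vary):

# overall pool mirroring health, plus the per-image replay state
rbd mirror pool status testpool
rbd mirror image status testpool/testc1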