rook: CrashLoopBackOff error in OSD pods

I am trying to run rook-ceph in my AKS cluster, but my OSD pods are stuck in CrashLoopBackOff. I have cloned the repo from https://github.com/rook/rook.git; common.yaml is unchanged. Here is my operator.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph
  labels:
    operator: rook
    storage-backend: ceph
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  replicas: 1
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      serviceAccountName: rook-ceph-system
      containers:
      - name: rook-ceph-operator
        image: rook/ceph:master
        args: ["ceph", "operator"]
        volumeMounts:
        - mountPath: /var/lib/rook
          name: rook-config
        - mountPath: /etc/ceph
          name: default-config-dir
        env:
        - name: ROOK_CURRENT_NAMESPACE_ONLY
          value: "false"
        - name: FLEXVOLUME_DIR_PATH
          value: "/etc/kubernetes/volumeplugins"
        - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
          value: "false"
        - name: ROOK_LOG_LEVEL
          value: "INFO"
        - name: ROOK_CEPH_STATUS_CHECK_INTERVAL
          value: "60s"
        - name: ROOK_MON_HEALTHCHECK_INTERVAL
          value: "45s"
        - name: ROOK_MON_OUT_TIMEOUT
          value: "600s"
        - name: ROOK_DISCOVER_DEVICES_INTERVAL
          value: "60m"
        - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
          value: "false"
        - name: ROOK_ENABLE_SELINUX_RELABELING
          value: "true"
        - name: ROOK_ENABLE_FSGROUP
          value: "true"
        - name: ROOK_DISABLE_DEVICE_HOTPLUG
          value: "false"
        - name: ROOK_ENABLE_FLEX_DRIVER
          value: "false"
        # Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster.
        # This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs. --> CHANGED to false
        - name: ROOK_ENABLE_DISCOVERY_DAEMON
          value: "false"
        - name: ROOK_CSI_ENABLE_CEPHFS
          value: "true"
        - name: ROOK_CSI_ENABLE_RBD
          value: "true"
        - name: ROOK_CSI_ENABLE_GRPC_METRICS
          value: "true"
        - name: CSI_ENABLE_SNAPSHOTTER
          value: "true"
        - name: CSI_PROVISIONER_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        - name: CSI_PLUGIN_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: rook-config
        emptyDir: {}
      - name: default-config-dir
        emptyDir: {}

And here is my cluster.yaml file:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
    allowMultiplePerNode: false
    volumeClaimTemplate:
      spec:
        storageClassName: managed-premium
        resources:
          requests:
            storage: 10Gi
  cephVersion:
    image: ceph/ceph:v15.2.4
    allowUnsupported: false
  dashboard:
    enabled: true
    ssl: true
  network:
    hostNetwork: false
  placement:
    mon:
      tolerations:
      - key: storage-node
        operator: Exists
  storage:
    storageClassDeviceSets:
    - name: set1
      # The number of OSDs to create from this device set
      count: 4
      # IMPORTANT: If volumes specified by the storageClassName are not portable across nodes
      # this needs to be set to false. For example, if using the local storage provisioner
      # this should be false.
      portable: true
      # Since the OSDs could end up on any node, an effort needs to be made to spread the OSDs
      # across nodes as much as possible. Unfortunately the pod anti-affinity breaks down
      # as soon as you have more than one OSD per node. If you have more OSDs than nodes, K8s may
      # choose to schedule many of them on the same node. What we need is the Pod Topology
      # Spread Constraints, which is alpha in K8s 1.16. This means that a feature gate must be
      # enabled for this feature, and Rook also still needs to add support for this feature.
      # Another approach for a small number of OSDs is to create a separate device set for each
      # zone (or other set of nodes with a common label) so that the OSDs will end up on different
      # nodes. This would require adding nodeAffinity to the placement here.
      placement:
        tolerations:
        - key: storage-node
          operator: Exists
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: agentpool
                operator: In
                values:
                - npstorage
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd
                - key: app
                  operator: In
                  values:
                  - rook-ceph-osd-prepare
              topologyKey: kubernetes.io/hostname
      resources:
        limits:
          cpu: "500m"
          memory: "4Gi"
        requests:
          cpu: "500m"
          memory: "2Gi"
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          resources:
            requests:
              storage: 100Gi
          storageClassName: managed-premium
          volumeMode: Block
          accessModes:
            - ReadWriteOnce
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
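
For reference, the usual order for applying these manifests from the examples directory of the cloned repo (cluster/examples/kubernetes/ceph in Rook releases of that era) is roughly:

kubectl apply -f common.yaml
kubectl apply -f operator.yaml
# wait for the operator pod to be Running before creating the cluster
kubectl -n rook-ceph get pod -l app=rook-ceph-operator
kubectl apply -f cluster.yaml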

This is the resulting pod status:

NAME                                                              READY   STATUS             RESTARTS   AGE
csi-cephfsplugin-6xzw6                                            3/3     Running            0          34m
csi-cephfsplugin-dtncx                                            3/3     Running            0          34m
csi-cephfsplugin-provisioner-67f9c99b5f-hvgtb                     6/6     Running            0          34m
csi-cephfsplugin-provisioner-67f9c99b5f-xf2wx                     6/6     Running            0          34m
csi-cephfsplugin-t7q9g                                            3/3     Running            0          34m
csi-cephfsplugin-tccnb                                            3/3     Running            0          34m
csi-cephfsplugin-tjxs7                                            3/3     Running            0          34m
csi-cephfsplugin-wxtsr                                            3/3     Running            0          34m
csi-rbdplugin-65z9v                                               3/3     Running            0          34m
csi-rbdplugin-6kdj4                                               3/3     Running            0          34m
csi-rbdplugin-9vlwn                                               3/3     Running            0          34m
csi-rbdplugin-dvsrq                                               3/3     Running            0          34m
csi-rbdplugin-phxjr                                               3/3     Running            0          34m
csi-rbdplugin-provisioner-5d5cfb887b-4f9vh                        6/6     Running            0          34m
csi-rbdplugin-provisioner-5d5cfb887b-ww87t                        6/6     Running            0          34m
csi-rbdplugin-qr9j2                                               3/3     Running            0          34m
rook-ceph-crashcollector-aks-agentpool-25228689-vmss0000007m2mh   1/1     Running            0          32m
rook-ceph-crashcollector-aks-npstorage-25228689-vmss000000j6fvg   1/1     Running            0          30m
rook-ceph-crashcollector-aks-npstorage-25228689-vmss0000016h7bl   1/1     Running            0          32m
rook-ceph-crashcollector-aks-rstudiomed-25228689-vmss000002pddf   1/1     Running            0          31m
rook-ceph-mgr-a-7575fdb658-7n4gn                                  1/1     Running            0          31m
rook-ceph-mon-a-6d44495c59-4rqh9                                  1/1     Running            0          33m
rook-ceph-mon-b-5d9cc8bc8d-47jdw                                  1/1     Running            0          32m
rook-ceph-mon-c-d4f6bcb45-s2dfp                                   1/1     Running            0          32m
rook-ceph-operator-78f46865d8-hgnbz                               1/1     Running            0          36m
rook-ceph-osd-0-7989bc8b9-ndgzl                                   0/1     CrashLoopBackOff   10         30m
rook-ceph-osd-1-5f749bcd97-4cczm                                  0/1     CrashLoopBackOff   10         30m
rook-ceph-osd-2-58668bbb4b-68cxm                                  0/1     CrashLoopBackOff   10         30m
rook-ceph-osd-3-66844fbfb6-knvrn                                  0/1     CrashLoopBackOff   10         30m
rook-ceph-osd-prepare-set1-data-0-w6gf9-9phtm                     0/1     Completed          0          31m
rook-ceph-osd-prepare-set1-data-1-2r2wj-kpqjm                     0/1     Completed          0          31m
rook-ceph-osd-prepare-set1-data-2-mrdsz-2l84b                     0/1     Completed          0          31m
rook-ceph-osd-prepare-set1-data-3-d2mr9-xbhjx                     0/1     Completed          0          31m

And the output of kubectl describe for each of the CrashLoopBackOff pods is as follows:

kubectl describe pod -n rook-ceph rook-ceph-osd-0-7989bc8b9-ndgzl

Events:
  Type     Reason                  Age                  From                     Message
  ----     ------                  ----                 ----                     -------
  Normal   Scheduled               31m                  default-scheduler        Successfully assigned rook-ceph/rook-ceph-osd-0-7989bc8b9-ndgzl to aks-npstorage-25228689-vmss000000
  Warning  FailedAttachVolume      31m                  attachdetach-controller  Multi-Attach error for volume "pvc-5b6acdb1-c524-4b55-ae60-549471369853" Volume is already used by pod(s) rook-ceph-osd-prepare-set1-data-0-w6gf9-9phtm
  Normal   SuccessfulAttachVolume  31m                  attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-5b6acdb1-c524-4b55-ae60-549471369853"
  Normal   SuccessfulMountVolume   30m                  kubelet                  MapVolume.MapPodDevice succeeded for volume "pvc-5b6acdb1-c524-4b55-ae60-549471369853" globalMapPath "/var/lib/kubelet/plugins/kubernetes.io/azure-disk/volumeDevices/kubernetes-dynamic-pvc-5b6acdb1-c524-4b55-ae60-549471369853"
  Normal   SuccessfulMountVolume   30m                  kubelet                  MapVolume.MapPodDevice succeeded for volume "pvc-5b6acdb1-c524-4b55-ae60-549471369853" volumeMapPath "/var/lib/kubelet/pods/7b0f2572-132a-4b9b-912f-16ffe72238d9/volumeDevices/kubernetes.io~azure-disk"
  Normal   Pulled                  30m                  kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 30m                  kubelet                  Created container blkdevmapper
  Normal   Started                 30m                  kubelet                  Started container blkdevmapper
  Normal   Pulled                  30m                  kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 30m                  kubelet                  Created container activate
  Normal   Started                 30m                  kubelet                  Started container activate
  Normal   Pulled                  30m                  kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 30m                  kubelet                  Created container expand-bluefs
  Normal   Started                 30m                  kubelet                  Started container expand-bluefs
  Normal   Pulled                  30m                  kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 30m                  kubelet                  Created container chown-container-data-dir
  Normal   Started                 30m                  kubelet                  Started container chown-container-data-dir
  Normal   Started                 30m (x2 over 30m)    kubelet                  Started container osd
  Normal   Pulled                  30m (x3 over 30m)    kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 30m (x3 over 30m)    kubelet                  Created container osd
  Warning  BackOff                 40s (x148 over 30m)  kubelet                  Back-off restarting failed container

kubectl describe pod -n rook-ceph rook-ceph-osd-1-5f749bcd97-4cczm

Events:
  Type     Reason                 Age                    From               Message
  ----     ------                 ----                   ----               -------
  Normal   Scheduled              33m                    default-scheduler  Successfully assigned rook-ceph/rook-ceph-osd-1-5f749bcd97-4cczm to aks-npstorage-25228689-vmss000001
  Normal   SuccessfulMountVolume  33m                    kubelet            MapVolume.MapPodDevice succeeded for volume "pvc-f601df09-53e1-49d2-98ef-be7253d9153e" globalMapPath "/var/lib/kubelet/plugins/kubernetes.io/azure-disk/volumeDevices/kubernetes-dynamic-pvc-f601df09-53e1-49d2-98ef-be7253d9153e"
  Normal   SuccessfulMountVolume  33m                    kubelet            MapVolume.MapPodDevice succeeded for volume "pvc-f601df09-53e1-49d2-98ef-be7253d9153e" volumeMapPath "/var/lib/kubelet/pods/d65bef28-17d4-47db-9058-6d772432ff64/volumeDevices/kubernetes.io~azure-disk"
  Normal   Created                33m                    kubelet            Created container blkdevmapper
  Normal   Started                33m                    kubelet            Started container blkdevmapper
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m                    kubelet            Created container activate
  Normal   Started                33m                    kubelet            Started container activate
  Normal   Started                33m                    kubelet            Started container expand-bluefs
  Normal   Created                33m                    kubelet            Created container expand-bluefs
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m                    kubelet            Created container chown-container-data-dir
  Normal   Started                33m                    kubelet            Started container chown-container-data-dir
  Normal   Started                33m (x2 over 33m)      kubelet            Started container osd
  Normal   Pulled                 32m (x3 over 33m)      kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                32m (x3 over 33m)      kubelet            Created container osd
  Warning  BackOff                3m13s (x147 over 33m)  kubelet            Back-off restarting failed container

kubectl describe pod -n rook-ceph rook-ceph-osd-2-58668bbb4b-68cxm

Events:
  Type     Reason                 Age                    From               Message
  ----     ------                 ----                   ----               -------
  Normal   Scheduled              33m                    default-scheduler  Successfully assigned rook-ceph/rook-ceph-osd-2-58668bbb4b-68cxm to aks-npstorage-25228689-vmss000001
  Normal   SuccessfulMountVolume  33m                    kubelet            MapVolume.MapPodDevice succeeded for volume "pvc-b99de71f-411b-4922-a4f2-2d3b95a782be" globalMapPath "/var/lib/kubelet/plugins/kubernetes.io/azure-disk/volumeDevices/kubernetes-dynamic-pvc-b99de71f-411b-4922-a4f2-2d3b95a782be"
  Normal   SuccessfulMountVolume  33m                    kubelet            MapVolume.MapPodDevice succeeded for volume "pvc-b99de71f-411b-4922-a4f2-2d3b95a782be" volumeMapPath "/var/lib/kubelet/pods/76b65d95-d78f-427a-913d-0bba65ea370e/volumeDevices/kubernetes.io~azure-disk"
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m                    kubelet            Created container blkdevmapper
  Normal   Started                33m                    kubelet            Started container blkdevmapper
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m                    kubelet            Created container activate
  Normal   Started                33m                    kubelet            Started container activate
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m                    kubelet            Created container expand-bluefs
  Normal   Started                33m                    kubelet            Started container expand-bluefs
  Normal   Pulled                 33m                    kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m                    kubelet            Created container chown-container-data-dir
  Normal   Started                33m                    kubelet            Started container chown-container-data-dir
  Normal   Started                33m (x2 over 33m)      kubelet            Started container osd
  Normal   Pulled                 33m (x3 over 33m)      kubelet            Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                33m (x3 over 33m)      kubelet            Created container osd
  Warning  BackOff                3m43s (x148 over 33m)  kubelet            Back-off restarting failed container

kubectl describe pod -n rook-ceph rook-ceph-osd-3-66844fbfb6-knvrn

Events:
  Type     Reason                  Age                    From                     Message
  ----     ------                  ----                   ----                     -------
  Normal   Scheduled               34m                    default-scheduler        Successfully assigned rook-ceph/rook-ceph-osd-3-66844fbfb6-knvrn to aks-npstorage-25228689-vmss000000
  Warning  FailedAttachVolume      34m                    attachdetach-controller  Multi-Attach error for volume "pvc-a4a1291f-743f-46fe-8234-5e67b8d053ee" Volume is already used by pod(s) rook-ceph-osd-prepare-set1-data-2-mrdsz-2l84b
  Normal   SuccessfulAttachVolume  33m                    attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-a4a1291f-743f-46fe-8234-5e67b8d053ee"
  Normal   SuccessfulMountVolume   33m                    kubelet                  MapVolume.MapPodDevice succeeded for volume "pvc-a4a1291f-743f-46fe-8234-5e67b8d053ee" globalMapPath "/var/lib/kubelet/plugins/kubernetes.io/azure-disk/volumeDevices/kubernetes-dynamic-pvc-a4a1291f-743f-46fe-8234-5e67b8d053ee"
  Normal   SuccessfulMountVolume   33m                    kubelet                  MapVolume.MapPodDevice succeeded for volume "pvc-a4a1291f-743f-46fe-8234-5e67b8d053ee" volumeMapPath "/var/lib/kubelet/pods/5adfbb90-da86-41de-aa0e-50fd3f7104f1/volumeDevices/kubernetes.io~azure-disk"
  Normal   Pulled                  33m                    kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 33m                    kubelet                  Created container blkdevmapper
  Normal   Started                 33m                    kubelet                  Started container blkdevmapper
  Normal   Pulled                  33m                    kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 33m                    kubelet                  Created container activate
  Normal   Started                 33m                    kubelet                  Started container activate
  Normal   Pulled                  33m                    kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 33m                    kubelet                  Created container expand-bluefs
  Normal   Started                 33m                    kubelet                  Started container expand-bluefs
  Normal   Pulled                  33m                    kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 33m                    kubelet                  Created container chown-container-data-dir
  Normal   Started                 33m                    kubelet                  Started container chown-container-data-dir
  Normal   Started                 33m (x2 over 33m)      kubelet                  Started container osd
  Normal   Pulled                  33m (x3 over 33m)      kubelet                  Container image "ceph/ceph:v15.2.4" already present on machine
  Normal   Created                 33m (x3 over 33m)      kubelet                  Created container osd
  Warning  BackOff                 3m23s (x145 over 33m)  kubelet                  Back-off restarting failed container
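
The events above only show that the osd container keeps restarting; the actual crash reason would be in the container logs. Something like the following (pod names as above; --previous shows the last terminated attempt, and -c may be needed where kubectl reports multiple containers) would surface it:

kubectl -n rook-ceph logs rook-ceph-osd-0-7989bc8b9-ndgzl -c osd --previous
kubectl -n rook-ceph logs rook-ceph-osd-prepare-set1-data-0-w6gf9-9phtm
kubectl -n rook-ceph logs deploy/rook-ceph-operator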

Any help would be appreciated!

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 17 (6 by maintainers)

Most upvoted comments

Thanks @travisn for resolving the issue!

There is a chance we may want to expose the NFS home directories to other VMs outside the AKS cluster but in the same network. Would that be possible/recommended with CephNFS?

@vergilcw If in the same network, I would expect it to work, but haven’t tried it.

@travisn thanks for helping my colleague @Siddhu1096 solve this problem!

To answer your question about NFS requirements: we need high-performance NFS in the cluster for shared user home directories. The application (RStudio Server) does a lot of I/O on small files in users’ home directories, and we thought CephNFS looked promising for performance reasons.

There is a chance we may want to expose the NFS home directories to other VMs outside the AKS cluster but in the same network. Would that be possible/recommended with CephNFS?

We are planning to add NFS to the Ceph cluster. When we checked the Rook docs, we found the Ceph NFS CRD and the NFS Server CRD. What is the difference between the two, and which approach would be best for an AKS cluster?

I’d recommend the CephNFS CRD since it’s more directly integrated with Ceph. The other one is a more general NFS solution that can be backed by any storage.
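
A minimal CephNFS manifest, roughly as shown in the Rook docs of that era, looks like the sketch below. The names my-nfs, myfs-data0 and nfs-ns are placeholders, and the RADOS pool must already exist (for example as a data pool of a CephFilesystem).

apiVersion: ceph.rook.io/v1
kind: CephNFS
metadata:
  name: my-nfs
  namespace: rook-ceph
spec:
  rados:
    # existing pool that stores the NFS client recovery data (placeholder name)
    pool: myfs-data0
    # RADOS namespace inside that pool (placeholder)
    namespace: nfs-ns
  server:
    # number of active NFS-Ganesha server instances
    active: 1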

What is the need for NFS? Are you exposing the storage outside the AKS cluster?

The zone comes from the topology labels on the nodes. They are not set by Rook; Rook just consumes them.
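
The labels in question are the standard Kubernetes topology labels, which the cloud provider typically sets on zoned node pools. They can be inspected with, for example:

kubectl get nodes -L topology.kubernetes.io/zone -L failure-domain.beta.kubernetes.io/zone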

Are you also using the latest release instead of master like my earlier suggestion?
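
For context, the operator.yaml above uses image: rook/ceph:master. Pinning to a tagged release instead would look roughly like the snippet below; v1.4.1 is only an example tag, and the current tag should be taken from the Rook releases page.

      containers:
      - name: rook-ceph-operator
        # pin to a tagged release rather than the moving master tag
        image: rook/ceph:v1.4.1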