kubernetes: How to use RBD volumes (pod fails to start with error "rbd: failed to modprobe rbd")

Hello kubernetes,

I am trying to follow the instructions from the rbd example. After successfully booting a ceph demo cluster (sudo ceph -s on the host displays HEALTH_OK) and manually creating an RBD image named foo formatted as ext4, I cannot start any pod that uses RBD volumes.
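
For reference, the foo image was created with roughly the following commands (a sketch; the mapped device path may differ on your system, e.g. /dev/rbd0):

    # create a 1 GiB image in the default rbd pool, format it, then unmap it
    rbd create foo --size 1024
    sudo rbd map foo --pool rbd
    sudo mkfs.ext4 -m0 /dev/rbd/rbd/foo
    sudo rbd unmap /dev/rbd/rbd/foo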

The rbd2 pod never starts; it stays in the ContainerCreating state, as shown by the kubectl get pod output below:

NAME                   READY     STATUS              RESTARTS   AGE
k8s-etcd-127.0.0.1     1/1       Running             0          49m
k8s-master-127.0.0.1   4/4       Running             4          50m
k8s-proxy-127.0.0.1    1/1       Running             0          49m
rbd2                   0/1       ContainerCreating   0          9m

I am using Kubernetes 1.2.1 with Docker 1.9.1 on an Ubuntu 14.04 amd64 host, using the single-node Docker cluster.

The output of kubectl describe pods rbd2 is the following:

Name:       rbd2
Namespace:  default
Node:       127.0.0.1/127.0.0.1
Start Time: Wed, 06 Apr 2016 18:38:22 +0200
Labels:     <none>
Status:     Pending
IP:     
Controllers:    <none>
Containers:
  rbd-rw:
    Container ID:   
    Image:      nginx
    Image ID:       
    Port:       
    QoS Tier:
      cpu:      BestEffort
      memory:       BestEffort
    State:      Waiting
      Reason:       ContainerCreating
    Ready:      False
    Restart Count:  0
    Environment Variables:
Conditions:
  Type      Status
  Ready     False 
Volumes:
  rbdpd:
    Type:       RBD (a Rados Block Device mount on the host that shares a pod's lifetime)
    CephMonitors:   [172.17.42.1:6789]
    RBDImage:       foo
    FSType:     ext4
    RBDPool:        rbd
    RadosUser:      admin
    Keyring:        /etc/ceph/ceph.client.admin.keyring
    SecretRef:      &{ceph-secret}
    ReadOnly:       true
  default-token-1ze78:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-1ze78
Events:
  FirstSeen LastSeen    Count   From            SubobjectPath   Type        Reason      Message
  --------- --------    -----   ----            -------------   --------    ------      -------
  7m        7m      1   {default-scheduler }            Normal      Scheduled   Successfully assigned rbd2 to 127.0.0.1
  7m        7s      33  {kubelet 127.0.0.1}         Warning     FailedMount Unable to mount volumes for pod "rbd2_default(fa59e744-fc15-11e5-8533-28d2444cbe8c)": rbd: failed to modprobe rbd error:exit status 1
  7m        7s      33  {kubelet 127.0.0.1}         Warning     FailedSync  Error syncing pod, skipping: rbd: failed to modprobe rbd error:exit status 1

In the kubelet Docker logs, I can see the following trace, repeated multiple times:

I0406 16:44:56.885150    8236 rbd.go:89] ceph secret info: key/AQCyJQVXJV4gERAA1q7y4Wi6MiuO8UahSQoIrg==
I0406 16:44:56.887715    8236 nsenter_mount.go:179] Failed findmnt command: exit status 1
E0406 16:44:57.889282    8236 disk_manager.go:56] failed to attach disk
E0406 16:44:57.889295    8236 rbd.go:208] rbd: failed to setup
E0406 16:44:57.889334    8236 kubelet.go:1780] Unable to mount volumes for pod "rbd2_default(fa59e744-fc15-11e5-8533-28d2444cbe8c)": rbd: failed to modprobe rbd error:exit status 1; skipping pod
E0406 16:44:57.889340    8236 pod_workers.go:138] Error syncing pod fa59e744-fc15-11e5-8533-28d2444cbe8c, skipping: rbd: failed to modprobe rbd error:exit status 1
I0406 16:44:58.884709    8236 nsenter_mount.go:179] Failed findmnt command: exit status 1

As I understand the logs above, the kubelet container is trying to run something like modprobe rbd inside itself (or somewhere else?) and that fails. I noticed that there is no modprobe command inside the kubelet container (image: gcr.io/google_containers/hyperkube-amd64:v1.2.1), so I manually ran apt-get update && apt-get install kmod to make that command available inside the container, but without success.
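
One way to confirm where the failure occurs, assuming the kubelet container is named kubelet, is to run the same probe by hand inside the container and on the host:

    # inside the kubelet container: roughly what the kubelet attempts
    docker exec kubelet modprobe rbd

    # on the host: check whether the module is available and loaded
    sudo modprobe rbd
    lsmod | grep rbd

Note that modprobe needs both the binary and the host's /lib/modules tree for the running kernel, so installing kmod inside the container alone is not enough.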

My files look like this:

# secret/ceph-secret.yaml 
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
data:
  key: QVFDeUpRVlhKVjRnRVJBQTFxN3k0V2k2TWl1TzhVYWhTUW9Jcmc9PQo=
# rbd-pod.yaml 
apiVersion: "v1"
kind: "Pod"
metadata: 
  name: "rbd2"
spec: 
  containers: 
    - name: "rbd-rw"
      image: "nginx"
      volumeMounts: 
        - mountPath: "/var/www/html"
          name: "rbdpd"
  volumes: 
    - name: "rbdpd"
      rbd: 
        monitors: 
          - "172.17.42.1:6789"
        pool: "rbd"
        image: "foo"
        user: "admin"
        secretRef: 
          name: "ceph-secret"
        fsType: "ext4"
        keyring: "/etc/ceph/ceph.client.admin.keyring"
        readOnly: true
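
The key in secret/ceph-secret.yaml is just the base64-encoded ceph client key; assuming a standard admin keyring, it can be produced like this:

    # base64-encode the client.admin key for use in the Secret
    ceph auth get-key client.admin | base64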

I have checked that 172.17.42.1:6789 is reachable from the Kubernetes cluster (the kubelet container is started with --net=host, so it shares the host network).
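
For example, one way to check the monitor port without installing extra tools, assuming bash is present in the image:

    # open a TCP connection to the ceph monitor from inside the kubelet container
    docker exec kubelet bash -c 'timeout 3 bash -c "</dev/tcp/172.17.42.1/6789" && echo reachable'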

How can I mount RBD volumes inside containers as of Kubernetes 1.2.1?

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 27 (14 by maintainers)

Most upvoted comments

Also dealing with “Could not map image: Timeout after 10s”. Is there a solution?

If I bind-mount /dev on the host to /dev in the kubelet container, the kubelet container messes with the ptys on my host, resulting in being unable to start new terminals (gnome-terminal fails with a “getpt failed” message) and making it impossible to properly shut down my workstation.

EDIT: after checking the issue tracker, the breakage resulting from bind-mounting /dev from host into the kubelet container is documented in #18230.

What I finally did was to wrap the hyperkube image like this:

# apply hacks from https://github.com/kubernetes/kubernetes/issues/23924#issuecomment-206803980
# so that pods that use rbd persistent resources work in the single-node docker setup.
# Build with the following command: `docker build -t custom/hyperkube-amd64:v1.2.1 .`

FROM gcr.io/google_containers/hyperkube-amd64:v1.2.1

RUN curl https://raw.githubusercontent.com/ceph/ceph/master/keys/release.asc | apt-key add - && \
    echo deb http://download.ceph.com/debian-hammer/ jessie main | tee /etc/apt/sources.list.d/ceph.list && \
    apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -q -y ceph-common && \
    apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

And then run the kubelet container like this:

    # Notes on the extra mounts and image (inline comments after the line
    # continuations would break the command, so they are collected here):
    #   /sys (rw)                       necessary to do mounts from the container
    #   /sbin/modprobe (ro)             skip having to install modprobe in the container
    #   /lib/modules (ro)               make `modprobe rbd` work
    #   /etc/ceph (ro)                  expose the ceph config from the host
    #   /dev/rbd0 -> /rootfs/dev/rbd0   workaround for point 3 above
    #   custom/hyperkube-amd64 image    has ceph-common vendored in (see Dockerfile above)
    docker run \
      --volume=/:/rootfs:ro \
      --volume=/sys:/sys:rw \
      --volume=/var/lib/docker/:/var/lib/docker:rw \
      --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
      --volume=/var/run:/var/run:rw \
      --volume=/sbin/modprobe:/sbin/modprobe:ro \
      --volume=/lib/modules:/lib/modules:ro \
      --volume=/etc/ceph:/etc/ceph:ro \
      --volume=/dev/rbd0:/rootfs/dev/rbd0:ro \
      --net=host \
      --pid=host \
      --privileged=true \
      --name=kubelet \
      -d \
      custom/hyperkube-amd64:v${K8S_VERSION} \
      /hyperkube kubelet \
      --containerized \
      --hostname-override="127.0.0.1" \
      --address="0.0.0.0" \
      --api-servers=http://localhost:8080 \
      --config=/etc/kubernetes/manifests \
      --cluster-dns=10.0.0.10 \
      --cluster-domain=cluster.local \
      --allow-privileged=true --v=2

Then I can use rbd persistent volumes from my dockerized kubernetes setup.
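
After recreating the kubelet container this way, a quick sanity check (run on the host) is:

    docker exec kubelet modprobe rbd   # should now succeed
    kubectl get pod rbd2               # should eventually reach Running
    rbd showmapped                     # should list the foo image once mounted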

The modprobe rbd failure is the problem. Which kubelet container image are you using? Can you install modprobe in your image?

@jperville have you tried installing the ceph-common package with apt-get?