kubernetes: Ceph RBD expansion fails

Is this a BUG REPORT or FEATURE REQUEST?: /kind bug

What happened: I am stuck on an issue while trying to resize Ceph RBD persistent volumes on Kubernetes 1.11, provisioned with kubeadm on CentOS 7.5 AMIs on AWS. I am using the rbd-provisioner deployment from the kubernetes-incubator project (https://github.com/kubernetes-incubator/external-storage/tree/master/ceph/rbd/deploy) and an 'rbd' storage class to dynamically provision volumes for my pods. My mysql pod attaches the volume through a PVC without problems, but when I edit the PVC to increase its capacity, the PVC stays stuck in the 'Resizing' status. When I describe the PVC, the output I get is:

Warning  VolumeResizeFailed  54m (x8 over 1h)  volume_expand  Error expanding volume "default/mysql-pv-claim" of plugin kubernetes.io/rbd : rbd info failed, error: executable file not found in $PATH

What you expected to happen: PVC expansion succeeds

How to reproduce it (as minimally and precisely as possible):

  1. Create a Ceph RBD storage class.
  2. Create a PVC using that storage class and attach it to a pod.
  3. Edit the PVC object and set the storage request to a higher value (a minimal sketch of these objects follows after this list).
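
For reference, here is a minimal sketch of the objects involved. This is an illustration only: the monitor address, pool, and secret names are placeholders, not values from this cluster. The key point is that the storage class must set allowVolumeExpansion: true, otherwise the resize is rejected before it ever reaches the volume plugin.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd
provisioner: ceph.com/rbd            # out-of-tree rbd-provisioner
allowVolumeExpansion: true           # required for PVC expansion
parameters:
  monitors: 10.0.0.1:6789            # placeholder
  pool: kube                         # placeholder
  adminId: admin
  adminSecretName: ceph-admin-secret # placeholder
  adminSecretNamespace: kube-system
  userId: kube
  userSecretName: ceph-user-secret   # placeholder
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: rbd
  resources:
    requests:
      storage: 10Gi                  # raise this value (e.g. to 20Gi) to trigger the resize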

Anything else we need to know?:

Environment: Kubernetes on a CentOS 7.5 AMI on AWS, with ceph-common installed and the rbd-provisioner pod deployed into the kube-system namespace from the example at https://github.com/kubernetes-incubator/external-storage/tree/master/ceph/rbd/deploy

  • Kubernetes version (use kubectl version): v1.11.3
  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): CentOS Linux release 7.5.1804 (Core)
  • Kernel (e.g. uname -a): Linux 4.9.112-32.el7.x86_64
  • Install tools: Kubeadm
  • Others:

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 25 (3 by maintainers)

Most upvoted comments

So, related to https://github.com/kubernetes/kubernetes/issues/38923#issuecomment-313054666: the migration of volume provisioners from in-tree to out-of-tree is not fully finished, so we still need the rbd binary in the kube-controller-manager container when resizing a volume.

Can we make it better?
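
For anyone who wants to confirm that this is what they are hitting, one rough check (the pod name follows the static-pod convention, so replace <node-name> with your control-plane node's name) is to exec into the controller-manager pod and try to run rbd:

kubectl -n kube-system exec kube-controller-manager-<node-name> -- rbd --version
# With the stock kube-controller-manager image this fails with
# "executable file not found in $PATH", which is exactly the resize error above.
# With an image that bundles the Ceph tools (e.g. hyperkube), it prints the rbd version.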

@z8772083, @prakashmishra1598, I solved this problem in v1.13.2 deployed by kubeadm, using the out-of-tree rbd-provisioner, as well as a hyperkube-based controller-manager (which contains the RBD tools in the released version). This is apparently the configuration that core Kubernetes uses for testing RBD support.

Basically, I needed to edit /etc/kubernetes/manifests/kube-controller-manager.yaml:

[...]
spec:
  containers:
  - command:
    # - kube-controller-manager
    - /controller-manager # Need this line to be the first one after command
    [...]
  # image: k8s.gcr.io/kube-controller-manager:v1.13.2
  image: gcr.io/google_containers/hyperkube:v1.13.2 # include /usr/bin/rbd
[...]

This configuration should keep working for future versions, too. Can I suggest that kubeadm use it instead of the stripped-down kube-controller-manager container?

Thanks, Michael.

The latest kubespray removed the option to deploy hyperkube as a container. With the rbd binary gone and the K8s/Ceph nodes running on CoreOS, is there any way to get rbd to work? I tried ceph/container but it did not work.

Same here, we currently use hyperkube images to make RBD image resizing work. How should this be handled without the RBD binaries in the controller manager? Shouldn't the resizing be delegated to the external out-of-tree provisioner (which has the rbd binary)?


When I use the provisioner and a custom kube-controller-manager container image with the rbd binaries, this error happens:

kubectl describe pvc pvc-expand-test

Name:          pvc-expand-test
Namespace:     default
StorageClass:  ceph-storage
Status:        Bound
Volume:        pvc-962a674c-0a73-11e9-b2d8-0050569bfc0f
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-provisioner=ceph.com/rbd
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
Conditions:
  Type       Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----       ------  -----------------                 ------------------                ------  -------
  Resizing   True    Mon, 01 Jan 0001 00:00:00 +0000   Fri, 28 Dec 2018 15:40:04 +0800           
Events:
  Type     Reason              Age               From           Message
  ----     ------              ----              ----           -------
  Warning  VolumeResizeFailed  1m (x27 over 1h)  volume_expand  Error expanding volume "default/pvc-expand-test" of plugin kubernetes.io/rbd : rbd info failed, error: can not get image size info kubernetes-dynamic-pvc-962c05a3-0a73-11e9-951a-0a580af404fa: rbd image 'kubernetes-dynamic-pvc-962c05a3-0a73-11e9-951a-0a580af404fa':
           size 1 GiB in 256 objects
           order 22 (4 MiB objects)
           id: 374526b8b4567
           block_name_prefix: rbd_data.374526b8b4567
           format: 2
           features: 
           op_features: 
           flags: 
           create_timestamp: Fri Dec 28 07:38:36 2018


Do you know why? @prakashmishra1598

@zh168654 I made a few minor changes to get it to work. I created a custom kube-controller-manager container image with the rbd binaries on it and it worked. I've pushed the image to Docker Hub (https://hub.docker.com/r/prakashmishra1598/centos-rbd-kube-controller-manager/).
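
If you'd rather build such an image yourself, something along these lines should work. This is a hedged sketch, not necessarily how the image linked above was built: the binary path inside the upstream image and the exact Ceph packages may differ between releases, so adjust for the version you run.

# Hypothetical sketch of a kube-controller-manager image that also ships the rbd CLI.
FROM centos:7
# ceph-common provides /usr/bin/rbd; depending on the rbd version you need,
# you may have to enable EPEL or the upstream Ceph repositories first.
RUN yum install -y ceph-common && yum clean all
# Copy the controller-manager binary out of the stock upstream image
# (the path is an assumption; verify it for your Kubernetes version).
COPY --from=k8s.gcr.io/kube-controller-manager:v1.13.2 \
     /usr/local/bin/kube-controller-manager /usr/local/bin/kube-controller-manager
ENTRYPOINT ["/usr/local/bin/kube-controller-manager"]

Then point the image: field in /etc/kubernetes/manifests/kube-controller-manager.yaml at the resulting image, the same way the hyperkube manifest edit earlier in this thread does.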