kubernetes: resize pv failed

What happened: I created a PVC with kubectl create -f ceph-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-expand-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-storage

The result looks fine:

pvc-expand-test   Bound   pvc-962a674c-0a73-11e9-b2d8-0050569bfc0f   1Gi   RWO   ceph-storage   1h

Now I want to resize the PV from 1Gi to 2Gi, so I edit the PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-expand-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: ceph-storage
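
The same change can also be applied without opening an editor, e.g.:

kubectl patch pvc pvc-expand-test -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'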

The resize then fails; kubectl describe pvc pvc-expand-test shows:

Name:          pvc-expand-test
Namespace:     default
StorageClass:  ceph-storage
Status:        Bound
Volume:        pvc-962a674c-0a73-11e9-b2d8-0050569bfc0f
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed=yes
               pv.kubernetes.io/bound-by-controller=yes
               volume.beta.kubernetes.io/storage-provisioner=ceph.com/rbd
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      1Gi
Access Modes:  RWO
Conditions:
  Type       Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----       ------  -----------------                 ------------------                ------  -------
  Resizing   True    Mon, 01 Jan 0001 00:00:00 +0000   Fri, 28 Dec 2018 15:40:04 +0800           
Events:
  Type     Reason              Age                From           Message
  ----     ------              ----               ----           -------
  Warning  VolumeResizeFailed  24s (x21 over 1h)  volume_expand  Error expanding volume "default/pvc-expand-test" of plugin kubernetes.io/rbd : rbd info failed, error: can not get image size info kubernetes-dynamic-pvc-962c05a3-0a73-11e9-951a-0a580af404fa: rbd image 'kubernetes-dynamic-pvc-962c05a3-0a73-11e9-951a-0a580af404fa':
           size 1 GiB in 256 objects
           order 22 (4 MiB objects)
           id: 374526b8b4567
           block_name_prefix: rbd_data.374526b8b4567
           format: 2
           features: 
           op_features: 
           flags: 
           create_timestamp: Fri Dec 28 07:38:36 2018
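
For reference, the same image info can be queried by hand against the pool from the StorageClass below (the image name is taken from the error message above); --format json is the form whose parsing is discussed in the comments further down:

rbd info k8s-rbd/kubernetes-dynamic-pvc-962c05a3-0a73-11e9-951a-0a580af404fa --format json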

What you expected to happen: the PVC (and the underlying RBD image) is resized to 2Gi.

How to reproduce it (as minimally and precisely as possible): create a PVC from the StorageClass below, wait for it to bind, then increase spec.resources.requests.storage and watch the PVC events.

Anything else we need to know?: The Ceph StorageClass YAML:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-storage
provisioner: ceph.com/rbd
parameters:
  monitors: 192.168.20.195:6789,192.168.20.196:6789,192.168.20.197:6789
  adminId: admin
  adminSecretNamespace: default
  adminSecretName: ceph-secret
  pool: k8s-rbd
  userId: admin
  userSecretName: ceph-secret
  fsType: ext4
  imageFormat: "2"
allowVolumeExpansion: true

Ceph secret:


apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret
type: "kubernetes.io/rbd"
data:
  key: QVFCaVhBWmNJcdfa1QTJmZXJRRmNRLzBtSnlYZ1BEdmlMakE9PQ==
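
For reference, a key of this form is typically generated from the Ceph admin keyring, e.g.:

ceph auth get-key client.admin | base64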

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.5", GitCommit:"753b2dbc622f5cc417845f0ff8a77f539a4213ea", GitTreeState:"clean", BuildDate:"2018-11-26T14:41:50Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.5", GitCommit:"753b2dbc622f5cc417845f0ff8a77f539a4213ea", GitTreeState:"clean", BuildDate:"2018-11-26T14:31:35Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration:

  • OS (e.g. from /etc/os-release): CentOS Linux release 7.4.1708 (Core)

  • Kernel (e.g. uname -a): Linux m01 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools: kubeadm

  • Others: ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable)

/kind bug

Most upvoted comments

It seems that the rbd info output format differs between Ceph versions. I will try to fix it.

@mlmhl It seems this change is not working in some cases:

  Type     Reason              Age                    From           Message
  ----     ------              ----                   ----           -------
  Warning  VolumeResizeFailed  100s (x53 over 3h29m)  volume_expand  (combined from similar events): Error expanding volume "namespace/prometheus1-server" of plugin kubernetes.io/rbd : rbd info failed, error: parse rbd info output failed: 2019-01-28 16:47:24.878565 7f7eef611100 -1 did not load config file, using default settings.
{"name":"kubernetes-dynamic-pvc-155ee065-4ede-11e8-b665-02420a141559","size":536870912000,"objects":128000,"order":22,"object_size":4194304,"block_name_prefix":"rbd_data.259f7374b0dc51","format":2,"features":[],"flags":[]}, invalid character '-' after top-level value

I think this is due to two concurrent conditions:

1. The Ceph client still complains on stderr about not being able to load a config file, even though none is needed when a key is supplied (hence the stderr output 2019-01-28 16:47:24.878565 7f7eef611100 -1 did not load config file, using default settings.). This happens not only in hyperkube.
2. We capture both stdout and stderr from the execution of the rbd info command (note the combined output in the error from exec.Run), and unmarshalling the JSON then fails. To be clear: the useless warning goes to stderr, while the JSON output goes to stdout.

I was able to work around this by creating empty files for /etc/ceph/ceph.conf and /etc/ceph/ceph.keyring so that the Ceph client no longer prints the warning, but I feel it would be much better to parse only stdout; a sketch of that idea is shown below.
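
For illustration, a minimal sketch of parsing only stdout (this is not the actual kubernetes.io/rbd plugin code; the pool and image names are placeholders):

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// rbdImageInfo mirrors a subset of the JSON fields seen in the event above.
type rbdImageInfo struct {
	Name string `json:"name"`
	Size int64  `json:"size"`
}

// rbdInfo runs `rbd info --format json` with stdout and stderr captured
// separately, so warnings like "did not load config file" on stderr cannot
// corrupt the JSON that gets unmarshalled.
func rbdInfo(pool, image string) (*rbdImageInfo, error) {
	cmd := exec.Command("rbd", "info", pool+"/"+image, "--format", "json")

	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout // JSON lands here
	cmd.Stderr = &stderr // config-file warnings land here

	if err := cmd.Run(); err != nil {
		return nil, fmt.Errorf("rbd info failed: %v, stderr: %s", err, stderr.String())
	}

	info := &rbdImageInfo{}
	// Parse stdout only; stderr noise is ignored unless the command fails.
	if err := json.Unmarshal(stdout.Bytes(), info); err != nil {
		return nil, fmt.Errorf("parse rbd info output failed: %v", err)
	}
	return info, nil
}

func main() {
	info, err := rbdInfo("k8s-rbd", "kubernetes-dynamic-pvc-example")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("image %s is %d bytes\n", info.Name, info.Size)
}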

A reference to the relevant source code in #72431: https://github.com/kubernetes/kubernetes/pull/72431/files#diff-36f18e327f36d95eb1333c4a18781184R704

Thanks.

Are there plans to resolve this issue?

@hasonhai Inside the kube-controller-manager pod, since the rbd resize is executed by that component.
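
For example, the empty-file workaround mentioned above could be applied there along these lines (the pod name is a placeholder that varies per node, and the image must ship a touch binary):

kubectl -n kube-system exec kube-controller-manager-<node> -- touch /etc/ceph/ceph.conf /etc/ceph/ceph.keyring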