prometheus-operator: Updating a persistent volume size (GKE 1.11) does nothing

What did you do?

I had a volume claim set up in my Prometheus resource:

    storage:
      volumeClaimTemplate:
        metadata:
          labels:
            prometheus: k8s
          name: prometheus-storage
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 150Gi
          storageClassName: prometheus-ssd

I tried updating the storage.volumeClaimTemplate.spec.resources.requests.storage from 150Gi to 250Gi. The operator restarted my Prometheus statefulset pods, but the associated persistent volume claim stayed at 150Gi.

What did you expect to see?

I have defined my storage class with allowVolumeExpansion: true, which is a Kubernetes 1.11 feature. I expected that the PVC managed by prometheus-operator would be updated with the new value of 250Gi, and that the pods would then restart and re-mount the expanded volumes.

Environment

Prometheus Operator version: 0.26.0
Kubernetes version information: v1.11.6-gke.3
Kubernetes cluster kind: GKE

Manifests

Storage Class:

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-ssd
parameters:
  type: pd-ssd
provisioner: kubernetes.io/gce-pd
reclaimPolicy: Delete
volumeBindingMode: Immediate
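
Before attempting a resize, it can help to confirm that the storage class in the cluster really has expansion enabled. The following is a minimal sketch, not part of the original report; the function name is hypothetical, and it assumes kubectl is configured for the cluster in question.

```shell
# Print the allowVolumeExpansion field of a StorageClass
# ("true" when enabled, empty when the field is unset).
# The function name is illustrative, not an existing tool.
storage_class_allows_expansion() {
  kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}'
}

# Example, using the storage class name from the manifest above:
# storage_class_allows_expansion prometheus-ssd
```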

About this issue

State: open
Created 5 years ago
Reactions: 32
Comments: 43 (19 by maintainers)

Commits related to this issue

Documentation: add more content to online docs This change adds the following content to prometheus-operator.dev website: * New "User Guides" section with the "Getting Started" and "Alerting" guides.... — committed to simonpasquier/prometheus-operator by simonpasquier 2 years ago
Documentation: add more content to online docs (#5060) * Documentation: add more content to online docs This change adds the following content to prometheus-operator.dev website: * New "User Guid... — committed to prometheus-operator/prometheus-operator by simonpasquier 2 years ago

Most upvoted comments

Hello! I just want to bump this issue, because it is really annoying to edit all PVCs manually and restart all pods manually when we have the operator.

m-messiah on Aug 18, 2021

Hello, I just ran into this exact issue. I had to manually edit each Prometheus PVC to adjust its size. Resizing was fully transparent. I then updated the prometheus-operator Helm release. The operator killed all Prometheus pods at once, instead of performing the usual rolling update.

I also caught this message in the operator logs:

level=info ts=2020-03-25T08:39:25.321349082Z caller=operator.go:1180 component=prometheusoperator msg="resolving illegal update of Prometheus StatefulSet" details="&StatusDetails{Name:prometheus-prometheus-operator-prometheus,Group:apps,Kind:StatefulSet,Causes:[]StatusCause{StatusCause{Type:FieldValueForbidden,Message:Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', and 'updateStrategy' are forbidden,Field:spec,},},RetryAfterSeconds:0,UID:,}"

jbfavre on Mar 25, 2020

I think this is more of a missing Kubernetes feature than a prometheus-operator related one. StatefulSets don’t yet fully support expanding volumes. Related enhancement proposal: https://github.com/kubernetes/enhancements/pull/660

tkornai on Sep 19, 2019

@parkjeongryul let’s continue the discussion in https://github.com/prometheus-operator/prometheus-operator/issues/5289

simonpasquier on Jan 20, 2023

FYI how to manually resize volumes is now documented at https://prometheus-operator.dev/docs/operator/storage/#resizing-volumes.

simonpasquier on Oct 7, 2022

We had the same problem and followed these steps:

1. Update the Prometheus field spec.storage.volumeClaimTemplate.spec.resources.requests.storage to NEW-SIZE.
2. Patch every PVC with the following command:

    kubectl patch pvc/prometheus-pvc-X --patch '{"spec": { "resources": { "requests": { "storage": "NEW-SIZE" } } } }'

3. Delete the StatefulSet to update its definition; the Prometheus operator recreates it right away:

    kubectl delete sts/prometheus-kube-prometheus-stack-prometheus --cascade=orphan
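
The steps above can be sketched as a small shell helper. This is only an illustration under some assumptions: the function names are hypothetical, the label selector and StatefulSet name must be adapted to your installation, and the Prometheus resource is assumed to have been updated to the new size already. It operates on the namespace of the current kubectl context.

```shell
#!/usr/bin/env bash
# Sketch of the manual workaround: patch each PVC, then delete the
# StatefulSet while leaving its pods in place.
# All names passed in are placeholders chosen by the caller.

# Build the JSON patch body for a new storage request.
build_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$1"
}

# Patch every PVC matching a label selector, then orphan-delete the
# StatefulSet so the operator recreates it with the new template size.
resize_prometheus_pvcs() {
  local selector="$1" sts_name="$2" new_size="$3"
  local pvc
  # "-o name" prints entries such as persistentvolumeclaim/<name>.
  for pvc in $(kubectl get pvc -l "$selector" -o name); do
    kubectl patch "$pvc" --patch "$(build_patch "$new_size")"
  done
  kubectl delete sts "$sts_name" --cascade=orphan
}

# Example with hypothetical values:
# resize_prometheus_pvcs "prometheus=k8s" "prometheus-k8s" "250Gi"
```

Using --cascade=orphan keeps the running pods while the StatefulSet object is removed, which matches the command quoted in the steps.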

A fix directly in the prometheus operator would be a great addition 👍

davinkevin on Aug 11, 2022

We ended up deleting the PVC and therefore losing the historic data…

Nexus2k on Jul 14, 2022

I am actually encountering the same issue now with GKE 1.17, quay.io/prometheus-operator/prometheus-operator:v0.43.2 and quay.io/prometheus/prometheus:v2.22.1.

If I update the capacity in the Prometheus custom resource, the StatefulSet and pods are restarted, but the PVC size does not change at all.

Only if I run kubectl edit pvc PVC1 and extend the capacity does the PVC size change, in less than a minute, and the filesystem in the pod is expanded automatically (with no pod restart).
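
One way to confirm that a manual PVC edit has taken effect is to compare the requested size with the capacity the PVC reports. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the plain string comparison only works when both values use the same unit notation (for example 250Gi in both places).

```shell
# Print the requested size and the reported capacity of a PVC,
# one per line. Function names are illustrative only.
pvc_sizes() {
  kubectl get pvc "$1" \
    -o jsonpath='{.spec.resources.requests.storage}{"\n"}{.status.capacity.storage}{"\n"}'
}

# Succeed once the reported capacity equals the requested size.
pvc_resized() {
  local sizes requested actual
  sizes="$(pvc_sizes "$1")"
  requested="$(printf '%s\n' "$sizes" | sed -n 1p)"
  actual="$(printf '%s\n' "$sizes" | sed -n 2p)"
  [ -n "$requested" ] && [ "$requested" = "$actual" ]
}

# Example: pvc_resized PVC1 && echo "resize complete"
```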

So is there a way to do this from the Prometheus custom resource? And what is the best practice after changing the PVC size: should I also update the capacity in the custom resource (and incur a pod restart)?

@jalev @aiman-alsari

shay-berman on Jun 15, 2021

@jalev your last request seems to be related to https://github.com/prometheus-operator/prometheus-operator/issues/2753, which is also something that would need to be addressed upstream.

simonpasquier on Dec 4, 2020

I needed to expand the PVC from 500Gi to 1000Gi and came across a similar issue on AWS, using storageClass: wait-consumer-gp2 with allowVolumeExpansion: true. There are two issues:

1. The statefulSet resource definition was successfully updated from 500Gi to 1000Gi but the PVC remained at 500Gi. I had to edit the PVC resource manually to kickstart the expansion process.
2. Lack of proper rollout when modifying the statefulSet. Modifying the statefulSet causes downtime, as all prometheus instances are rotated at the same time.

Regarding (1), should I open a new AWS-specific feature request, or is anyone already working on this? I am not sure if (2) is part of the operator’s implementation or a Kubernetes issue.

atmosx on Aug 14, 2020

As a complementary comment: in order to expand the volume, I needed to do it manually by modifying the PVCs directly. The resize happens and Prometheus starts up healthy. I modified the volumeClaimTemplate definition in the Prometheus spec prior to modifying the PVCs. I don’t know if this will cause a problem, but I doubt it, as it is exactly the same referenced PVC.

richerve on May 14, 2019