operator: name of volumeMount in vmstorage StatefulSet does not follow name of volumeClaimTemplate
For example, when I create a VMCluster like this:
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: example-cluster
  namespace: monitoring
spec:
  vmstorage:
    storage:
      volumeClaimTemplate:
        metadata:
          name: foo-db # THIS!!
        spec:
          storageClassName: ceph-hdd-block
          resources:
            requests:
              storage: 3Gi
  vmselect:
    replicaCount: 1
  vminsert:
    replicaCount: 1
… the resulting StatefulSet for vmstorage becomes:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vmstorage-example-cluster
  namespace: monitoring
spec:
  template:
    spec:
      containers:
        - args:
            (snip)
          volumeMounts:
            - mountPath: /vmstorage-data
              name: vmstorage-db # XXX
  volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: foo-db # The name given in VMCluster.spec.vmstorage.storage.volumeClaimTemplate.metadata.name
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 3Gi
        storageClassName: ceph-hdd-block
        volumeMode: Filesystem
      status:
        phase: Pending
… and the StatefulSet controller cannot create Pods, because the volumeMount name (vmstorage-db) does not match the volumeClaimTemplate name (foo-db).
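To see why this breaks, note that the StatefulSet controller injects one volume per volumeClaimTemplate into each Pod, named after the claim template, so the mount name no longer resolves. A sketch of the generated Pod spec (the container name and the validation message are assumptions for illustration, not copied from a cluster):

spec:
  containers:
    - name: vmstorage            # container name assumed for illustration
      volumeMounts:
        - mountPath: /vmstorage-data
          name: vmstorage-db     # references no volume in this Pod
  volumes:
    - name: foo-db               # injected from the volumeClaimTemplates entry
      persistentVolumeClaim:
        claimName: foo-db-vmstorage-example-cluster-0 # <claim name>-<pod name>

# Pod validation then rejects the dangling mount with something like:
#   spec.containers[0].volumeMounts[0].name: Not found: "vmstorage-db"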
According to https://github.com/VictoriaMetrics/operator/blob/v0.19.1/controllers/factory/alertmanager.go#L571, the operator seems to allow renaming the volumeClaimTemplate. However, .spec.template.spec.containers[0].volumeMounts[0].name of the resulting StatefulSet does not follow the volumeClaimTemplate's name, which I think is a bug.
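For comparison, a consistent StatefulSet would keep the two names in sync, roughly like this (a sketch of the output I would expect, not what the operator currently renders):

volumeMounts:
  - mountPath: /vmstorage-data
    name: foo-db # follows volumeClaimTemplate.metadata.name
volumeClaimTemplates:
  - metadata:
      name: foo-db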
VMSelect and VMAlertmanager, which also create StatefulSets, may have the same problem, but I have not checked them.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 17 (7 by maintainers)
Commits related to this issue
- fixes incorrect handling for storageSpec.metadata.name https://github.com/VictoriaMetrics/operator/issues/344 now it properly handled - in case of change, statefulset recreates with new claim. refact... — committed to VictoriaMetrics/operator by f41gh7 3 years ago
- fixes statefulset spec.name and vmagent secret finalizer (#347) * fixes deletion for vmagent config secret finalizer was not properly removed https://github.com/VictoriaMetrics/operator/issues/343 ... — committed to VictoriaMetrics/operator by f41gh7 3 years ago
- adds new logic for statefulset pod re-creation now it must properly handle cases for pods rolling update interuption when statefulset was re-created In this case, currentRevision will be written to t... — committed to VictoriaMetrics/operator by f41gh7 3 years ago
- controller/factory/k8stools: optimize `HandleSTSUpdate` (#801) 1. even if statefulset needs recreate because VolumeClaimTemplates changed, sometimes pod doesn't need to be recreated. Since pod's volu... — committed to VictoriaMetrics/operator by Haleygo 8 months ago
It's a bit more complicated than I thought. I need a bit more time to fix it.
The main cause: when the StatefulSet is deleted for re-creation (for instance, when storage changed), the Kubernetes controller doesn't update its status fields until all pods have been deleted and created again. If anything goes wrong during pod re-creation (a timeout or any other issue), the pods are re-created again and it falls into an infinite loop.
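Concretely, the status fields in question look like this (the revision hashes below are hypothetical): until every pod runs the new revision, currentRevision lags behind updateRevision, so the rollout never looks finished:

status:
  currentRevision: vmstorage-example-cluster-6d9b5f8d7c # hypothetical hash
  updateRevision: vmstorage-example-cluster-7c8d9e0f1a  # hypothetical hash
  replicas: 1
  readyReplicas: 0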
I have some thoughts on how to fix it.
Hmm, do you have a reproducible example? It works for me, which looks strange… I'll try to catch it.
I tested dd71fec8e18c52a4a5d3fe43472dbd8a3825aa28 (vmstorage only). The resulting StatefulSet looked fine.
However, after I modified the VMCluster to change the volumeClaimTemplate's name and the StatefulSet was updated, Pod re-creation sometimes worked fine and sometimes didn't: in the latter case, the vmstorage Pods repeatedly started and terminated.
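For reference, the rename was a change of this shape (the new name here is hypothetical):

vmstorage:
  storage:
    volumeClaimTemplate:
      metadata:
        name: bar-db # renamed from foo-db; triggers StatefulSet re-creation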