operator: name of volumeMount in vmstorage StatefulSet does not follow name of volumeClaimTemplate

For example, when I apply a VMCluster like this:

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMCluster
metadata:
  name: example-cluster
  namespace: monitoring
spec:
  vmstorage:
    storage:
      volumeClaimTemplate:
        metadata:
          name: foo-db    # THIS!!
        spec:
          storageClassName: ceph-hdd-block
          resources:
            requests:
              storage: 3Gi

  vmselect:
    replicaCount: 1

  vminsert:
    replicaCount: 1

… the resulting StatefulSet for vmstorage becomes:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vmstorage-example-cluster
  namespace: monitoring
spec:
  template:
    spec:
      containers:
        - args:
          (snip)
          volumeMounts:
            - mountPath: vmstorage-data
              name: vmstorage-db    # XXX
  volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: foo-db    # The name given in VMCluster.spec.vmstorage.storage.volumeClaimTemplate.metadata.name
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 3Gi
        storageClassName: ceph-hdd-block
        volumeMode: Filesystem
      status:
        phase: Pending

… and the StatefulSet controller cannot create Pods because of the volume name mismatch: the container mounts a volume named vmstorage-db, but the only claim template defines foo-db.
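
For the Pods to be created, the mount would have to reference the claim template's name; a minimal sketch of the expected fragment (the mountPath is kept exactly as rendered above):

          volumeMounts:
            - mountPath: vmstorage-data
              name: foo-db    # must match volumeClaimTemplates[0].metadata.name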

According to https://github.com/VictoriaMetrics/operator/blob/v0.19.1/controllers/factory/alertmanager.go#L571 , it seems that the operator allows renaming the volumeClaimTemplate. However, .spec.template.spec.containers[0].volumeMounts[0].name of the resulting StatefulSet does not follow the volumeClaimTemplate’s name. I think this is a problem.

VMSelect and VMAlertmanager, which also create StatefulSets, may have the same problem, but I haven’t checked them.

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 17 (7 by maintainers)

Most upvoted comments

It’s a bit more complicated than I thought. I need a bit more time to fix it.

The main cause: when you delete the StatefulSet for re-creation (when the storage changed, for instance), the Kubernetes controller doesn’t update its status fields. This happens when all pods are deleted and created again. If there is some issue with pod re-creation (a timeout or anything else), the pods are recreated again and it falls into an infinite loop.

I have some thoughts on how to fix it.

I tested dd71fec. (I tested only vmstorage.) The resulting StatefulSet looked fine.

However, after I modified the VMCluster to change the volumeClaimTemplate’s name and the StatefulSet was updated, Pod recreation sometimes worked and sometimes didn’t: in the latter case, the vmstorage Pods kept starting and terminating in a loop.
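
For reference, the rename was of this shape (the new name bar-db below is only illustrative, not the exact name I used):

  vmstorage:
    storage:
      volumeClaimTemplate:
        metadata:
          name: bar-db    # renamed from foo-db; the new name here is illustrative
        spec:
          storageClassName: ceph-hdd-block
          resources:
            requests:
              storage: 3Gi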

Hmm, do you have a reproducible example? For me it works, which looks strange… I’ll try to catch it.
