argo-workflows: Cron workflow cannot apply volumeClaimTemplates config from the Argo controller ConfigMap (workflowDefaults)

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I’d like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Expected: the cron workflow starts correctly.

What's happening:

map: map[metadata:map[creationTimestamp:<nil> name:pyspark-dir] spec:map[accessModes:[ReadWriteOnce] resources:map[requests:map[storage:350Gi]] storageClassName:xry-storage] status:map[]] does not contain declared merge key: name

It works OK with non-cron workflows.
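For context on the error text: it comes from Kubernetes' strategic-merge-patch machinery, which matches list elements by a declared merge key when merging two specs. Below is a minimal standalone sketch in Go (not Argo's actual code; the specWithPVCs struct is a hypothetical stand-in) that reproduces the same failure, assuming the PVC list is tagged with merge key "name" while a PersistentVolumeClaim keeps its name under metadata.name:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

// Hypothetical stand-in for the PVC list in a workflow spec. The struct
// tag declares "name" as the merge key, but a PersistentVolumeClaim keeps
// its name under metadata.name, so the top-level "name" key the patcher
// looks for is never present in the list elements.
type specWithPVCs struct {
	VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
}

func main() {
	// Both sides of the merge carry the PVC list, as presumably happens
	// when the controller layers workflowDefaults onto a spec that already
	// has volumeClaimTemplates.
	pvcList := []byte(`{"volumeClaimTemplates":[{"metadata":{"name":"pyspark-dir"}}]}`)

	_, err := strategicpatch.StrategicMergePatch(pvcList, pvcList, specWithPVCs{})
	fmt.Println(err)
	// map: map[metadata:map[name:pyspark-dir]] does not contain declared merge key: name
}

The merge key has to appear at the top level of each list element, which holds for things like container or volume lists but not for PersistentVolumeClaim objects; that mismatch appears to be what the controller trips over here.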

Version

v3.4.8

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

Argo controller ConfigMap (volumeClaimTemplates under workflowDefaults):

apiVersion: v1
data:
  config: |
    instanceID: xry
    workflowDefaults:
      spec:
        podGC:
          strategy: OnWorkflowCompletion
        ttlStrategy:
          secondsAfterCompletion: 8640000
        volumeClaimGC:
          strategy: OnWorkflowCompletion
        volumeClaimTemplates:
          - metadata:
              name: pyspark-dir
              creationTimestamp: null
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 350Gi
              storageClassName: xry-storage
    nodeEvents:
      enabled: true
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: argo-workflows
    meta.helm.sh/release-namespace: app
  creationTimestamp: "2023-07-12T06:57:16Z"
  labels:
    app.kubernetes.io/component: workflow-controller
    app.kubernetes.io/instance: argo-workflows
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: argo-workflows-cm
    app.kubernetes.io/part-of: argo-workflows
    helm.sh/chart: argo-workflows-0.30.0
  name: argo-workflows-workflow-controller-configmap
  namespace: app
  resourceVersion: "46107870"
  uid: 5d28f4b4-037c-4d96-8c69-d2fcf93e3b68

And just a default cron workflow with volumeMounts:

metadata:
  name: argo-apply-pvc-from-workflow-defaults
  namespace: app
  uid: 73c5a41c-716c-4179-ab7a-ea97c2b9fab1
  resourceVersion: '46116295'
  generation: 11
  creationTimestamp: '2023-07-12T14:18:23Z'
  labels:
    example: 'true'
    workflows.argoproj.io/controller-instanceid: xry
    workflows.argoproj.io/creator: system-serviceaccount-app-argo-workflows-server
  annotations:
    cronworkflows.argoproj.io/last-used-schedule: '* * * * *'
  managedFields:
    - manager: argo
      operation: Update
      apiVersion: argoproj.io/v1alpha1
      time: '2023-07-12T14:20:15Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:example: {}
            f:workflows.argoproj.io/controller-instanceid: {}
            f:workflows.argoproj.io/creator: {}
        f:spec: {}
    - manager: workflow-controller
      operation: Update
      apiVersion: argoproj.io/v1alpha1
      time: '2023-07-12T14:23:15Z'
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:annotations:
            .: {}
            f:cronworkflows.argoproj.io/last-used-schedule: {}
        f:status: {}
spec:
  workflowSpec:
    templates:
      - name: argosay
        inputs:
          parameters:
            - name: message
              value: '{{workflow.parameters.message}}'
        outputs: {}
        metadata: {}
        container:
          name: main
          image: argoproj/argosay:v2
          command:
            - /argosay
          args:
            - echo
            - '{{inputs.parameters.message}}'
          resources: {}
          volumeMounts:
            - name: pyspark-dir
              mountPath: /tmp/pyspark
    entrypoint: argosay
    arguments:
      parameters:
        - name: message
          value: hello argo
    ttlStrategy:
      secondsAfterCompletion: 300
    podGC:
      strategy: OnPodCompletion
  schedule: 0 11 * * *
  workflowMetadata:
    creationTimestamp: null
    labels:
      example: 'true'
status:
  active: null
  lastScheduledTime: '2023-07-12T14:20:00Z'
  conditions: null

Logs from the workflow controller

│ time="2023-07-12T13:52:12.976Z" level=info msg="Get leases 200"                                                                                                                                                                                                                                                                                                  
│ time="2023-07-12T13:52:12.983Z" level=info msg="Update leases 200"                                                                                                                                                                                                                                                                                               
│ time="2023-07-12T13:52:14.727Z" level=info msg="Processing workflow" namespace=app workflow=process-frc-v2-cron-n95vq                                                                                                                                                                                                                                            
│ time="2023-07-12T13:52:14.728Z" level=info msg="Updated phase  -> Error" namespace=app workflow=process-frc-v2-cron-n95vq                                                                                                                                                                                                                                       
│ time="2023-07-12T13:52:14.728Z" level=info msg="Updated message  -> map: map[metadata:map[creationTimestamp:<nil> name:pyspark-dir] spec:map[accessModes:[ReadWriteOnce] resources:map[requests:map[storage:350Gi]] storageClassName:xry-storage] status:map[]] does not contain declared merge key: name" namespace=app workflow=process-frc-v2-cron-n95vq      │
│ time="2023-07-12T13:52:14.728Z" level=info msg="Marking workflow completed" namespace=app workflow=process-frc-v2-cron-n95vq                                                                                                                                                                                                                                     
│ time="2023-07-12T13:52:14.728Z" level=error msg="Unable to set ExecWorkflow" error="map: map[metadata:map[creationTimestamp:<nil> name:pyspark-dir] spec:map[accessModes:[ReadWriteOnce] resources:map[requests:map[storage:350Gi]] storageClassName:xry-storage] status:map[]] does not contain declared merge key: name" namespace=app workflow=process-frc-v2 │
│ time="2023-07-12T13:52:14.728Z" level=info msg="Checking daemoned children of " namespace=app workflow=process-frc-v2-cron-n95vq                                                                                                                                                                                                                                 
│ time="2023-07-12T13:52:14.728Z" level=info msg="Workflow to be dehydrated" Workflow Size=6699                                                                                                                                                                                                                                                                    
│ time="2023-07-12T13:52:14.734Z" level=info msg="cleaning up pod" action=deletePod key=app/process-frc-v2-cron-n95vq-1340600742-agent/deletePod

Logs from the workflow's wait container

Containers are not started; the error comes from the controller before any containers are started.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 7
  • Comments: 20 (10 by maintainers)

Most upvoted comments

It will be in the upcoming patch release, 3.5.1.

Thanks @sunyeongchoi, I hadn't seen your PR! Waiting for it to be merged 😃 Would this solve all related issues, or just the CronWorkflow one?

This PR will solve all related issues, for both Workflows and CronWorkflows 😃

I have a question about the TestAddingWorkflowDefaultVolumeClaimTemplate test. I tried to revert the commit and reproduce the error caused by patchMergeKey. Since workflow-controller-configmap.yaml sets volumeClaimTemplates but workflow.yaml does not, I expected the err := controller.setWorkflowDefaults(workflow) step in TestAddingWorkflowDefaultVolumeClaimTemplate to throw a patchMergeKey error. But it turns out that this is not the case. Did I misunderstand something?
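One detail that may bear on the test question: the merge-key lookup in a strategic merge only runs when the two lists actually have to be merged element by element. If the patch side omits volumeClaimTemplates entirely, the original list is carried over untouched and no error surfaces, which would match the reverted test passing when workflow.yaml sets no volumeClaimTemplates. A standalone probe (again using a hypothetical specWithPVCs stand-in rather than Argo's types) showing both cases:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

// Hypothetical stand-in, as in the earlier sketch.
type specWithPVCs struct {
	VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
}

func main() {
	empty := []byte(`{}`)
	pvcs := []byte(`{"volumeClaimTemplates":[{"metadata":{"name":"workdir"}}]}`)

	// Patch omits the list: the original's list is kept untouched, so the
	// merge key is never consulted and no error is returned.
	_, err := strategicpatch.StrategicMergePatch(pvcs, empty, specWithPVCs{})
	fmt.Println(err) // <nil>

	// Both sides carry the list: elements must be matched by the declared
	// merge key "name", which PVC items lack at the top level.
	_, err = strategicpatch.StrategicMergePatch(pvcs, pvcs, specWithPVCs{})
	fmt.Println(err) // ... does not contain declared merge key: name
}

One possible, unverified explanation for the CronWorkflow failure would then be that its spec reaches the defaulting merge with the list already present on both sides.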

@sunyeongchoi you are right; what was actually working was submitting a WorkflowTemplate. We hadn't tested directly with Workflows.

Hello, @wobrycki

I have tested the issue in order to find a solution. However, I encountered the same error not only with CronWorkflows but also with Workflows. Below is the YAML I tested.

# workflow-controller-configmap.yaml
workflowDefaults: |
    spec:
      volumeClaimTemplates:                 
      - metadata:
          name: workdir                     
        spec:
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 1Mi                  
          storageClassName: local-path
# workflow yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volumes-pvc-
spec:
  entrypoint: volumes-pvc-example
  templates:
  - name: volumes-pvc-example
    steps:
    - - name: generate
        template: whalesay
    - - name: print
        template: print-message

  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["echo generating message in volume; cowsay hello world | tee /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol

  - name: print-message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["echo getting message from volume; find /mnt/vol; cat /mnt/vol/hello_world.txt"]
      # Mount workdir volume at /mnt/vol before invoking docker/whalesay
      volumeMounts:                     # same syntax as k8s Pod spec
      - name: workdir
        mountPath: /mnt/vol
❯ argo list
NAME                STATUS   AGE   DURATION   PRIORITY   MESSAGE
volumes-pvc-mbzgh   Error    1m    1s         0          map: map[metadata:map[creationTimestamp:<nil> name:workdir] spec:map[accessModes:[ReadWriteOnce] resources:map[requests:map[storage:1Mi]] storageClassName:local-path] status:map[]] does not contain declared merge key: name

When you mentioned that it works fine with Workflows in your previous comment, are you certain about that?

@sunyeongchoi absolutely amazing!!

@sunyeongchoi any news?

I have the same issue with regular Workflows, and also with (Cluster)WorkflowTemplates. It doesn't work as described in the doc page here.

I configured workflowDefaults in the ConfigMap as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argo-workflows
data:
  # [...]
  workflowDefaults: |
      # [...]
      volumeClaimTemplates:
        - metadata:
            name: workdir
          spec:
            storageClassName: standard
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 1Gi

When I try to override the volumeClaimTemplate in a ClusterWorkflowTemplate or in a plain Workflow, I get the following error (in the UI):

Error: map: map[metadata:map[creationTimestamp:<nil> name:workdir] spec:map[accessModes:[ReadWriteOnce] resources:map[requests:map[storage:3Gi]] storageClassName:standard] status:map[]] does not contain declared merge key: name

Here are the ClusterWorkflowTemplates:

apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: test
spec:
  volumeClaimTemplates:   # Attempt to override the default
    - metadata:
        name: workdir
      spec:
        storageClassName: standard
        accessModes: [ ReadWriteOnce ]
        resources:
          requests:
            storage: 3Gi   # default set to 1Gi
  entrypoint: test
  arguments:
    parameters:
      - name: workdir_path
        value: /dummy
      - name: message
        value: hello dude!
  templates:
    - name: test
      inputs:
        parameters:
          - name: workdir_path
          - name: message
      dag:
        tasks:
          - name: argosay
            templateRef:
              name: test-argosay
              template: test-argosay
              clusterScope: true
            arguments:
              parameters:
                - name: workdir_path
                  value: "{{inputs.parameters.workdir_path}}"
                - name: message
                  value: "{{inputs.parameters.message}}"
---
apiVersion: argoproj.io/v1alpha1
kind: ClusterWorkflowTemplate
metadata:
  name: test-argosay
spec:
  templates:
    - name: test-argosay
      inputs:
        parameters:
          - name: workdir_path
          - name: message
      container:
        name: argosay
        image: argoproj/argosay:v2
        command:
          - "/argosay"
        args:
          - "echo"
          - "{{inputs.parameters.message}}"
        volumeMounts:
          - name: workdir
            mountPath: "{{inputs.parameters.workdir_path}}"

Here is the Workflow:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-
  namespace: default
  labels:
    scope: test
  annotations:
    argo: workflows
spec:
  volumeClaimTemplates:   # Attempt to override the default
    - metadata:
        name: workdir
      spec:
        storageClassName: standard
        accessModes: [ ReadWriteOnce ]
        resources:
          requests:
            storage: 3Gi   # default set to 1Gi
  arguments:
    parameters:
      - name: workdir_path
        value: /test
      - name: message
        value: hello matteo!
  entrypoint: test
  workflowTemplateRef:
    name: test
    clusterScope: true