argo-workflows: Step or dag workflows do not seem to release semaphore locks

Summary

What happened/what you expected to happen? When running synchronization-tmpl-level.yaml, the locks are acquired but are not released once a step finishes. The workflow keeps running, and the remaining steps wait with the message "Waiting for default/configmap/workflow-synchronization/template lock. Lock status: 0/2" (the same happens with a DAG). Synchronization works at the workflow level.

Diagnostics

What version of Argo Workflows are you running? v2.10.2

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    argo: workflows
  creationTimestamp: "2020-09-16T06:59:40Z"
  generateName: synchronization-tmpl-level-
  generation: 8
  labels:
    workflows.argoproj.io/phase: Running
  managedFields:
  - apiVersion: argoproj.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName: {}
      f:spec:
        .: {}
        f:arguments: {}
        f:entrypoint: {}
        f:templates: {}
      f:status:
        .: {}
        f:finishedAt: {}
    manager: argo
    operation: Update
    time: "2020-09-16T06:59:40Z"
  - apiVersion: argoproj.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:argo: {}
        f:labels:
          .: {}
          f:workflows.argoproj.io/phase: {}
      f:spec:
        f:parallelism: {}
        f:serviceAccountName: {}
        f:ttlStrategy:
          .: {}
          f:secondsAfterCompletion: {}
          f:secondsAfterFailure: {}
          f:secondsAfterSuccess: {}
      f:status:
        f:nodes:
          .: {}
          f:synchronization-tmpl-level-xxgc2:
            .: {}
            f:children: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-327139691:
            .: {}
            f:boundaryID: {}
            f:children: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-633772542:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-1878609776:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-2314512256:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-2913002658:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:hostNodeName: {}
            f:id: {}
            f:name: {}
            f:outputs:
              .: {}
              f:artifacts: {}
              f:exitCode: {}
            f:phase: {}
            f:resourcesDuration:
              .: {}
              f:cpu: {}
              f:memory: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-3085788296:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:hostNodeName: {}
            f:id: {}
            f:name: {}
            f:outputs:
              .: {}
              f:artifacts: {}
              f:exitCode: {}
            f:phase: {}
            f:resourcesDuration:
              .: {}
              f:cpu: {}
              f:memory: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
        f:phase: {}
        f:startedAt: {}
        f:synchronization:
          .: {}
          f:semaphore:
            .: {}
            f:holding: {}
            f:waiting: {}
    manager: workflow-controller
    operation: Update
    time: "2020-09-16T07:00:01Z"
  name: synchronization-tmpl-level-xxgc2
  namespace: default
  resourceVersion: "39383194"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/default/workflows/synchronization-tmpl-level-xxgc2
  uid: 1630345b-c478-4013-b4e7-1435c5ba901c
spec:
  arguments: {}
  entrypoint: synchronization-tmpl-level-example
  parallelism: 3
  serviceAccountName: argo
  templates:
  - arguments: {}
    inputs: {}
    metadata: {}
    name: synchronization-tmpl-level-example
    outputs: {}
    steps:
    - - arguments:
          parameters:
          - name: seconds
            value: '{{item}}'
        name: synchronization-acquire-lock
        template: acquire-lock
        withParam: '["1","2","3","4","5"]'
  - arguments: {}
    container:
      args:
      - sleep 10; echo acquired lock
      command:
      - sh
      - -c
      image: alpine:latest
      name: ""
      resources: {}
    inputs: {}
    metadata: {}
    name: acquire-lock
    outputs: {}
    synchronization:
      semaphore:
        configMapKeyRef:
          key: template
          name: workflow-synchronization
  ttlStrategy:
    secondsAfterCompletion: 600
    secondsAfterFailure: 43200
    secondsAfterSuccess: 600
status:
  finishedAt: null
  nodes:
    synchronization-tmpl-level-xxgc2:
      children:
      - synchronization-tmpl-level-xxgc2-327139691
      displayName: synchronization-tmpl-level-xxgc2
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2
      name: synchronization-tmpl-level-xxgc2
      phase: Running
      startedAt: "2020-09-16T06:59:40Z"
      templateName: synchronization-tmpl-level-example
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Steps
    synchronization-tmpl-level-xxgc2-327139691:
      boundaryID: synchronization-tmpl-level-xxgc2
      children:
      - synchronization-tmpl-level-xxgc2-3085788296
      - synchronization-tmpl-level-xxgc2-2913002658
      - synchronization-tmpl-level-xxgc2-1878609776
      - synchronization-tmpl-level-xxgc2-633772542
      - synchronization-tmpl-level-xxgc2-2314512256
      displayName: '[0]'
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-327139691
      name: synchronization-tmpl-level-xxgc2[0]
      phase: Running
      startedAt: "2020-09-16T06:59:40Z"
      templateName: synchronization-tmpl-level-example
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: StepGroup
    synchronization-tmpl-level-xxgc2-633772542:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(3:4)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-633772542
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(3:4)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-1878609776:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(2:3)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-1878609776
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(2:3)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-2314512256:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(4:5)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-2314512256
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(4:5)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-2913002658:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(1:2)
      finishedAt: "2020-09-16T06:59:56Z"
      hostNodeName: eoc-gzs-pn02-vm
      id: synchronization-tmpl-level-xxgc2-2913002658
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2)
      outputs:
        artifacts:
        - archiveLogs: true
          name: main-logs
          s3:
            accessKeySecret:
              key: accesskey
              name: artifact-s3-secret
            bucket: gzs-workflow-artifacts
            endpoint: artifact-minio-service:9000
            insecure: true
            key: default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658/main.log
            secretKeySecret:
              key: secretkey
              name: artifact-s3-secret
        exitCode: "0"
      phase: Succeeded
      resourcesDuration:
        cpu: 23
        memory: 23
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-3085788296:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(0:1)
      finishedAt: "2020-09-16T06:59:59Z"
      hostNodeName: eoc-gzs-pn02-vm
      id: synchronization-tmpl-level-xxgc2-3085788296
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1)
      outputs:
        artifacts:
        - archiveLogs: true
          name: main-logs
          s3:
            accessKeySecret:
              key: accesskey
              name: artifact-s3-secret
            bucket: gzs-workflow-artifacts
            endpoint: artifact-minio-service:9000
            insecure: true
            key: default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296/main.log
            secretKeySecret:
              key: secretkey
              name: artifact-s3-secret
        exitCode: "0"
      phase: Succeeded
      resourcesDuration:
        cpu: 26
        memory: 26
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
  phase: Running
  startedAt: "2020-09-16T06:59:40Z"
  synchronization:
    semaphore:
      holding:
      - holders:
        - synchronization-tmpl-level-xxgc2-3085788296
        - synchronization-tmpl-level-xxgc2-2913002658
        semaphore: default/configmap/workflow-synchronization/template
      waiting:
      - holders:
        - default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296
        - default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658
        semaphore: default/configmap/workflow-synchronization/template
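
The status above lists two nodes under `holding` even though both are already in phase Succeeded. The following is a minimal Python sketch of how such stale holders could be detected from a workflow's status. The function name `stale_semaphore_holders`, the set of phases treated as finished, and the trimmed `status` dict are illustrative assumptions, not part of Argo.

```python
# Hypothetical helper: find semaphore holders whose node has already finished.
# The "finished" phase set is an assumption made for this sketch.
FINISHED_PHASES = {"Succeeded", "Failed", "Error"}


def stale_semaphore_holders(status):
    """Return (semaphore, node_id) pairs for holders that are no longer running."""
    nodes = status.get("nodes", {})
    holding = (
        status.get("synchronization", {})
        .get("semaphore", {})
        .get("holding", [])
    )
    stale = []
    for entry in holding:
        for holder in entry.get("holders", []):
            # Holders appear either as bare node IDs or as
            # namespace/workflow/node keys; keep only the last segment.
            node_id = holder.rsplit("/", 1)[-1]
            phase = nodes.get(node_id, {}).get("phase")
            if phase in FINISHED_PHASES:
                stale.append((entry["semaphore"], node_id))
    return stale


# Trimmed copy of the status shown above (only the fields the helper reads).
SEMAPHORE = "default/configmap/workflow-synchronization/template"
status = {
    "nodes": {
        "synchronization-tmpl-level-xxgc2-3085788296": {"phase": "Succeeded"},
        "synchronization-tmpl-level-xxgc2-2913002658": {"phase": "Succeeded"},
        "synchronization-tmpl-level-xxgc2-1878609776": {"phase": "Pending"},
        "synchronization-tmpl-level-xxgc2-633772542": {"phase": "Pending"},
        "synchronization-tmpl-level-xxgc2-2314512256": {"phase": "Pending"},
    },
    "synchronization": {
        "semaphore": {
            "holding": [
                {
                    "holders": [
                        "synchronization-tmpl-level-xxgc2-3085788296",
                        "synchronization-tmpl-level-xxgc2-2913002658",
                    ],
                    "semaphore": SEMAPHORE,
                }
            ]
        }
    },
}

for semaphore, node_id in stale_semaphore_holders(status):
    print(f"{node_id} finished but still holds {semaphore}")
```

In practice the dict could be loaded from the workflow's YAML or JSON; that step is left out here to keep the sketch dependency-free.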
Paste the logs from the workflow controller:
time="2020-09-16T06:59:40Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Updated phase  -> Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Steps node synchronization-tmpl-level-xxgc2 initialized Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="StepGroup node synchronization-tmpl-level-xxgc2-327139691 initialized Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="default/configmap/workflow-synchronization/template acquired by default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296 " semaphore=default/configmap/workflow-synchronization/template
time="2020-09-16T06:59:40Z" level=info msg="Node synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1) acquired synchronization lock" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-3085788296 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Created pod: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1) (synchronization-tmpl-level-xxgc2-3085788296)" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="default/configmap/workflow-synchronization/template acquired by default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658 " semaphore=default/configmap/workflow-synchronization/template
time="2020-09-16T06:59:40Z" level=info msg="Node synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2) acquired synchronization lock" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-2913002658 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Created pod: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2) (synchronization-tmpl-level-xxgc2-2913002658)" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-1878609776 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-633772542 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-2314512256 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Workflow step group node synchronization-tmpl-level-xxgc2-327139691 not yet completed" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383030 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 message: ContainerCreating"
time="2020-09-16T06:59:41Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 message: ContainerCreating"
time="2020-09-16T06:59:41Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383043 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-2913002658
time="2020-09-16T06:59:42Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-3085788296
time="2020-09-16T06:59:47Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 status Pending -> Running"
time="2020-09-16T06:59:47Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383086 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 status Pending -> Running"
time="2020-09-16T06:59:51Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383104 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:56Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-2913002658
time="2020-09-16T06:59:58Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="Setting node synchronization-tmpl-level-xxgc2-2913002658 outputs"
time="2020-09-16T06:59:58Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 status Running -> Succeeded"
time="2020-09-16T06:59:58Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383134 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-3085788296
time="2020-09-16T06:59:59Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="Setting node synchronization-tmpl-level-xxgc2-3085788296 outputs"
time="2020-09-16T06:59:59Z" level=info msg="Labeled pod default/synchronization-tmpl-level-xxgc2-2913002658 completed"
time="2020-09-16T06:59:59Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383142 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 status Running -> Succeeded"
time="2020-09-16T07:00:01Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383194 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=info msg="Labeled pod default/synchronization-tmpl-level-xxgc2-3085788296 completed"
time="2020-09-16T07:00:02Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
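
The log above repeatedly reports "parallelism reached N/3", and N never drops below the limit, even after both pods that held the lock have succeeded. A small Python sketch that extracts those counts from controller log text makes the pattern easier to see. The function name `parallelism_samples` and the regular expression are assumptions written for this log format only; the three sample lines are copied from the log above.

```python
import re

# Matches the timestamp and the "active/limit" pair in lines such as
#   time="..." level=info msg="workflow active pod spec parallelism reached 5/3" ...
PARALLELISM_RE = re.compile(r'time="([^"]+)".*parallelism reached (\d+)/(\d+)')


def parallelism_samples(log_text):
    """Return (timestamp, active, limit) tuples for every parallelism message."""
    samples = []
    for line in log_text.splitlines():
        match = PARALLELISM_RE.search(line)
        if match:
            samples.append(
                (match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return samples


# Three lines copied from the controller log above.
SAMPLE_LOG = '''\
time="2020-09-16T06:59:41Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
'''

for timestamp, active, limit in parallelism_samples(SAMPLE_LOG):
    marker = "at or over limit" if active >= limit else "under limit"
    print(f"{timestamp}: {active}/{limit} ({marker})")
```

The count settles at 3/3, which matches the three Pending nodes that are still waiting for the lock.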

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 7
  • Comments: 16 (11 by maintainers)

Most upvoted comments

@simster7 Is there any workaround for the lock-acquired-during-workflow issue? Is there a way to manually reset the lock?

Restarting the controller seems like the only way, unfortunately. Will try to get a fix out soon.

@simster7 @sarabala1979 this looks like an issue that makes semaphores unusable - how can we quickly get this fixed, back-ported and released?

I am currently working on a fix and refactor of the code, as I’ve found multiple issues with it.

To be clear, this does not make semaphores unusable – it only makes semaphores unusable while using parallelism at the same time.
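
One way to picture how parallelism and a semaphore could interact badly is the toy Python model below. It is not the controller's code. It assumes, based only on the "Max parallelism reached" errors in the log above, that a reconcile pass which hits the parallelism limit exits before any lock is released, and that nodes still waiting for the lock count toward that limit. All names (`ToySemaphore`, `reconcile`, `simulate`) are hypothetical.

```python
class ToySemaphore:
    """Counting semaphore that tracks holders by name."""

    def __init__(self, limit):
        self.limit = limit
        self.holders = set()

    def try_acquire(self, node):
        if node in self.holders:
            return True
        if len(self.holders) < self.limit:
            self.holders.add(node)
            return True
        return False

    def release(self, node):
        self.holders.discard(node)


def reconcile(nodes, sem, parallelism):
    """One simplified controller pass over a dict of node name -> phase."""
    if parallelism is not None:
        # Assumption: nodes waiting for the lock (Pending) count as active.
        active = sum(1 for p in nodes.values() if p in ("Pending", "Running"))
        if active >= parallelism:
            # Assumption: the pass aborts here, so no lock is released below.
            return "Max parallelism reached"
    for name, phase in nodes.items():
        if phase == "Succeeded":
            sem.release(name)
    for name, phase in nodes.items():
        if phase == "Pending" and sem.try_acquire(name):
            nodes[name] = "Running"
    return "ok"


def simulate(parallelism, passes=10):
    """Replay the scenario from the issue: 5 nodes, semaphore limit 2."""
    sem = ToySemaphore(limit=2)
    nodes = {f"node-{i}": "Pending" for i in range(5)}
    # State after the first pass in the log: two nodes hold the lock and run.
    for name in ("node-0", "node-1"):
        sem.try_acquire(name)
        nodes[name] = "Running"
    for _ in range(passes):
        # Every running pod finishes between two passes.
        for name, phase in nodes.items():
            if phase == "Running":
                nodes[name] = "Succeeded"
        reconcile(nodes, sem, parallelism)
    return nodes, sorted(sem.holders)


print(simulate(parallelism=3))     # lock holders never change
print(simulate(parallelism=None))  # all nodes eventually succeed
```

With `parallelism=3` the three waiting nodes keep the active count at the limit, so the model never reaches the release step. Without a parallelism limit the same model drains all five nodes.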

We use Argo 2.7.1 and also noticed this problem on our cluster, even though we had already removed parallelism from our workflow. Restarting the workflow-controller, as suggested above, does fix the issue for us.

We think that deleting a running workflow does not free the lock, but we are not 100% sure of that. We also use a workflowgc to delete completed workflows 5 minutes after completion, so maybe that can cause some issues?

We’ve also seen this issue when a pod has acquired the lock and is running, but the workflow gets deleted during this phase. The lock is never released.

Yup, this will be included as part of this bug fix

Found the bug, fixing