argo-workflows: Step or dag workflows do not seem to release semaphore locks
Summary
What happened/what you expected to happen?
When running synchronization-tmpl-level.yaml, the locks are acquired but are not released once a step finishes. The workflow keeps running,
and the remaining steps stay pending with the message "Waiting for default/configmap/workflow-synchronization/template lock. Lock status: 0/2"
(same behavior with DAG). Synchronization works at the workflow level.
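For context, the semaphore limit is read from a ConfigMap key that the template references. The ConfigMap itself is not included in this report, so the following is only a sketch of what it presumably looks like: the name, namespace, and key come from the lock name in the message above, and the value "2" is an assumption inferred from "Lock status: 0/2".

```yaml
# Sketch only: reconstructed from the report, not copied from the reporter's cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-synchronization   # name referenced by configMapKeyRef in the workflow
  namespace: default               # namespace taken from the lock name in the message
data:
  template: "2"                    # assumed limit, inferred from "Lock status: 0/2"
```

With a limit of 2, two of the five expanded steps should hold the lock at a time, and each waiting step should start as soon as a holder finishes.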
Diagnostics
What version of Argo Workflows are you running? v2.10.2
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    argo: workflows
  creationTimestamp: "2020-09-16T06:59:40Z"
  generateName: synchronization-tmpl-level-
  generation: 8
  labels:
    workflows.argoproj.io/phase: Running
  managedFields:
  - apiVersion: argoproj.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName: {}
      f:spec:
        .: {}
        f:arguments: {}
        f:entrypoint: {}
        f:templates: {}
      f:status:
        .: {}
        f:finishedAt: {}
    manager: argo
    operation: Update
    time: "2020-09-16T06:59:40Z"
  - apiVersion: argoproj.io/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:argo: {}
        f:labels:
          .: {}
          f:workflows.argoproj.io/phase: {}
      f:spec:
        f:parallelism: {}
        f:serviceAccountName: {}
        f:ttlStrategy:
          .: {}
          f:secondsAfterCompletion: {}
          f:secondsAfterFailure: {}
          f:secondsAfterSuccess: {}
      f:status:
        f:nodes:
          .: {}
          f:synchronization-tmpl-level-xxgc2:
            .: {}
            f:children: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-327139691:
            .: {}
            f:boundaryID: {}
            f:children: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-633772542:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-1878609776:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-2314512256:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:id: {}
            f:message: {}
            f:name: {}
            f:phase: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-2913002658:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:hostNodeName: {}
            f:id: {}
            f:name: {}
            f:outputs:
              .: {}
              f:artifacts: {}
              f:exitCode: {}
            f:phase: {}
            f:resourcesDuration:
              .: {}
              f:cpu: {}
              f:memory: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
          f:synchronization-tmpl-level-xxgc2-3085788296:
            .: {}
            f:boundaryID: {}
            f:displayName: {}
            f:finishedAt: {}
            f:hostNodeName: {}
            f:id: {}
            f:name: {}
            f:outputs:
              .: {}
              f:artifacts: {}
              f:exitCode: {}
            f:phase: {}
            f:resourcesDuration:
              .: {}
              f:cpu: {}
              f:memory: {}
            f:startedAt: {}
            f:templateName: {}
            f:templateScope: {}
            f:type: {}
        f:phase: {}
        f:startedAt: {}
        f:synchronization:
          .: {}
          f:semaphore:
            .: {}
            f:holding: {}
            f:waiting: {}
    manager: workflow-controller
    operation: Update
    time: "2020-09-16T07:00:01Z"
  name: synchronization-tmpl-level-xxgc2
  namespace: default
  resourceVersion: "39383194"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/default/workflows/synchronization-tmpl-level-xxgc2
  uid: 1630345b-c478-4013-b4e7-1435c5ba901c
spec:
  arguments: {}
  entrypoint: synchronization-tmpl-level-example
  parallelism: 3
  serviceAccountName: argo
  templates:
  - arguments: {}
    inputs: {}
    metadata: {}
    name: synchronization-tmpl-level-example
    outputs: {}
    steps:
    - - arguments:
          parameters:
          - name: seconds
            value: '{{item}}'
        name: synchronization-acquire-lock
        template: acquire-lock
        withParam: '["1","2","3","4","5"]'
  - arguments: {}
    container:
      args:
      - sleep 10; echo acquired lock
      command:
      - sh
      - -c
      image: alpine:latest
      name: ""
      resources: {}
    inputs: {}
    metadata: {}
    name: acquire-lock
    outputs: {}
    synchronization:
      semaphore:
        configMapKeyRef:
          key: template
          name: workflow-synchronization
  ttlStrategy:
    secondsAfterCompletion: 600
    secondsAfterFailure: 43200
    secondsAfterSuccess: 600
status:
  finishedAt: null
  nodes:
    synchronization-tmpl-level-xxgc2:
      children:
      - synchronization-tmpl-level-xxgc2-327139691
      displayName: synchronization-tmpl-level-xxgc2
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2
      name: synchronization-tmpl-level-xxgc2
      phase: Running
      startedAt: "2020-09-16T06:59:40Z"
      templateName: synchronization-tmpl-level-example
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Steps
    synchronization-tmpl-level-xxgc2-327139691:
      boundaryID: synchronization-tmpl-level-xxgc2
      children:
      - synchronization-tmpl-level-xxgc2-3085788296
      - synchronization-tmpl-level-xxgc2-2913002658
      - synchronization-tmpl-level-xxgc2-1878609776
      - synchronization-tmpl-level-xxgc2-633772542
      - synchronization-tmpl-level-xxgc2-2314512256
      displayName: '[0]'
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-327139691
      name: synchronization-tmpl-level-xxgc2[0]
      phase: Running
      startedAt: "2020-09-16T06:59:40Z"
      templateName: synchronization-tmpl-level-example
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: StepGroup
    synchronization-tmpl-level-xxgc2-633772542:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(3:4)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-633772542
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(3:4)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-1878609776:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(2:3)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-1878609776
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(2:3)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-2314512256:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(4:5)
      finishedAt: null
      id: synchronization-tmpl-level-xxgc2-2314512256
      message: 'Waiting for default/configmap/workflow-synchronization/template
        lock. Lock status: 0/2 '
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(4:5)
      phase: Pending
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-2913002658:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(1:2)
      finishedAt: "2020-09-16T06:59:56Z"
      hostNodeName: eoc-gzs-pn02-vm
      id: synchronization-tmpl-level-xxgc2-2913002658
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2)
      outputs:
        artifacts:
        - archiveLogs: true
          name: main-logs
          s3:
            accessKeySecret:
              key: accesskey
              name: artifact-s3-secret
            bucket: gzs-workflow-artifacts
            endpoint: artifact-minio-service:9000
            insecure: true
            key: default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658/main.log
            secretKeySecret:
              key: secretkey
              name: artifact-s3-secret
        exitCode: "0"
      phase: Succeeded
      resourcesDuration:
        cpu: 23
        memory: 23
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
    synchronization-tmpl-level-xxgc2-3085788296:
      boundaryID: synchronization-tmpl-level-xxgc2
      displayName: synchronization-acquire-lock(0:1)
      finishedAt: "2020-09-16T06:59:59Z"
      hostNodeName: eoc-gzs-pn02-vm
      id: synchronization-tmpl-level-xxgc2-3085788296
      name: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1)
      outputs:
        artifacts:
        - archiveLogs: true
          name: main-logs
          s3:
            accessKeySecret:
              key: accesskey
              name: artifact-s3-secret
            bucket: gzs-workflow-artifacts
            endpoint: artifact-minio-service:9000
            insecure: true
            key: default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296/main.log
            secretKeySecret:
              key: secretkey
              name: artifact-s3-secret
        exitCode: "0"
      phase: Succeeded
      resourcesDuration:
        cpu: 26
        memory: 26
      startedAt: "2020-09-16T06:59:40Z"
      templateName: acquire-lock
      templateScope: local/synchronization-tmpl-level-xxgc2
      type: Pod
  phase: Running
  startedAt: "2020-09-16T06:59:40Z"
  synchronization:
    semaphore:
      holding:
      - holders:
        - synchronization-tmpl-level-xxgc2-3085788296
        - synchronization-tmpl-level-xxgc2-2913002658
        semaphore: default/configmap/workflow-synchronization/template
      waiting:
      - holders:
        - default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296
        - default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658
        semaphore: default/configmap/workflow-synchronization/template
Paste the logs from the workflow controller:
time="2020-09-16T06:59:40Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Updated phase -> Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Steps node synchronization-tmpl-level-xxgc2 initialized Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="StepGroup node synchronization-tmpl-level-xxgc2-327139691 initialized Running" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="default/configmap/workflow-synchronization/template acquired by default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-3085788296 " semaphore=default/configmap/workflow-synchronization/template
time="2020-09-16T06:59:40Z" level=info msg="Node synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1) acquired synchronization lock" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-3085788296 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Created pod: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(0:1) (synchronization-tmpl-level-xxgc2-3085788296)" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="default/configmap/workflow-synchronization/template acquired by default/synchronization-tmpl-level-xxgc2/synchronization-tmpl-level-xxgc2-2913002658 " semaphore=default/configmap/workflow-synchronization/template
time="2020-09-16T06:59:40Z" level=info msg="Node synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2) acquired synchronization lock" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-2913002658 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Created pod: synchronization-tmpl-level-xxgc2[0].synchronization-acquire-lock(1:2) (synchronization-tmpl-level-xxgc2-2913002658)" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-1878609776 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-633772542 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Pod node synchronization-tmpl-level-xxgc2-2314512256 initialized Pending" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Workflow step group node synchronization-tmpl-level-xxgc2-327139691 not yet completed" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:40Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383030 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 message: ContainerCreating"
time="2020-09-16T06:59:41Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 message: ContainerCreating"
time="2020-09-16T06:59:41Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:41Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383043 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:42Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-2913002658
time="2020-09-16T06:59:42Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-3085788296
time="2020-09-16T06:59:47Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 status Pending -> Running"
time="2020-09-16T06:59:47Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:47Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383086 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:48Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 status Pending -> Running"
time="2020-09-16T06:59:51Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:51Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383104 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=info msg="workflow active pod spec parallelism reached 5/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:52Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:56Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-2913002658
time="2020-09-16T06:59:58Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="Setting node synchronization-tmpl-level-xxgc2-2913002658 outputs"
time="2020-09-16T06:59:58Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-2913002658 status Running -> Succeeded"
time="2020-09-16T06:59:58Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:58Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383134 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="insignificant pod change" key=default/synchronization-tmpl-level-xxgc2-3085788296
time="2020-09-16T06:59:59Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="Setting node synchronization-tmpl-level-xxgc2-3085788296 outputs"
time="2020-09-16T06:59:59Z" level=info msg="Labeled pod default/synchronization-tmpl-level-xxgc2-2913002658 completed"
time="2020-09-16T06:59:59Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T06:59:59Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383142 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=info msg="workflow active pod spec parallelism reached 4/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:00Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Updating node synchronization-tmpl-level-xxgc2-3085788296 status Running -> Succeeded"
time="2020-09-16T07:00:01Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:01Z" level=info msg="Workflow update successful" namespace=default phase=Running resourceVersion=39383194 workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=info msg="Labeled pod default/synchronization-tmpl-level-xxgc2-3085788296 completed"
time="2020-09-16T07:00:02Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:02Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="Processing workflow" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=info msg="workflow active pod spec parallelism reached 3/3" namespace=default workflow=synchronization-tmpl-level-xxgc2
time="2020-09-16T07:00:22Z" level=error msg="error in entry template execution" error="Max parallelism reached" namespace=default workflow=synchronization-tmpl-level-xxgc2
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 7
- Comments: 16 (11 by maintainers)
Commits related to this issue
- fix(controller): Synchronization lock didn't release on DAG call flow Fixes #4046 (#4263) — committed to argoproj/argo-workflows by sarabala1979 4 years ago
- fix(controller): Synchronization lock didn't release on DAG call flow Fixes #4046 (#4263) Signed-off-by: Alex Capras <alexcapras@gmail.com> — committed to alexcapras/argo by sarabala1979 4 years ago
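The commits above concern the DAG call flow, and the report notes the same behavior with DAG, but no DAG manifest was posted. A minimal sketch of a DAG equivalent of the reported steps workflow might look like this. It mirrors the template names and semaphore reference from the report; the generateName and the use of a single fanned-out task are assumptions made for illustration.

```yaml
# Sketch only: DAG variant of the reported workflow, not the reporter's actual manifest.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: synchronization-tmpl-level-dag-   # hypothetical name
spec:
  entrypoint: synchronization-tmpl-level-example
  parallelism: 3                                  # same value as in the reported spec
  templates:
  - name: synchronization-tmpl-level-example
    dag:
      tasks:
      - name: synchronization-acquire-lock
        template: acquire-lock
        arguments:
          parameters:
          - name: seconds
            value: '{{item}}'
        withParam: '["1","2","3","4","5"]'        # expands into five tasks
  - name: acquire-lock
    synchronization:
      semaphore:
        configMapKeyRef:
          name: workflow-synchronization
          key: template
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["sleep 10; echo acquired lock"]
```

If the lock is released correctly, all five expanded tasks should eventually reach Succeeded, with at most as many running at once as the semaphore limit allows.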
Restarting the controller seems like the only way, unfortunately. Will try to get a fix out soon.
I am currently working on a fix and refactor of the code, as I've found multiple issues with it.
To be clear, this does not make semaphores unusable in general; it only makes them unusable when parallelism is used at the same time.
We use Argo 2.7.1 and also noticed this problem on our cluster, even though we had already removed parallelism from our workflow. Restarting the workflow-controller, as mentioned above, does fix the issue for us.
We think that deleting a running workflow does not free the lock, but we are not 100% sure of that. We also use a workflow GC that deletes a completed workflow 5 minutes after completion, so maybe that can cause some issues?
Yup, this will be included as part of this bug fix
Found the bug, fixing