argo-workflows: Mutex not being released on step completion
Summary
When a variable is used within the mutex name, mutexes aren't released until the workflow-controller is restarted. In the example below, the workflow never completes and stays pending, waiting on the final unlock.
Additionally, all other workflows are unable to continue past the initially locked mutex (from gen-number-list).
Restarting the workflow-controller brings the workflow to life and all locks are released.
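For readers without access to the linked gist, the pattern described above might look roughly like the following. This is a hypothetical minimal sketch, not the reporter's actual workflow: the template names, the `lock-id` parameter, and the image are assumptions made for illustration, and it has not been verified to reproduce the bug.

```yaml
# Hypothetical sketch only; NOT the workflow from the linked gist.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: mutex-variable-name-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: locked-step
            template: locked
            arguments:
              parameters:
                - name: lock-id          # assumed parameter name
                  value: "{{item}}"
            withItems: ["a", "b"]
    - name: locked
      inputs:
        parameters:
          - name: lock-id
      synchronization:
        mutex:
          # Mutex name built from a variable, the pattern the report describes.
          name: "lock-{{inputs.parameters.lock-id}}"
      container:
        image: alpine:3.12
        command: [sh, -c]
        args: ["echo holding lock-{{inputs.parameters.lock-id}}; sleep 5"]
```

If the behaviour in the report applies, a step like `locked-step` would take the lock but the lock would not be released once the step finishes.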
Diagnostics
GKE - v1.17.12-gke.1504
Argo version: v2.12.3
Workflow: https://gist.github.com/davidcollom/9c6f7d9d1819fe922d21b3a69e561754#file-workflow-yaml
Logs: https://gist.github.com/davidcollom/9c6f7d9d1819fe922d21b3a69e561754#file-logs
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 6
- Comments: 24 (24 by maintainers)
@davidcollom Thanks for your update and help. The PR is already in review. It will be included in the next release.
@davidcollom I have a fix, but I couldn't reproduce the issue locally. I built a Docker image:
docker pull sarabala1979/workflow-controller:mutex
Is it possible to try it in your environment and let me know the details and logs?
Let me verify that. I am able to reproduce locally. I am investigating.
I will look