argo-workflows: Mutex not being released on step completion

Summary

When a variable is used within the mutex name, mutexes aren't released until the workflow-controller is restarted. In the example below, the workflow never completes and stays pending, waiting on the final unlock.

Additionally, no other workflow is able to continue past the initial mutex lock (from gen-number-list).

Restarting the workflow-controller brings the workflow to life and all locks are released.
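
For readers without access to the gist, a minimal sketch of the kind of workflow described above might look like the following. This is an illustrative assumption, not the reporter's actual workflow: only the gen-number-list step name comes from this report, while the other template names, the parameter name id, the mutex names, and the images are hypothetical.

```yaml
# Hypothetical sketch only; the real workflow is in the gist linked under Diagnostics.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: mutex-variable-name-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        # Step 1: produce a JSON list to fan out over.
        - - name: gen-number-list
            template: gen-number-list
        # Step 2: one parallel step per list item.
        - - name: process
            template: process-item
            arguments:
              parameters:
                - name: id
                  value: "{{item}}"
            withParam: "{{steps.gen-number-list.outputs.result}}"

    - name: gen-number-list
      # Assumed: a template-level mutex held while the list is generated.
      synchronization:
        mutex:
          name: gen-number-list
      script:
        image: python:alpine3.6
        command: [python]
        source: |
          import json
          import sys
          json.dump([i for i in range(3)], sys.stdout)

    - name: process-item
      inputs:
        parameters:
          - name: id
      # The mutex name contains a variable, which is the case this report is about.
      synchronization:
        mutex:
          name: "process-{{inputs.parameters.id}}"
      container:
        image: alpine:3.12
        command: [sh, -c]
        args: ["echo processing {{inputs.parameters.id}}; sleep 5"]
```

In a sketch like this, the behaviour reported above would show up as steps staying pending on a mutex that is never released until the workflow-controller is restarted.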

Diagnostics

GKE - v1.17.12-gke.1504

Argo version: v2.12.3

Workflow: https://gist.github.com/davidcollom/9c6f7d9d1819fe922d21b3a69e561754#file-workflow-yaml

Logs: https://gist.github.com/davidcollom/9c6f7d9d1819fe922d21b3a69e561754#file-logs


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 6
  • Comments: 24 (24 by maintainers)

Most upvoted comments

@davidcollom Thanks for your update and help. The PR is already in review. It will be included in the next release.

@davidcollom I have a fix, but I couldn't reproduce the issue locally. I built a Docker image: docker pull sarabala1979/workflow-controller:mutex. Could you try it in your environment and let me know the details and logs?

Let me verify that. I am able to reproduce locally. I am investigating.

I will look.