argo-workflows: Race conditions when persisting Workflows causing undefined behavior

Users are reporting:

Fabio Rigato: “stuck with deleted pods”

  • v2.9.2 workflows can get stuck with deleted pods: `Error from server (NotFound): pods "..." not found`. There are two changes in v2.9.3, #3097 and #3469, that could fix this.

Prateek Khera: “deadline exceeded”

  • v2.9.3 workflows stuck in progressing, with these log lines: `level=warning msg="Deadline exceeded" namespace=XXXX workflow=XXX` and `level=error msg="error in entry template execution" error="Deadline exceeded" namespace=ci workflow=XXX`

Jonathan Steele: “stuck workflow parent task”

  • I upgraded to 2.9.4 and am now experiencing “stuck” workflows where all of the tasks are complete, but the parent task is still running.

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 2
  • Comments: 15 (15 by maintainers)

Most upvoted comments

Thanks @simster7, I created a new issue (#3857). Cheers, Fabio

Thanks for the detailed reports, @fabio-rigato.