argo-workflows: Step hangs forever w/ retryStrategy added to templates (possibly related to Lifecycle hooks)
Summary
I compose my DAG based on re-usable templates. I’m able to run the workflow successfully without retryStrategy on the templates, but adding the lines
retryStrategy:
  retryPolicy: OnError
  limit: "3"
causes the DAG to spin forever.
[Screenshots: DAG run without retryStrategy; DAG run with retryStrategy]
Diagnostics
Sample WF spec
- name: LinuxJobBase
  container:
    command: ["bash"]
    args:
      - -c
      - >-
        ls
  retryStrategy:
    limit: '3'
    retryPolicy: OnError
- name: DAG
  dag:
    tasks:
      - name: Python2Compile
        template: LinuxJobBase
        .......
Note that an exit handler is also attached to each of the DAG tasks via hooks:
hooks:
  exit:
    arguments:
      parameters:
        - name: POD_NAME
          value: '{{tasks.Python3UnitTests.outputs.parameters.podname}}'
        - name: NODE_NAME
          value: '{{tasks.Python3UnitTests.outputs.parameters.nodename}}'
        - name: POD_NAMESPACE
          value: '{{tasks.Python3UnitTests.outputs.parameters.podnamespace}}'
        - name: POD_UID
          value: '{{tasks.Python3UnitTests.outputs.parameters.poduid}}'
        - name: IMAGE
          value: '{{tasks.Python3UnitTests.outputs.parameters.image}}'
    template: LinuxExitHandler
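For anyone trying to reproduce this, the fragments above might be assembled into a single Workflow roughly as follows. This is only a sketch under assumptions: the container image, the `generateName` value, and the body of `LinuxExitHandler` are hypothetical and not taken from the report, and the exit hook arguments are dropped to keep the example self-contained.

```yaml
# Minimal reproduction sketch. The image, generateName, and the exit handler
# body are assumptions for illustration; they are not from the original report.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: retry-hook-repro-     # hypothetical name
spec:
  entrypoint: DAG
  templates:
    # Re-usable template carrying the retryStrategy from the report.
    - name: LinuxJobBase
      container:
        image: debian:stable-slim     # assumed image; any image with bash would do
        command: ["bash"]
        args:
          - -c
          - >-
            ls
      retryStrategy:
        limit: "3"
        retryPolicy: OnError
    # Stand-in exit handler; the real one takes pod metadata as parameters.
    - name: LinuxExitHandler
      container:
        image: debian:stable-slim     # assumed image
        command: ["bash"]
        args:
          - -c
          - echo "exit hook ran"
    # DAG with one task that uses both the retrying template and an exit hook.
    - name: DAG
      dag:
        tasks:
          - name: Python2Compile
            template: LinuxJobBase
            hooks:
              exit:
                template: LinuxExitHandler
```

Removing the `retryStrategy` block and the `hooks` block in turn should help narrow down whether the hang needs both to be present.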
What Kubernetes provider are you using? Bare Metal
What version of Argo Workflows are you running? 3.1.0
What executor are you running? (Docker/K8SAPI/Kubelet/PNS/Emissary) Emissary
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.
About this issue
- State: closed
- Created 3 years ago
- Reactions: 2
- Comments: 17 (17 by maintainers)
@ad22 Thanks for the info and sorry for the delay… I’ll get a chance to look at this soon