argo-workflows: failed to terminate or stop with PNS executor
Summary
When I start a workflow with the PNS executor, I am unable to stop or terminate it. To reproduce, I run a simple sleep task. The Argo version is 2.10.2 (I also tried 2.9) and my container runtime is containerd.
I think it's because the wait container terminates right away:
time="2020-09-17T13:23:10.705Z" level=info msg="Starting Workflow Executor" version=v2.10.2
time="2020-09-17T13:23:10.709Z" level=info msg="Creating PNS executor (namespace: argo, pod: lovely-dog, pid: 6, hasOutputs: false)"
time="2020-09-17T13:23:10.709Z" level=info msg="Executor (version: v2.10.2, build_date: 2020-09-14T17:38:28Z) initialized (pod: argo/lovely-dog) with template:\n{\"name\":\"dogprint\",\"arguments\":{},\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"main\",\"image\":\"us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1\",\"command\":[\"sh\",\"-c\"],\"args\":[\"sleep 200\"],\"resources\":{}},\"serviceAccountName\":\"argo\"}"
time="2020-09-17T13:23:10.709Z" level=info msg="Waiting on main container"
time="2020-09-17T13:23:10.866Z" level=info msg="main container started with container ID: cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b"
time="2020-09-17T13:23:10.866Z" level=info msg="Starting annotations monitor"
time="2020-09-17T13:23:10.868Z" level=info msg="Starting deadline monitor"
time="2020-09-17T13:23:10.869Z" level=info msg="containerID cri-containerd-cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b mapped to pid 17"
time="2020-09-17T13:23:10.869Z" level=warning msg="Ignoring wait failure: Failed to determine pid for containerID cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b: container may have exited too quickly. Process assumed to have completed"
time="2020-09-17T13:23:10.869Z" level=info msg="Main container completed"
time="2020-09-17T13:23:10.869Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2020-09-17T13:23:10.869Z" level=info msg="Capturing script exit code"
time="2020-09-17T13:23:10.869Z" level=info msg="Getting exit code of cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b"
time="2020-09-17T13:23:10.869Z" level=info msg="Annotations monitor stopped"
time="2020-09-17T13:23:10.872Z" level=info msg="No output parameters"
time="2020-09-17T13:23:10.872Z" level=info msg="No output artifacts"
time="2020-09-17T13:23:10.872Z" level=info msg="Killing sidecars"
time="2020-09-17T13:23:10.876Z" level=info msg="Alloc=5931 TotalAlloc=14287 Sys=70080 NumGC=4 Goroutines=9"
In particular, this log line looks suspicious:
“Ignoring wait failure: Failed to determine pid for containerID cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b: container may have exited too quickly. Process assumed to have completed”
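One detail in the log may be relevant: the pid is recorded for "cri-containerd-cad3533a...", while the failing lookup uses the bare ID "cad3533a...". Whether this mismatch is the actual cause is only a guess, and the sketch below is not Argo's code. It is a minimal Python illustration of how a lookup keyed on the prefixed name would miss, and how normalizing the name would avoid that. The helper normalize_container_id and the ".scope" suffix handling are assumptions made for illustration.

```python
# Hypothetical sketch, not Argo's actual implementation.
RUNTIME_PREFIX = "cri-containerd-"  # prefix seen in the log above
SCOPE_SUFFIX = ".scope"             # assumption: systemd-style cgroup names


def normalize_container_id(name: str) -> str:
    """Reduce a cgroup-derived container name to the bare container ID."""
    name = name.rsplit("/", 1)[-1]  # keep only the last path segment
    if name.endswith(SCOPE_SUFFIX):
        name = name[: -len(SCOPE_SUFFIX)]
    if name.startswith(RUNTIME_PREFIX):
        name = name[len(RUNTIME_PREFIX):]
    return name


BARE_ID = "cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b"

# The pid is recorded under the prefixed name, as in "mapped to pid 17".
pids_raw = {RUNTIME_PREFIX + BARE_ID: 17}
# The same mapping, keyed by the normalized ID instead.
pids_normalized = {normalize_container_id(k): v for k, v in pids_raw.items()}

print(pids_raw.get(BARE_ID))         # None: lookup by bare ID misses
print(pids_normalized.get(BARE_ID))  # 17: lookup by bare ID finds the pid
```

If something like this is happening, the executor would conclude that the main container had already exited, which would match the "Process assumed to have completed" message.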
Any ideas?
I also checked with the kubelet executor. In that case the wait container is OK and the kill command is forwarded, but the pod is not killed (kill timeout).
Thanks, sigfrid
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 18 (9 by maintainers)
Just tested the latest release: everything works as expected. Thank you, @cy-zheng! @alexec, feel free to close the issue.