argo-workflows: failed to terminate or stop with PNS executor

Summary

When I start a workflow with the PNS executor, I am unable to stop or terminate it. To reproduce this, I run a simple sleep task. The Argo version is 2.10.2 (I also tried 2.9), and my container runtime is containerd.

I think this is because the wait container is terminating:

time="2020-09-17T13:23:10.705Z" level=info msg="Starting Workflow Executor" version=v2.10.2
time="2020-09-17T13:23:10.709Z" level=info msg="Creating PNS executor (namespace: argo, pod: lovely-dog, pid: 6, hasOutputs: false)"
time="2020-09-17T13:23:10.709Z" level=info msg="Executor (version: v2.10.2, build_date: 2020-09-14T17:38:28Z) initialized (pod: argo/lovely-dog) with template:\n{\"name\":\"dogprint\",\"arguments\":{},\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"main\",\"image\":\"us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1\",\"command\":[\"sh\",\"-c\"],\"args\":[\"sleep 200\"],\"resources\":{}},\"serviceAccountName\":\"argo\"}"
time="2020-09-17T13:23:10.709Z" level=info msg="Waiting on main container"
time="2020-09-17T13:23:10.866Z" level=info msg="main container started with container ID: cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b"
time="2020-09-17T13:23:10.866Z" level=info msg="Starting annotations monitor"
time="2020-09-17T13:23:10.868Z" level=info msg="Starting deadline monitor"
time="2020-09-17T13:23:10.869Z" level=info msg="containerID cri-containerd-cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b mapped to pid 17"
time="2020-09-17T13:23:10.869Z" level=warning msg="Ignoring wait failure: Failed to determine pid for containerID cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b: container may have exited too quickly. Process assumed to have completed"
time="2020-09-17T13:23:10.869Z" level=info msg="Main container completed"
time="2020-09-17T13:23:10.869Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2020-09-17T13:23:10.869Z" level=info msg="Capturing script exit code"
time="2020-09-17T13:23:10.869Z" level=info msg="Getting exit code of cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b"
time="2020-09-17T13:23:10.869Z" level=info msg="Annotations monitor stopped"
time="2020-09-17T13:23:10.872Z" level=info msg="No output parameters"
time="2020-09-17T13:23:10.872Z" level=info msg="No output artifacts"
time="2020-09-17T13:23:10.872Z" level=info msg="Killing sidecars"
time="2020-09-17T13:23:10.876Z" level=info msg="Alloc=5931 TotalAlloc=14287 Sys=70080 NumGC=4 Goroutines=9"

In particular, this log line:

“Ignoring wait failure: Failed to determine pid for containerID cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b: container may have exited too quickly. Process assumed to have completed”
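To make the timing around that warning easier to see, here is a minimal Python sketch that splits pasted logfmt output into entries and measures the gap between "main container started" and "Main container completed". It is only a reading aid for this log, not part of Argo: `parse_entries` and `timestamp` are hypothetical helper names, and the three entries are copied from the wait container log above.

```python
import re
from datetime import datetime

CONTAINER_ID = "cad3533a770168a7dbdbb8387e98863ff8cf294b11891934b92be44843ea305b"

# Three entries copied from the wait container log in this issue,
# joined on one line the way they were pasted.
RAW_LOG = (
    'time="2020-09-17T13:23:10.866Z" level=info '
    f'msg="main container started with container ID: {CONTAINER_ID}" '
    'time="2020-09-17T13:23:10.869Z" level=warning '
    f'msg="Ignoring wait failure: Failed to determine pid for containerID {CONTAINER_ID}: '
    'container may have exited too quickly. Process assumed to have completed" '
    'time="2020-09-17T13:23:10.869Z" level=info msg="Main container completed"'
)

# key=value pairs where the value is either a quoted string or a bare token.
FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_entries(raw):
    """Split a run of logfmt text into one dict per entry (hypothetical helper)."""
    entries = []
    # Each entry begins with time="...", so split just before that key.
    for chunk in re.split(r'\s+(?=time=")', raw.strip()):
        fields = {}
        for key, value in FIELD.findall(chunk):
            if value.startswith('"'):
                value = value[1:-1]  # drop the surrounding quotes
            fields[key] = value
        entries.append(fields)
    return entries


def timestamp(entry):
    """Parse the UTC timestamp of one entry (hypothetical helper)."""
    return datetime.strptime(entry["time"], "%Y-%m-%dT%H:%M:%S.%fZ")


entries = parse_entries(RAW_LOG)
started = next(e for e in entries if e["msg"].startswith("main container started"))
completed = next(e for e in entries if e["msg"] == "Main container completed")
warnings = [e for e in entries if e["level"] == "warning"]
elapsed_ms = (timestamp(completed) - timestamp(started)).total_seconds() * 1000

print(f"{len(warnings)} warning(s); main container 'completed' after {elapsed_ms:.0f} ms")
```

For these entries the gap is 3 ms, although the main container runs `sleep 200`, so the wait container appears to give up on the main container almost immediately after it starts.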

Any ideas?

I also checked with the kubelet executor. In this case the wait container is OK and the kill command is forwarded, but the pod is not killed (kill timeout).

Thx sigfrid

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 18 (9 by maintainers)

Most upvoted comments

Just tested the latest release: everything works as expected. Thank you, @cy-zheng! @alexec, feel free to close the issue.