pipelines: "Terminate" does not stop pipeline run execution
What steps did you take:
Hitting “Terminate” does not actually stop pipeline execution if any step has a retryStrategy or a running init container.
What happened:
We are running a pipeline that has roughly 50 parallel step executions. To stop a pipeline run, we have to delete the Kubernetes Workflow resource rather than use the “Terminate” UI button, because the button does not stop the run if:
- There are steps = containers with retryStrategy, as they are restarting instead of being terminated
- If there are any init containers, the step won’t be stopped until they finish and the main one takes over
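For reference, the workaround of deleting the Workflow resource could be scripted roughly as below. This is only a sketch: the helper names, the default `kubeflow` namespace, and the example workflow name are assumptions for illustration, and it presumes `kubectl` is on the PATH and the Argo Workflow CRD is installed in the cluster.

```python
import subprocess
from typing import List


def build_delete_workflow_cmd(workflow_name: str, namespace: str = "kubeflow") -> List[str]:
    """Return the kubectl argv that deletes one Workflow resource.

    The helper name and the default namespace are hypothetical; adjust them
    to match where your pipeline runs are actually created.
    """
    if not workflow_name:
        raise ValueError("workflow_name must be non-empty")
    return ["kubectl", "delete", "workflow", workflow_name, "--namespace", namespace]


def delete_workflow(workflow_name: str, namespace: str = "kubeflow") -> None:
    """Run the delete command; raises CalledProcessError if kubectl fails."""
    subprocess.run(build_delete_workflow_cmd(workflow_name, namespace), check=True)


# Only builds the command here; call delete_workflow(...) to actually run it.
cmd = build_delete_workflow_cmd("my-pipeline-run-abc12")
print(" ".join(cmd))
```

Building the argv separately from running it makes it easy to inspect the command before anything is deleted from the cluster.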
What did you expect to happen:
All pipeline steps/pods are terminated the instant the user hits “Terminate”, with no retries, as if the corresponding Workflow had been deleted
How did you deploy Kubeflow Pipelines (KFP)?
Kubeflow Pipelines are installed as a part of Kubeflow project
KFP version: Build commit: 743746b
KFP SDK version: 0.5.1
Anything else you would like to add:
Please let me know if additional info is required
/kind bug
/area frontend
/area backend
About this issue
- State: open
- Created 4 years ago
- Reactions: 9
- Comments: 26 (13 by maintainers)
Creating a minimal pipeline would require a lot of GitHub archeology and recurrent scheduling, and would probably not reproduce the error, since the Kubeflow cluster has since been upgraded.
What worked in the end was to remove the row that described the pipeline run from the MySQL database. [Screenshot of the database]
In the table run_details, delete the row with the same UUID as seen in the Kubeflow UI.
Sure, code is in https://github.com/kubeflow/pipelines/blob/d6a2c23f56943ea8af35b5b2e5f6c6381bfb25ed/backend/src/apiserver/resource/resource_manager.go#L462
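As a rough illustration of the run_details cleanup described earlier in this thread, the sketch below deletes one row by UUID with a parameterized statement. It runs against an in-memory SQLite database as a stand-in for the real MySQL instance. The column names (`UUID`, `DisplayName`), the sample values, and the helper name are assumptions, so check the actual schema and back up the database before trying anything similar.

```python
import sqlite3


def delete_run_row(conn: sqlite3.Connection, run_uuid: str) -> int:
    """Delete the run_details row for run_uuid; return the number of rows removed.

    Note: sqlite3 uses '?' placeholders; MySQL drivers typically use '%s'.
    """
    cur = conn.execute("DELETE FROM run_details WHERE UUID = ?", (run_uuid,))
    conn.commit()
    return cur.rowcount


# In-memory SQLite stand-in with an assumed, simplified schema
# (not the real KFP MySQL table definition).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE run_details (UUID TEXT PRIMARY KEY, DisplayName TEXT)")
conn.execute("INSERT INTO run_details VALUES ('1111-aaaa', 'stuck run')")
conn.execute("INSERT INTO run_details VALUES ('2222-bbbb', 'healthy run')")

# Remove the row whose UUID matches the run shown in the UI.
removed = delete_run_row(conn, "1111-aaaa")
remaining = [row[0] for row in conn.execute("SELECT UUID FROM run_details")]
print(removed, remaining)
```

Passing the UUID as a bound parameter rather than formatting it into the SQL string avoids quoting mistakes when pasting identifiers from the UI.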