helm: `helm uninstall --wait` doesn't actually wait for pods to be terminated
I’m running `helm uninstall --wait` for two charts (first uninstalling the redis client, then the redis server). However, the `--wait` flag does not actually wait for the pods to be terminated (e.g. `helm uninstall --wait` returns while the pods are still in Terminating state). As a result, the client errors with “server is gone” even though I run `helm uninstall --wait client && helm uninstall --wait server` (note: the server uninstall completes before the client’s pods have finished terminating).
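To make the race concrete, here is a minimal reproduction sketch (the label selector is my assumption about how the charts label their pods; it is not from the report):

```shell
# Reproduction sketch of the race between the two uninstalls.
helm uninstall client --wait
# --wait has already returned, yet this can still list pods in Terminating:
kubectl get pods -l app.kubernetes.io/instance=client
# The server's pods can then disappear before the client's pods do,
# which is when the still-terminating client logs "server is gone".
helm uninstall server --wait
```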
Output of `helm version`:
version.BuildInfo{Version:"v3.7.2", GitCommit:"663a896f4a815053445eec4153677ddc24a0a361", GitTreeState:"clean", GoVersion:"go1.17.3"}
Output of `kubectl version`:
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T17:56:19Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-0389ca3", GitCommit:"8a4e27b9d88142bbdd21b997b532eb6d493df6d2", GitTreeState:"clean", BuildDate:"2021-07-31T01:34:46Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
Cloud Provider/Platform (AKS, GKE, Minikube etc.):
AWS
Commits related to this issue
- Add option to support foreground cascade deletion: adds a `--cascade` option to `helm uninstall` as proposed in https://github.com/helm/helm/issues/10586. When this option is specified, deletion towards kubern... — committed to MichaelMorrisEst/helm by MichaelMorrisEst 2 years ago
- Add option to support cascade deletion options: adds a `--cascade=<background|foreground|orphan>` option to `helm uninstall`. Current behaviour is hardcoded to background. Addresses issue: https://github.com/... — committed to MichaelMorrisEst/helm by MichaelMorrisEst 2 years ago
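Based on the commit messages above, usage should look roughly like this (a sketch; the flag syntax is taken from the second commit message and may differ in a released Helm version):

```shell
# Sketch of the proposed flag from the commits above. With foreground
# propagation, the API server deletes an owner object only after all of its
# dependents are gone, so --wait has something meaningful to block on.
helm uninstall client --cascade=foreground --wait
helm uninstall server --cascade=foreground --wait
```

Foreground cascading deletion is implemented by Kubernetes itself via the `foregroundDeletion` finalizer, so the owner resource stays visible until its dependents have been removed.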
I disagree with @joejulian’s opinion regarding preserving the existing behavior. My view is: this behavior is a bug, as it does not work the way anyone would expect, and it also does not work as documented: “It removes all of the resources associated with the last release of the chart as well as the release history, freeing it up for future use. […] will wait until all the resources are deleted before returning”.
If we assume people rely on wrong behavior, every bugfix would be a breaking change. In this case, nobody could really rely on the pods still running after uninstall anyway, because even that has a timing component and would often not hold.
I don’t think any effort should be invested in preserving the current misbehavior while forcing everyone to use an additional parameter to get a meaningful wait operation; you might as well call it `--wait-really`. Who would then use `--wait` without that additional parameter? If you want to wait until all resources are deleted, you want exactly that, not that most of them are deleted.
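In the meantime, the only way I know to get those “wait until everything is deleted” semantics is an explicit wait outside Helm; a minimal sketch, assuming the chart applies the standard `app.kubernetes.io/instance` label to its pods:

```shell
# Interim workaround sketch: block until the release's pods are really gone
# before touching the server release. Depending on the kubectl version,
# `kubectl wait --for=delete` may error if no pods match the selector.
helm uninstall client --wait
kubectl wait pod -l app.kubernetes.io/instance=client --for=delete --timeout=120s
helm uninstall server --wait
```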
Hi @joejulian, I am interested in contributing a fix for this issue, as per your suggestion. I will create a PR with the code changes. This is my first contribution to the Helm project, so if I am missing anything in terms of the contribution process, please let me know.
Due to this bug, the following scenario does not work:
1. Install a chart that creates a PVC.
2. `helm uninstall --wait` the release.
3. Install the chart again.
The PVC will not be re-created, since it is still terminating in step 3, even though it should have been fully terminated and removed by the `--wait` parameter in step 2. The final result is that the newly created pod stays in Pending status forever because the PVC is missing.
Just like this Stack Overflow issue describes.
Please correct my understanding if anything is wrong.
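As a guard for that reinstall scenario, one can wait for the PVC itself to disappear between steps 2 and 3; a sketch, where the release name, PVC name, and chart path are placeholders rather than values from the issue:

```shell
# Guard sketch: do not reinstall until the PVC from the old release is gone.
helm uninstall redis --wait
kubectl wait pvc/data-redis-0 --for=delete --timeout=120s  # placeholder PVC name
helm install redis ./redis-chart                           # placeholder chart path
```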