helm: `helm upgrade --install` doesn't perform an install/upgrade if the first ever install fails
Using `helm upgrade --install` is a nice way to install or upgrade depending on whether the release exists. But it looks like there's a bug in the logic; it's not handling failed installs. In my case the first install failed; then a subsequent attempt wasn't even made, as it crashes out immediately.
Maybe if the last release failed then helm upgrade --install should delete it and install again?
```
$ helm list
NAME  REVISION  UPDATED                   STATUS  CHART            NAMESPACE
foo   2         Wed Jan 17 11:48:08 2018  FAILED  something-0.0.1  default

$ helm upgrade "foo" . --install
Error: UPGRADE FAILED: "foo" has no deployed releases
```
About this issue
- State: closed
- Created 6 years ago
- Reactions: 86
- Comments: 33 (13 by maintainers)
Commits related to this issue
- fix upgrade of broken install: If the first `upgrade --install` results in a state FAILED, you cannot run the same command `upgrade --install` again without a failure. This happens because we are sea... — committed to sorenmat/helm by deleted user 6 years ago
The suggested fix seems completely untenable in an automated system. I definitely don't want everything invoking helm to have to know about "if the first release fails, delete and retry". For one, most of my tooling isn't aware of whether it's an install or an upgrade, or whether it's the first time or the 100th time; it's almost always just running `helm upgrade --install`.

I am not sure why we need the `install` and `upgrade` commands; I only ever use `upgrade --install`, and it seems like a lot of people do the same. I just need one command that does `upgrade --install` and doesn't trip over a failed run. Can we just rename `upgrade --install` to `deploy`, make it truly idempotent, and ditch the other two?

(I'm struggling with a new variant of this problem behavior in 2.8.0. Since upgrading from 2.7.2, if I have a failed install, then `delete --purge` it, and then `upgrade --install` it, I can still get the `Error: UPGRADE FAILED: "xyz" has no deployed releases` error. It seems like `--purge` isn't fully effective in 2.8.0 and tiller has some stuck state not showing in `list --all`. I then have to do an `install` to get tiller back to a state where I can do the usual `upgrade --install` again.)

This was intentional, by design of https://github.com/kubernetes/helm/pull/3097. Basically, diffing against a failed deployment caused undesirable behaviour, most notably this long list of bugs:

If your initial release ends up in a failed state, we recommend purging the release via `helm delete --purge foo` and trying again. After a successful initial release, any subsequent failed releases will be ignored, and helm will do a diff against the last known successful release.

Now that being said, it might be valuable to not perform a diff when no successful releases have been deployed. The experience would be the same as if the user ran `helm install` for the very first time, in the sense that there would be no "current" release to diff against. I'd be a little concerned about certain edge cases though. @adamreese, do you have any opinions on this one?
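Concretely, that recovery path is just the following (a minimal sketch; `foo` is the release from the report and `./chart` stands in for the chart path):

```bash
# Remove the failed release and its recorded history, then start over.
helm delete --purge foo
helm upgrade --install foo ./chart
```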
Thanks @bacongobbler. OK, I understand that list is what is meant by the ledger. And if you still have the ledger, you can use `helm get manifest --revision 123` to see what was deployed that failed? That is certainly useful to preserve. And if we `rollback`, we don't lose that information.

If we had `helm upgrade --install --auto-rollback`, then both the failed deployment and the rollback would be recorded in the ledger and available to operators. That would go a long way toward preventing CI deployments getting into the intractable 'failed' state where `helm upgrade --install` stops working. Failed CI deployments are usually developers injecting typos/mistakes into the deployment system. With `--auto-rollback` they can inspect the `helm` command error message retained in the deployment server log, fix it, and deploy corrected values.

I guess even without the `--auto-rollback` option we could use a wrapper to automatically run `helm rollback` any time `helm upgrade --install` returns a FAILED error, and maybe detect where it is the initial install and `helm delete --purge` instead in those cases. That is, we could fashion a wrapper script to ensure the result of a CI `helm upgrade --install` is always a state where the next CI `helm upgrade --install` will be possible, whilst retaining the ledger information for any failed attempts (at least for releases whose initial install worked).

In short: `helm deploy` = `helm upgrade --install`, plus `helm delete --purge` or `helm rollback` to recover on failure.

With helm currently, you don't know which helm command+options combination to use without inspecting the current state. And for a given helm command you don't know what you are going to get, because it depends on what the current state is. That's not really the declarative desired-state dream ☁️ 💤 😄
In helm 3 we can potentially deprecate `install`/`upgrade`/`--replace`/`--upgrade`/`--force` and replace them all with an idempotent `helm deploy` that either achieves the desired state or leaves the state unchanged. Maybe using an algorithm similar to the above, which, if `helm deploy` fails, rolls back (revision > 1) or deletes+purges (revision = 1) to leave the state as it was before; see the sketch below. The failed manifest would still be available via `helm history`/`get`. And there could even be a `--no-rollback` option for people who want to preserve the deployment in a failed state for investigation.
The option of `helm upgrade --install --force` is getting close, except that rather than rolling back and upgrading, it deletes and replaces failed releases (even for revisions > 1), which makes some people angry over on #3208… 😮 ⚡️ 💥

For right now we can use wrapper scripts or meta-tools like `helmsman`, whose feature list is in part to employ `helm` but mitigate this issue.

I agree with @whereisaaron; it would be nice with a `deploy` command that worked more like `kubectl apply`. It makes automation of Helm much easier too, since you don't have to check for releases existing in some shell script madness 😃

In retrospect, this is a breathtakingly obvious design goal.
Perhaps the solution is to have helm automatically run `helm delete --purge`? Something like:

- the user runs `helm upgrade --install` and the initial release fails
- the user runs `helm upgrade --install` again
- helm runs `helm delete --purge`, then runs `helm upgrade --install` and shows output from that

Perhaps this behavior could be triggered via the `--force` flag, which already has similar behavior for other scenarios.
Good idea, but I don't think we should ever delete the release ledger without the user explicitly asking to remove that data. Operators of Helm will want to learn why the service failed to upgrade from previously failed releases, or deduce failures by collecting that data from the ledger.
I provided a comment earlier in the thread that describes a solution to the issue. It’s similar to your solution in execution, but without the need to delete the entire release ledger. I believe #3437 is attempting to apply that solution as a patch.
Hi, I've created a PR https://github.com/kubernetes/helm/pull/3437 that should fix this issue.
Hi, in our case the initial release did not really fail… it's just that either our application was not completely up when the install timeout elapsed, or some other strange issue that has since been fixed. In any case, the application is running perfectly fine, and thus having to delete it would be a problem for us (we have some persistent storage attached that would also be removed!!).

Is there any workaround to deploy a chart when the initial release 'apparently failed' but is actually OK?
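One workaround sometimes suggested for this "failed but actually healthy" case (untested here and version-dependent, so treat it as an assumption; `myapp` is a hypothetical release name) is to roll back to the failed revision itself, which records a fresh revision without deleting the release or its storage:

```bash
# List revisions to find the one that "failed" (e.g. revision 1).
helm history myapp

# Re-apply that same revision; on success a new DEPLOYED revision is
# recorded, giving subsequent `helm upgrade --install` runs a baseline.
helm rollback myapp 1
```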
Hello there,
I'm using Helm 2.12.2 and still have the issue that helm fails when the first deployment has failed. Is this a regression, maybe?
https://github.com/helm/helm/issues/3353#issuecomment-362497951
and
https://github.com/helm/helm/issues/3353#issuecomment-469109854
I generally agree that helm should work like `kubectl apply` and attempt to achieve the desired reality rather than needing to run different types of commands depending on the state of your cluster. I was hoping to add support to a dedicated issue if one existed, or at least figure out what the resolution was, since `deploy` is not currently implemented and we're on helm 3.2.

@RickS-C137 I think this is supposed to be fixed by using `helm upgrade --install --force`, which will 'delete' then 'install --replace' a failed release.

My local development would go much smoother if I could make `helm upgrade -i` be idempotent even against failed releases, for at least some combination of arguments. My use case is when I have a script of many releases that I know I want to bring up to start a local development env.

This might be analogous to the `--replace` flag for `helm install`. Note that `--replace` is one of only two flags from `helm install` that is missing in `helm upgrade`, the other being `--name-template`.

@hickeyma done https://github.com/helm/helm/issues/8415!
I’d also like to call out that I commented on the original PR https://github.com/kubernetes/helm/pull/3097#discussion_r151808599 asking specifically about this case.
I implemented this in our build:
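A minimal sketch of that kind of build step, assuming Helm 2 and hypothetical `RELEASE`/`CHART` variables:

```bash
# If the release exists but has never successfully deployed, purge it
# first so that `helm upgrade --install` can succeed.
if helm status "$RELEASE" >/dev/null 2>&1 && \
   ! helm history "$RELEASE" | grep -q DEPLOYED; then
  helm delete --purge "$RELEASE"
fi
helm upgrade --install "$RELEASE" "$CHART"
```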
@whereisaaron that would be elegant 👍