helm: Helm v3.4 Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

After upgrading from Helm 3.3 to Helm 3.4, existing charts started failing to upgrade with the message: Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

At the same time, the chart disappeared from helm list -n myns and didn't show up in the list at all.

This is a chart that has been upgraded successfully over 800 times; the only change was the Helm version bump. The chart failed twice when attempting to deploy with the command:

helm upgrade --install --namespace myns --timeout 1800s --atomic mychart charts/app/standalone --values values-override.yaml

Once I rolled back to 3.3 I was able to upgrade the chart successfully.

Output of helm version:

version.BuildInfo{Version:"v3.4.0", GitCommit:"7090a89efc8a18f3d8178bf47d2462450349a004", GitTreeState:"dirty", GoVersion:"go1.15.3"}

Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.13", GitCommit:"39a145ca3413079bcb9c80846488786fed5fe1cb", GitTreeState:"clean", BuildDate:"2020-07-15T16:18:19Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-gke.401", GitCommit:"eb94c181eea5290e9da1238db02cfef263542f5f", GitTreeState:"clean", BuildDate:"2020-09-09T00:57:35Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}

Cloud Provider/Platform (AKS, GKE, Minikube etc.): GKE

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 33
  • Comments: 51 (8 by maintainers)

Most upvoted comments

Hi @bacongobbler, the described workaround did indeed work!

$ helm history kyc-api
REVISION	UPDATED                 	STATUS         	CHART             	APP VERSION	DESCRIPTION             
1       	Mon Nov  9 14:57:36 2020	pending-install	generic-base-0.2.1	0.1.0      	Initial install underway

$ helm rollback kyc-api 1
Rollback was a success! Happy Helming!

 $ helm history kyc-api
REVISION	UPDATED                 	STATUS         	CHART             	APP VERSION	DESCRIPTION             
1       	Mon Nov  9 14:57:36 2020	pending-install	generic-base-0.2.1	0.1.0      	Initial install underway
2       	Mon Nov  9 15:06:15 2020	deployed       	generic-base-0.2.1	0.1.0      	Rollback to 1  

Looking at the werf/helm PR pretty much confirms that CTRL+C breaks the helm installation on 3.4.0.

Experienced this on Helm v3.5.2 caused by CTRL+C pressed during upgrade. Workaround: kubectl delete secret sh.helm.release.v1.<RELEASE_NAME>.v<LATEST_REVISION>
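If it's not obvious which revision is stuck, a quick way to check (a sketch; <NAMESPACE>, <RELEASE_NAME> and <LATEST_REVISION> are placeholders) is to list the release secrets Helm keeps and look at their status labels before deleting anything:

# Helm 3 stores each revision as a Secret labelled with owner, name, version and status
kubectl get secrets -n <NAMESPACE> -l owner=helm,name=<RELEASE_NAME> --show-labels

# Delete only the latest revision, the one stuck in a pending-* status
kubectl delete secret sh.helm.release.v1.<RELEASE_NAME>.v<LATEST_REVISION> -n <NAMESPACE>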

I also have the same issue described on this thread with v3.4.1

helm rollback can be a workaround for development machines, but it is unacceptable in production CI/CD pipelines.

When the helm CLI receives a SIGTERM signal, it should exit gracefully, leaving the Helm labels in a stable state and allowing further deployments without issues.

The issue is not fixed yet and should be reopened for further research.

The problem is that helm rollback && helm upgrade is not a suitable solution for production deployments.

I can reproduce this with helm upgrade --install --atomic by interrupting it during execution. The second run will always return an error:

Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

Can be solved by:

helm rollback
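
Spelled out with the release and namespace from the original report as placeholders, the rollback workaround looks roughly like this:

helm history mychart -n myns                        # the stuck revision shows up as pending-install/pending-upgrade
helm rollback mychart <LAST_GOOD_REVISION> -n myns
helm upgrade --install mychart charts/app/standalone -n myns --values values-override.yaml   # then re-run the original upgrade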

I guess this proposal could solve this issue https://github.com/helm/helm/issues/8040

same on v3.5.4, fixed by

kubectl delete secret sh.helm.release.v1.asdf.v1 -n asdf

I have the same issues. I’ll try to get a debug output if possible.

I noticed this issue usually happens when you upgrade a helm chart with --wait and the upgrade clearly fails (a CrashLoopBackOff or something like that): helm waits until it reaches the timeout, but the user presses CTRL+C before the timeout is reached. After that I get the same error as posted above:

STDERR:
  Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress

I'm using helmfile, not helm directly; maybe it's a separate problem with helmfile not sending the SIGTERM correctly.

Run 'helm history --all'; the job is probably pending, and you'll have to roll back to the last successful deployment.

On Nov 29, 2020, at 5:39 PM, Victor Login notifications@github.com wrote:

my out:

helm upgrade shortlink-api ops/Helm/shortlink-api --install --wait --namespace=shortlink --set deploy.image.tag=0.7.0.16 --debug -v 6

history.go:53: [debug] getting history for release shortlink-api
upgrade.go:121: [debug] preparing upgrade for shortlink-api
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
helm.go:81: [debug] another operation (install/upgrade/rollback) is in progress
helm.sh/helm/v3/pkg/action.init
    /home/circleci/helm.sh/helm/pkg/action/action.go:62
runtime.doInit
    /usr/local/go/src/runtime/proc.go:5474
runtime.doInit
    /usr/local/go/src/runtime/proc.go:5469
runtime.main
    /usr/local/go/src/runtime/proc.go:190
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1373
UPGRADE FAILED
main.newUpgradeCmd.func2
    /home/circleci/helm.sh/helm/cmd/helm/upgrade.go:156
github.com/spf13/cobra.(*Command).execute
    /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:842
github.com/spf13/cobra.(*Command).ExecuteC
    /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950
github.com/spf13/cobra.(*Command).Execute
    /go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
main.main
    /home/circleci/helm.sh/helm/cmd/helm/helm.go:80
runtime.main
    /usr/local/go/src/runtime/proc.go:203
runtime.goexit
    /usr/local/go/src/runtime/asm_amd64.s:1373

helm ls --all

NAME	NAMESPACE	REVISION	UPDATED	STATUS	CHART	APP VERSION

GitLab CI Pipeline Job: https://gitlab.com/shortlink-org/shortlink/-/jobs/879030237


Why is this issue still closed? Repro, on an empty cluster:

helm upgrade --install --version=3.13.0 --create-namespace --namespace ingress-nginx-2 --set controller.kind=DaemonSet --set controller.service.type=LoadBalancer --set controller.service.loadBalancerIP=127.0.0.1 ingress-nginx-2 ingress-nginx/ingress-nginx

and ^C it

This problem happens because skaffold/helm was interrupted; I fixed it by deleting the broken namespace.

“fixed” by deleting all helm secrets

I still think there is a regression in 3.4.0 since I never had this issue before.

@bacongobbler I haven't had luck reproducing the issue. I've just bumped to 3.4.1 and upgraded the same deployment that previously failed under 3.4.0, so I'll assume the issue is resolved unless I see something else. Thanks for everything.

On v3.3.4 such a case was handled fine (see the picture attached). We use Helm in GitLab CI, and job cancellation became a problem after upgrading to 3.4.0; v3.5.1 has the same issue too.

same on v3.5.4, fixed by

kubectl delete secret sh.helm.release.v1.asdf.v1 -n asdf

This works for me. Thanks! @okunc

same on v3.5.4, fixed by

kubectl delete secret sh.helm.release.v1.asdf.v1 -n asdf

Thank you, deleting the last secret in the list fixed it in my case too. I'm also on helm v3.5.4 and using the fluxcd helm-controller; to bring it back completely, after deleting that secret I also had to run:

flux resume helmrelease asdf -n asdf

And now all the failed / stuck releases are working again!
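
Put together, the flux recovery looks roughly like this (a sketch using the same asdf release/namespace as above; <REV> stands for the stuck revision and is a placeholder):

kubectl delete secret sh.helm.release.v1.asdf.v<REV> -n asdf   # drop the secret for the pending revision
flux resume helmrelease asdf -n asdf                           # let helm-controller pick the release back up
flux get helmreleases -n asdf                                  # optional: confirm it reconciles and becomes Ready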

helm history -n (ns) xxx
helm rollback -n (ns) xxx (num)

“fixed” by deleting all helm secrets

You don't need to delete all the Helm secrets, only the last one. That sounds like a workaround, though, not a fix.

Same in 3.8.1; it still happens when killing helm upgrade --install hard enough.

Workaround:

  1. Write the secret representing the release out to a file:
kubectl get secret -n <whatever> sh.helm.release.v1.<name>.v<version> -o yaml > release.yaml
  2. Unpack the secret so you can manually edit it:
cat release.yaml | yq .data.release | base64 -d | base64 -d | gunzip > release-contents.json
  3. Edit the unpacked JSON document, changing the "description" key to "Deployed" and the "status" key to the value "deployed".
  4. Repack the JSON document:
cat release-contents.json | gzip | base64 -w 0 > newreleasevalue
  5. Edit the file release.yaml, changing the status label to deployed and the data.release key to have as its value the contents of the file newreleasevalue created in step 4.
  6. Update the secret using the file release.yaml which you just updated:
kubectl apply -n <whatever> -f release.yaml
  7. Run helm list to verify that your release shows up in the list of deployed releases.
  8. Now you can re-run your helm command and it won't die.

Getting this error when running kubectl apply after following those steps:

Warning: resource secrets/sh.helm.release.v1.nextcloud.v11 is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
The Secret "sh.helm.release.v1.nextcloud.v11" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

Also, the command you provided for repacking the release contents only encodes it once; I believe it should do that twice. The yq command also gives output with quotes around it. I used these commands instead:

cat release.yaml | yq .data.release | sed 's/"//g' | base64 -d | base64 -d | gunzip > release-contents.json
cat release-contents.json | gzip | base64 -w 0 | base64 -w 0 > newreleasevalue

UPDATE: found a workaround. Use kubectl edit to apply the changes instead of kubectl apply. Make sure you know how to paste in your terminal editor. I'd recommend just commenting out the old line instead of deleting it by hand; it does the same thing.
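
For reference, a consolidated sketch of the corrected secret-editing workaround (release name, namespace and revision are placeholders; assumes GNU base64 for the -w flag):

SECRET=sh.helm.release.v1.<RELEASE_NAME>.v<REVISION>

# Decode the release payload (base64 twice: once for the Secret data field, once for Helm's own encoding)
kubectl get secret "$SECRET" -n <NAMESPACE> -o jsonpath='{.data.release}' | base64 -d | base64 -d | gunzip > release-contents.json

# Edit release-contents.json by hand: set the status to "deployed" and adjust the description

# Re-encode (gzip, then base64 twice) and patch the secret, including the status label Helm filters on
NEW=$(gzip -c release-contents.json | base64 -w 0 | base64 -w 0)
kubectl patch secret "$SECRET" -n <NAMESPACE> --type merge -p "{\"metadata\":{\"labels\":{\"status\":\"deployed\"}},\"data\":{\"release\":\"$NEW\"}}"

# The release should show up as deployed again
helm list -n <NAMESPACE>

Using kubectl patch (like kubectl edit, as noted above) also sidesteps the "metadata.annotations: Too long" error, since it doesn't write the last-applied-configuration annotation that kubectl apply does.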

Curiously, the way we currently fix it in our cluster is by running the same upgrade command with Helm version 3.2.1 (the Helm plugin for our CI uses 3.6.2, previously 3.2.4, which had no such issues). Not only does it go through just fine, but after that it works again with the current version.

v3.6.3 brought me here, same behavior.

@VengefulAncient - Can you provide some more clarity on your fix please? Does v3.6.2 fix this issue? FYI - We are using v3.4.2 on our CI and seeing this issue. Thanks!

Sure. v3.6.2 is the problematic version. It's the older version, v3.2.4 (not v3.4.2 - AFAIK that one is already affected by the bug), that gets around it. I run exactly the same helm upgrade --install command as our CI from my local machine using this older version when this error comes up, and it updates the release just fine - and our CI with the newer Helm version also works afterwards.

If it were that easy it would’ve been fixed by now.

But by all means, if you know how to fix it, we'd welcome a pull request. There are instructions in the documentation for getting started.

I wonder what happened from v3.3 to v3.4 that caused this issue

Had this issue because I tried to cancel a helm deployment from the command line. The workaround suggested by @Skaronator using a rollback got me past the error. Helm history looks like this now:

35              Thu Mar 18 10:09:15 2021        superseded      wordpress-0.1.6 5.4.2           Upgrade complete 
36              Thu Mar 18 10:32:16 2021        superseded      wordpress-0.1.6 5.4.2           Rollback to 34   
37              Thu Mar 18 10:42:48 2021        pending-upgrade wordpress-0.1.6 5.4.2           Preparing upgrade
38              Thu Mar 18 10:48:11 2021        superseded      wordpress-0.1.6 5.4.2           Rollback to 36   
39              Thu Mar 18 10:49:02 2021        deployed        wordpress-0.1.6 5.4.2           Upgrade complete 

The OP determined his issue was a duplicate of #4558. As #4558 describes, there are a few cases where a helm upgrade can enter the PENDING_UPGRADE state in the event of a timeout. A helm rollback && helm upgrade resolves the issue; hence why it was closed as a duplicate of #4558 (the symptoms and the workaround are identical).

If you do not believe you are experiencing the same issue as the OP, please open a new ticket.

I just tried it with

▶ helm3 version --short
v3.5.1+g32c2223

the end result is the same.

Why was this issue closed?