helm: Helm upgrade fails the release after adding a new resource
Output of helm version:
Client: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.6-gke.13", GitCommit:"fcbc1d20b6bca1936c0317743055ac75aef608ce", GitTreeState:"clean", BuildDate:"2019-06-19T20:50:07Z", GoVersion:"go1.11.5b4", Compiler:"gc", Platform:"linux/amd64"}
Scenario:
- Helm install a chart with resources X, Y, and Z (it doesn’t seem to matter which).
- Helm upgrade chart to add resource W (in this case a CronJob) - success
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .
Release "test" has been upgraded.
LAST DEPLOYED: Wed Jul 17 10:14:58 2019
NAMESPACE: myNamespace
STATUS: DEPLOYED
<snip>
==> v1beta1/CronJob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
test-myCron * */1 * * * False 0 <none> 6s
<snip>
- Helm upgrade again, with or without changing anything - failure
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .
UPGRADE FAILED
Error: kind CronJob with the name "test-myCron" already exists in the cluster and wasn't defined in the previous release. Before upgrading, please either delete the resource from the cluster or remove it from the chart
Error: UPGRADE FAILED: kind CronJob with the name "test-myCron" already exists in the cluster and wasn't defined in the previous release. Before upgrading, please either delete the resource from the cluster or remove it from the chart
I can delete the CronJob (or whatever resource was added) with kubectl and repeat steps 2 and 3 with the same results. Adding --debug doesn’t add anything of value.
It seems related to #1193 but, if I’m reading it correctly, in that issue the deployment would have failed in step 2.
About this issue
- State: closed
- Created 5 years ago
- Reactions: 39
- Comments: 32 (9 by maintainers)
Commits related to this issue
- fix(tiller): improve handling of corrupted storage For some reason, many users experince corrupted storage with the ConfigMaps storage backend. Specifically, several Releases are marked as DEPLOYED. ... — committed to cristiklein/helm by cristiklein 4 years ago
- fix(tiller): improve handling of corrupted storage For some reason, many users experince corrupted storage with the ConfigMaps storage backend. Specifically, several Releases are marked as DEPLOYED. ... — committed to helm/helm by cristiklein 4 years ago
That error indicates that you introduced a new resource into the chart, but that resource already existed in the cluster. Helm does not “adopt” existing resources; if a resource exists before it was introduced in the chart, it’s considered an error condition, hence the error message.
To fix this, remove the resource from the cluster using kubectl delete, then call helm upgrade again.
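For example, with the CronJob and namespace from the original report, that would look something like:
$ kubectl delete cronjob test-myCron --namespace myNamespace
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .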
@bacongobbler the problem is that we did not add any resources. It's like Helm lost track of a resource it created. Removing the resource it is complaining about does solve the issue for the next run, but after some time the problem starts again.
I’m seeing this issue too.
Step 1> Deploy the chart with new CronJob objects added (they don't yet exist in the cluster). The deploy is successful.
Step 2> The next deploy fails with the same error.
The only workaround we have found is to run helm del --purge on the release and re-deploy. We are using helm v2.14.3.
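With the release name from the original report, that workaround would look roughly like this (note that --purge also removes the release history):
$ helm del --purge test
$ helm install --name test --namespace myNamespace -f someValues.yaml .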
This continues to happen to us consistently. This workaround seems to work.
Subsequent upgrades work for me. I didn't see anything interesting in the log.
This seems to be caused by Tiller’s database getting corrupted. For example, on one of my clusters:
In other words, Tiller’s internals might take either v99, v206 or v207 as the DEPLOYED version, which leads to the error above.
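One way to check for this condition yourself, assuming the default ConfigMaps storage backend in kube-system and the release name test from the report above:
$ kubectl get configmap --namespace kube-system -l "OWNER=TILLER,NAME=test" -L STATUS,VERSION
Seeing more than one revision with STATUS=DEPLOYED would indicate the corruption described here.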
I ended up (brutally) deleting the extra CMs:
kubectl delete -n kube-system cm bar.v99
@bacongobbler I second @fernandoalex - I'm 100% sure that no new resources for the chart had been added manually, or by any other means, in the cluster before updating the chart.
And it's not related to a specific resource type - in one case it failed when we added an HPA, in another a config map.
Given how widespread and widely used Helm is as a deployment tool in the Kubernetes ecosystem, I don't understand how people are using it in production.
Is it supposed to work like that? Break sporadically and require manual intervention to delete custom resources?
I am experiencing this problem as well.
Edit: OK, so I have found a temporary solution (which may not work for everyone), but at least it's better than doing a helm delete --purge, which will delete all of your data. You can do:
to roll back to the previous revision of your helm deployment, and then do a:
This worked for me.
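Using the release from the original report, that sequence would presumably look like this (the revision number is only illustrative):
$ helm rollback test 1
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .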
We are experiencing the same issue, but with an HPA resource. EKS version: 1.11. Helm version: 2.14.2. While trying to update the deployment with a new version of the chart that adds an HPA resource, we are getting the same error.
Manually removing the HPA resource and re-running the deployment fixes the situation, but that is not acceptable for production.
Try adding the --cleanup-on-fail flag to your helm upgrade invocations. This flag will automatically remove any new resources the chart introduced during a failed upgrade. See #4871 for more info.
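Applied to the upgrade command from the original report, that suggestion would read something like:
$ helm upgrade --install --cleanup-on-fail --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .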
@bacongobbler I have not done a super deep inspection, but this is not the case, and somehow I'm not the only one who has this problem with prometheus-operator specifically. I'm deleting all the resources it asks about, and it just never stops complaining about existent/non-existent resources. I update this chart almost immediately, and the latest update has no huge changes in chart structure: https://github.com/helm/charts/tree/0459eb2bdade42a88a44ee115184ba584bd3131c (pinned to the problematic commit)
i will definitely try to inspect yaml closely.
But for now rollback has worked just fine.
@thedodd There should now be a fix in Tiller master:
helm init --canary-image
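If Tiller is already installed in the cluster, switching it to the canary build should look roughly like this (assuming the --upgrade flag of helm init in a Helm 2 client):
$ helm init --upgrade --canary-image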
Ah, I finally understood why this error repeats even when we delete the service account. What really happens when you apply your file with helm upgrade -f is that it assigns the first service account without any problem (that's why it takes some delay to respond), then it moves on to assign the next one, but surprise: k8s finds that the service account is already there. So to fix this, simply don't declare your service account more than once in your values.yaml file (this is only for my case).
I had this issue more than once, but this time I found something in common. It may have to do with changing a subchart's version and (this part I don't remember quite well) the upgrade failing for some reason. From that point on, you'll run into this problem again and again. Hope this helps!