helm: Helm upgrade fails the release after adding a new resource
Output of helm version:
Client: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Output of kubectl version:
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:26:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.6-gke.13", GitCommit:"fcbc1d20b6bca1936c0317743055ac75aef608ce", GitTreeState:"clean", BuildDate:"2019-06-19T20:50:07Z", GoVersion:"go1.11.5b4", Compiler:"gc", Platform:"linux/amd64"}
Scenario:
- Helm install a chart with resources X, Y, and Z (it doesn’t seem to matter which).
- Helm upgrade chart to add resource W (in this case a CronJob) - success
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .
Release "test" has been upgraded.
LAST DEPLOYED: Wed Jul 17 10:14:58 2019
NAMESPACE: myNamespace
STATUS: DEPLOYED
<snip>
==> v1beta1/CronJob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
test-myCron * */1 * * * False 0 <none> 6s
<snip>
- Helm upgrade again, with or without changing anything - failure
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .
UPGRADE FAILED
Error: kind CronJob with the name "test-myCron" already exists in the cluster and wasn't defined in the previous release. Before upgrading, please either delete the resource from the cluster or remove it from the chart
Error: UPGRADE FAILED: kind CronJob with the name "test-myCron" already exists in the cluster and wasn't defined in the previous release. Before upgrading, please either delete the resource from the cluster or remove it from the chart
I can delete the CronJob (or whatever resource was added) with kubectl and repeat steps 2 and 3 with the same results. Adding --debug doesn’t add anything of value.
It seems related to #1193 but, if I’m reading it correctly, in that issue the deployment would have failed in step 2.
About this issue
- State: closed
- Created 5 years ago
- Reactions: 39
- Comments: 32 (9 by maintainers)
Commits related to this issue
- fix(tiller): improve handling of corrupted storage For some reason, many users experince corrupted storage with the ConfigMaps storage backend. Specifically, several Releases are marked as DEPLOYED. ... — committed to cristiklein/helm by cristiklein 4 years ago
- fix(tiller): improve handling of corrupted storage For some reason, many users experince corrupted storage with the ConfigMaps storage backend. Specifically, several Releases are marked as DEPLOYED. ... — committed to helm/helm by cristiklein 4 years ago
That error indicates that you introduced a new resource into the chart, but that resource already existed in the cluster. Helm does not “adopt” existing resources; if a resource exists before it was introduced in the chart, it’s considered an error condition, hence the error message.
To fix this, remove the resource from the cluster using kubectl delete, then call helm upgrade again.
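For example, with the CronJob and namespace from the original report, that would look something like:
$ kubectl delete cronjob test-myCron --namespace myNamespace
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .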
@bacongobbler the problem is that we did not add any resources. It's like Helm lost track of a resource it created. Removing the resource it is complaining about does solve the issue for the next run, but after some time the problem starts again.
I’m seeing this issue too.
Step 1> Deploy the chart with new CronJob objects added (they don't yet exist in the cluster). The deploy is successful.
Step 2> The next deploy fails with the same error.
The only workaround we have found is to run helm del --purge on the release and re-deploy. We are using helm v2.14.3.
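With the release name from the original report, that workaround would look roughly like this (note that --purge also removes the release history):
$ helm del --purge test
$ helm install --name test --namespace myNamespace -f someValues.yaml .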
This continues to happen to us consistently. This workaround seems to work.
Subsequent upgrades work for me. I didn't see anything interesting in the log.
This seems to be caused by Tiller’s database getting corrupted. For example, on one of my clusters:
In other words, Tiller’s internals might take either v99, v206 or v207 as the DEPLOYED version, which leads to the error above.
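One way to check for this condition yourself, assuming the default ConfigMaps storage backend in kube-system and the release name test from the report above:
$ kubectl get configmap --namespace kube-system -l "OWNER=TILLER,NAME=test" -L STATUS,VERSION
Seeing more than one revision with STATUS=DEPLOYED would indicate the corruption described here.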
I ended up (brutally) deleting the extra CMs:
kubectl delete -n kube-system cm bar.v99
@bacongobbler I second @fernandoalex - I'm 100% sure that no new resources for the chart had been added manually, or by any other means, in the cluster before updating the chart.
And it's not related to a specific resource type - in one case it failed when we added an HPA, in another a config map.
Given how widespread and widely used Helm is as a deployment tool in the Kubernetes ecosystem, I don't understand how people are using it in production.
Is it supposed to work like that? Break sporadically and require manual intervention to delete custom resources?
I am experiencing this problem as well.
Edit: OK, so I have found a temporary solution (which may not work for everyone), but at least it's better than doing a helm delete --purge, which will delete all of your data. You can do:
to roll back to the previous revision of your helm deployment, and then do a:
This worked for me.
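Using the release from the original report, that sequence would presumably look like this (the revision number is only illustrative):
$ helm rollback test 1
$ helm upgrade --install --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .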
We are experiencing the same issue, but with an HPA resource. EKS version: 1.11. Helm version: 2.14.2. While trying to update the deployment with a new version of the chart that adds an HPA resource, we are getting the same error.
Manually removing the HPA resource and re-running the deployment fixes the situation, but that is not acceptable for production.
Try adding the --cleanup-on-fail flag to your helm upgrade invocations. This flag will automatically remove any new resources the chart introduced during a failed upgrade. See #4871 for more info.
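Applied to the upgrade command from the original report, that suggestion would read something like:
$ helm upgrade --install --cleanup-on-fail --wait --timeout 600 --namespace myNamespace -f someValues.yaml test .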
@bacongobbler I have not done a super deep inspection, but this is not the case, and somehow I'm not the only one who has this problem with prometheus-operator specifically. I'm deleting all the resources it asks about, and it just never stops complaining about existent/non-existent resources. I update this chart almost immediately, and the latest update has no huge changes in chart structure: https://github.com/helm/charts/tree/0459eb2bdade42a88a44ee115184ba584bd3131c (pinned to the problematic commit)
i will definitely try to inspect yaml closely.
But for now rollback has worked just fine.
@thedodd There should now be a fix in Tiller master:
helm init --canary-image
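If Tiller is already installed in the cluster, switching it to the canary build should look roughly like this (assuming the --upgrade flag of helm init in a Helm 2 client):
$ helm init --upgrade --canary-image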
Ah, I finally understood why this error repeats even when we delete the service account. What really happens when you apply your file with helm upgrade -f is that it assigns the first service account without any problem (that's why it takes some delay to respond), then it moves on to assign the next one, but surprise: k8s finds that the service account is already there. So to fix this, simply don't declare your service account more than once in your values.yaml file (this is only for my case).
I had this issue more than once, but this time I found something in common. It may have to do with changing a subchart's version and (this part I don't remember quite well) the upgrade failing for some reason. From that point on, you'll run into this problem again and again. Hope this helps!