helm: Error: UPGRADE FAILED: no resource with the name "anything_goes" found
Hi,
We are constantly hitting a problem that manifests itself with this error: `Error: UPGRADE FAILED: no resource with the name "site-ssl" found`, for example. It can appear after any innocuous update to a template.
Could you please help me understand the problem? What causes these messages to appear?
I’ve been unsuccessful in triaging the issue further; it can happen at any time, and I haven’t found a pattern yet.
Perhaps there is a problem with how we deploy?

```shell
helm upgrade hmmmmm /tmp/dapp-helm-chart-20171219-20899-1ppm74grrwrerq \
  --set global.namespace=hmm --set global.env=test \
  --set global.erlang_cookie=ODEzMTBlZjc5ZGY5NzQwYTM3ZDkwMzEx \
  --set global.tests=no --set global.selenium_tests=no \
  --namespace hmm --install --timeout 300
```
Helm: v2.7.2, v2.6.2; Kubernetes: v1.7.6, v1.8.5. I’ve tried every possible combination of these four versions, and none of them works.
About this issue
- State: closed
- Created 7 years ago
- Reactions: 106
- Comments: 72 (21 by maintainers)
Commits related to this issue
- Pin helm version to v2.6.2 https://github.com/kubernetes/helm/issues/3275 has caused serious amounts of downtime in the last week or so for various deployments at Berkeley, so let's recommend people ... — committed to yuvipanda/zero-to-jupyterhub-k8s by yuvipanda 6 years ago
- fix(tiller): Supersede multiple deployments There are cases when multiple revisions of a release has been marked with DEPLOYED status. This makes sure any previous deployment will be set to SUPERSEDE... — committed to joonix/helm by balboah 6 years ago
- fix(tiller): Supersede multiple deployments (#3539) * add test for rolling back from a FAILED deployment * Update naming of release variables Use same naming as the rest of the file. * Updat... — committed to helm/helm by balboah 6 years ago
Completely removing the release from Helm via `helm delete release` works, but it is not a viable solution. Why can’t Helm just overwrite whatever is currently installed? Aren’t we living in a declarative world with Kubernetes?
Try `helm delete RELEASE_NAME --purge` and install it again.

I can’t 100% confirm if this will always reproduce, but I’ve noticed this tends to happen in the following situation:
If I do a `helm rollback` to the last succeeded deploy and then try re-upgrading, it does seem to work.

+1, this is happening again with 2.9.1. Please reopen.
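For reference, a minimal sketch of that rollback-then-re-upgrade workaround, with a placeholder release name, chart path, and revision number:

```shell
# Find the last revision that is in DEPLOYED state.
helm history my-release

# Roll back to that revision (3 here is a placeholder).
helm rollback my-release 3

# Re-run the upgrade that previously failed.
helm upgrade my-release ./my-chart
```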
On Helm 2.9.1 I have encountered the same issue, while I can see this ConfigMap on my cluster.
A less destructive option is doing a `helm rollback` to the *current* version (i.e. by 0 steps). I cannot guarantee success, but for us so far it has always unwedged things successfully.

This has pretty much made helm unusable for regular production deploys for us 😦 We’re currently investigating doing things like passing `--dry-run` to helm and piping it to `kubectl apply`… Since this seems to affect only a subset of users, I am unsure what it is that we are doing wrong 😦

For us, a simple rollback to the current revision has always worked:
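A minimal hedged sketch of that “rollback by 0 steps” trick, with a placeholder release name and chart path:

```shell
# Note the revision that is currently marked DEPLOYED (say it is 7).
helm history my-release

# Roll back to that very same revision; this records a fresh revision with the
# same content, which is often enough to unwedge the release history.
helm rollback my-release 7

# The next upgrade then usually goes through.
helm upgrade my-release ./my-chart
```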
I was having this exact error. It looks like the issue is related to templates with multiple API objects similar to what @amritb saw. In my case, I had a template that had multiple API objects that could be toggled on and off similar to:
Breaking that into its own template file and cleaning up the orphaned objects that helm created and forgot about resolved the issue for me. It sounds like there is a bug in how helm gets previous config if the number of objects per template changes between releases.
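A hedged sketch of how one might spot such orphaned objects, comparing what Tiller has recorded against what actually exists in the cluster (release, namespace, and resource names are placeholders):

```shell
# What Tiller thinks the last DEPLOYED revision contains.
helm get manifest my-release | grep -E '^(kind|  name):'

# Does the object the error complains about actually exist in the cluster?
kubectl -n my-namespace get service site-ssl

# If the object exists in the cluster but is absent from the recorded manifest,
# it is one of the orphaned objects described above; deleting it lets the next
# upgrade re-create and properly track it.
```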
Same problem. Everything was just fine yesterday and I did multiple upgrades. Today I just added a new yaml with a `service` and a `deployment` block separated with `---`, and the upgrade failed. The interesting thing is, helm created the `service` and then complained about it (and didn’t do the deployment). I commented out the `service` and just ran the upgrade with the `deployment` block; it worked. However, helm didn’t delete the service, which it should have, as it was removed from the yaml file.

Update: I manually deleted the `service`, uncommented it in the yaml and ran the upgrade; this time it worked like a charm!

I fixed the issue by
The same problem using helm 2.8.0. Kubernetes versions: client `v1.8.6` and server `v1.8.5-gke.0`. But the configmap exists in `kubectl get configmap`. If I manually delete the configmap, it works, but next time it fails again. Here is the configmap:
I ran into the same issue. @fbcbarbosa it looks like it was merged 2 weeks ago. It should hopefully be a part of the next release 2.14.0.
Note that this has not been fixed for existing charts in this state; you’ll still need to remove the old releases that are in state DEPLOYED for things to work again. @balboah just prevented the case where you can get into the “multiple releases marked as DEPLOYED” state. 😃
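If you want to check whether a release is in that state, here is a hedged sketch, assuming Tiller runs in `kube-system` with its default ConfigMap storage backend (the release name is a placeholder):

```shell
# Tiller labels its release records; more than one result here means the
# release history has multiple revisions marked as DEPLOYED.
kubectl -n kube-system get configmap -l OWNER=TILLER,NAME=my-release,STATUS=DEPLOYED
```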
After tailing the tiller logs, I found that tiller was trying to update an old release at the same time:
Deleting the old configmap for s2osf.v10 and then upgrading worked.
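A hedged sketch of that sequence, assuming Tiller runs in `kube-system` with its default setup (the configmap name is the one from the comment above):

```shell
# Tail the Tiller logs to see which old revision it keeps trying to update.
kubectl -n kube-system logs deploy/tiller-deploy --tail=100

# Delete the stale release record, then retry the upgrade.
kubectl -n kube-system delete configmap s2osf.v10
```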
This is really easy to reproduce; it happens if there is an error in the manifest.
For example, we have resource1 and resource2, and resource2 depends on the first. When we upgrade the release, resource1 is created (e.g. a PV and PVC), but resource2 fails. After this, only deleting resource1 helps, as helm always reports a problem on upgrade (`PersistentVolume with name … not found`).
Adding another datapoint: I appear to be having the exact same issue as @awwithro. We’re using a jinja loop to create multiple cronjobs via a template, and when a new upgrade caused this loop to fill in an additional cronjob, we ran into the bug. Seemed to trigger #2941 as well (or possibly one bug causes the other), and deleting the zombie configmaps fixes it.
We ran into this issue in PROD, when a requirement (subchart) of our umbrella helm chart added a configmap based on a conditional. For us the workaround was to
I am hitting this issue too. I tried adding a subchart with a deployment in my chart; it succeeded when upgraded with `helm upgrade chart chart-1.0.1.tgz` just the first time. After that, when I tried `helm upgrade chart chart-1.0.1.tgz` again, it failed with the error `Error: UPGRADE FAILED: no Deployment with name "subchart-deployment" found`.

```
Client: &version.Version{SemVer:"v2.12.0", GitCommit:"d325d2a9c179b33af1a024cdb5a4472b6288016a", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.0", GitCommit:"d325d2a9c179b33af1a024cdb5a4472b6288016a", GitTreeState:"clean"}
```

The helm tiller log just shows the same error. Is anyone else experiencing this too?
It seems very easy to reproduce it manually, without intentionally trying to upgrade a chart with harmful changes (for example, modifying an immutable Job object):
The steps are different but the root cause still seems the same. Correct me if I’m wrong on the assumption, but it seems to me that the last DEPLOYED revision of the release does not have information about the particular resource, either because it was added “outside” Helm (manually, for example) or because the latest upgrade failed at some step (say, on upgrading an immutable Job) while still deploying other objects and recording them in the FAILED revision (but without any trace of them in the DEPLOYED revision, since otherwise it would mean rewriting history). On the next run, Tiller’s kube client sees the resources on the cluster, meaning they should already be deployed and thus recorded; it checks the latest DEPLOYED revision (the FAILED revision does not seem to be consulted at all), does not see them listed there, and so reports the error.
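A hedged sketch of the sequence just described (chart path and resource names are made up for illustration):

```shell
# Revision 1 installs cleanly and is marked DEPLOYED.
helm install --name demo ./my-chart

# Now add a new ConfigMap to the chart together with a change that cannot be
# applied (e.g. modifying an immutable field of an existing Job), then upgrade.
# Revision 2 is marked FAILED, but the new ConfigMap has already been created.
helm upgrade demo ./my-chart

# The next upgrade compares against the last DEPLOYED revision (revision 1),
# finds the ConfigMap in the cluster but not in that record, and bails out:
#   Error: UPGRADE FAILED: no ConfigMap with the name "demo-config" found
helm upgrade demo ./my-chart
```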
Hm, I still get this issue on Helm 2.8.2 (not the latest, but I tried with 2.9.0 and it gives the same error.) Usually deleting the offending resource manually can fix it, though often it cascades into multiple resources that all need deletion before it successfully upgrades.
I have a bit of a large helm chart with nested dependencies; might that be it?
I also needed to delete affected resources. Not good for a production environment =_(
I’ve been using helm for one week and already faced everything outlined here https://medium.com/@7mind_dev/the-problems-with-helm-72a48c50cb45
A lot needs fixing here.
I experience this issue if I have multiple resources with hooks in one file.
If your last resort is deleting the old release, there might be a less destructive workaround; see my comment at https://github.com/kubernetes/helm/issues/3513#issuecomment-366918019.
Basically: find that old revision in the logs and manually edit the configmap where tiller stores the deployed status. There should not be two revisions with DEPLOYED status, afaik.
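A hedged sketch of that workaround, assuming the default ConfigMap storage backend in `kube-system` (release name and revision number are placeholders):

```shell
# List all revisions Tiller has recorded for the release, with their status labels.
kubectl -n kube-system get configmap -l OWNER=TILLER,NAME=my-release --show-labels

# Edit the extra revision that is wrongly marked DEPLOYED and change its status
# (at minimum the STATUS label) to SUPERSEDED, leaving a single DEPLOYED revision.
kubectl -n kube-system edit configmap my-release.v7
```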
We had the same issue (the resource that got us was Secrets). Removing the new secrets and re-deploying fixed it.
Do note that because of the failures, we now have 11 different releases when we do `helm list`: 10 FAILED ones and 1 DEPLOYED. That’s not expected, right? Same issue as here, it seems: https://github.com/kubernetes/helm/issues/2941

See https://github.com/helm/helm/issues/1193#issuecomment-419555433.
Oh, and something I failed to mention: because the cluster’s in an inconsistent state, this can easily be worked around by manually intervening and deleting the resource that the error reports as “not found”. Following the example I demonstrated in https://github.com/kubernetes/helm/pull/4223#issuecomment-397413568:
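In short, a minimal hedged sketch of the same idea, with placeholder names:

```shell
# Delete the object the error reports as "not found" so Helm can re-create
# and track it on the next upgrade.
kubectl -n my-namespace delete secret foobar

# Re-run the upgrade.
helm upgrade my-release ./my-chart
```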
Thank you @ramyala for testing the fix! I’ll mention it in the dev call tomorrow and see if any of the other core maintainers see edge cases that may come up with the patch. If not, let’s merge.
We believe this has been fixed (moving forward) via #3539. Please re-open if we happen to be wrong. 😃
Found a new solution to this problem.
`kubectl -n kube-system edit cm name_of_your_release.v2`, where `v2` is the latest revision number marked as FAILED in `helm list`. You might also want to edit one of the DEPLOYED releases and change its status to SUPERSEDED, so that we won’t have two deployed releases at the same time.

Having the same issue as @binoculars: it is causing weird problems with `UPGRADE FAILED: no Secret with the name "foobar" found`. I even tried deleting this secret, which then caused errors on some configmap instead, and on the 3rd run it once again complained about the previous secret. This might have been triggered by upgrading from helm 2.7.x to 2.8.1.
Like @amritb, after I manually deleted the object that helm initially failed on, the next upgrade succeeded. I did not experience #2941.
I’ve faced the issue as well: when we upgrade a release and there are multiple revisions of it with DEPLOYED status, we have to fix it by deleting the corresponding configmaps.

I had this problem too; it was due to a PersistentVolume that I’d created. To resolve it, I deleted the PV and PVC, ran `helm upgrade XXX XXX`, and it worked fine. Probably still something that should be investigated, as the PV did exist.
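A hedged sketch of that PV/PVC cleanup, with placeholder names:

```shell
# Remove the claim and the volume that Helm lost track of ...
kubectl -n my-namespace delete pvc my-claim
kubectl delete pv my-volume

# ... then retry the upgrade.
helm upgrade my-release ./my-chart
```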