helm: Error: UPGRADE FAILED: no resource with the name "anything_goes" found

Hi,

We are constantly hitting a problem that manifests itself as, for example, Error: UPGRADE FAILED: no resource with the name "site-ssl" found. The errors can appear after any innocuous update to a template. Could you please help me understand the problem? What causes these messages to appear?

I’ve been unsuccessful in triaging the issue further; it can happen at any time, and I haven’t found a pattern yet.

Perhaps there is a problem with how we deploy?

helm upgrade hmmmmm /tmp/dapp-helm-chart-20171219-20899-1ppm74grrwrerq --set global.namespace=hmm --set global.env=test --set global.erlang_cookie=ODEzMTBlZjc5ZGY5NzQwYTM3ZDkwMzEx --set global.tests=no --set global.selenium_tests=no --namespace hmm --install --timeout 300

Helm: v2.7.2, v2.6.2; Kubernetes: v1.7.6, v1.8.5. I’ve tried every possible combination of these four versions, and none of them works.

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 106
  • Comments: 72 (21 by maintainers)

Most upvoted comments

Completely removing the release from Helm via helm delete release works, but it is not a viable solution.

Why can’t Helm just overwrite whatever is currently installed? Aren’t we living in a declarative world with Kubernetes?

Try helm delete RELEASE_NAME --purge and install it again.

I can’t 100% confirm if this will always reproduce, but I’ve noticed this tends to happen in the following situation:

  1. I upgrade a Helm chart, adding a new resource
  2. That upgrade fails, but the resource was created as part of the failed upgrade
  3. All subsequent upgrades fail

If I do a helm rollback to the last successful deploy and then try re-upgrading, it does seem to work.

+1, this is happening again with 2.9.1. Please reopen.

On Helm 2.9.1 I have encountered the same issue:

helm upgrade --install --namespace my-namespace my-stack stack
Error: UPGRADE FAILED: no ConfigMap with the name "my-stack-my-app" found

Yet I can see this ConfigMap on my cluster.

Having the same issue, and the only workaround seems to be helm delete --purge release and installing again!

A less destructive option is doing a helm rollback to the current version (i.e. by 0 steps). I can’t guarantee success, but for us so far it has always unwedged things.

This has pretty much made helm unusable for regular production deploys for us 😦 We’re currently investigating workarounds like passing --dry-run to helm and piping the output to kubectl apply… Since this seems to affect only a subset of users, I’m unsure what it is that we are doing wrong 😦
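A rough sketch of that workaround, with placeholder chart, release and namespace names; in Helm 2, helm template renders the chart to plain YAML, which is easier to feed to kubectl apply than the --dry-run --debug output:

# Render the chart locally, bypassing Tiller's release tracking entirely
helm template ./my-chart --name my-release --namespace my-namespace -f values.yaml > rendered.yaml
# Apply declaratively; note that this loses Helm's release history and rollback
kubectl apply -n my-namespace -f rendered.yaml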

For us, a simple rollback to the current revision has always worked:

helm ls
helm rollback <NAME> <current REVISION>

I was having this exact error. It looks like the issue is related to templates with multiple API objects similar to what @amritb saw. In my case, I had a template that had multiple API objects that could be toggled on and off similar to:

{{ if .Values.enabled }}
---
...

Breaking that into its own template file and cleaning up the orphaned objects that helm created and forgot about resolved the issue for me. It sounds like there is a bug in how helm gets the previous config when the number of objects per template changes between releases.

Same problem. Everything was fine yesterday and I did multiple upgrades. Today I added a new YAML file with a Service and a Deployment block separated by --- and the upgrade failed.

The interesting thing is, helm created the service and then complained about it (and didn’t create the deployment). I commented out the service and ran the upgrade with just the deployment block - it worked. However, helm didn’t delete the service - which it should have, since it was removed from the YAML file.

Update: I manually deleted the service, uncommented it in the YAML and ran the upgrade - this time it worked like a charm!

I fixed the issue by

  1. Delete the resources that “helm upgrade” complains about (it says they are not found, but they do in fact exist). Don’t delete the whole release, otherwise you will be completely screwed if this is production. See the sketch after this list.
  2. Re-run helm upgrade. This time “Happy Helming” should show up. 😃
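A minimal sketch of those two steps, using placeholder resource, release and namespace names rather than anything from this thread:

# 1. Delete only the resource named in the error, not the release itself
#    (e.g. Error: UPGRADE FAILED: no ConfigMap with the name "my-release-config" found)
kubectl delete configmap my-release-config --namespace my-namespace
# 2. Re-run the upgrade so Tiller can create and record the resource itself
helm upgrade my-release ./my-chart --namespace my-namespace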

The same problem using helm 2.8.0. Kubernetes versions client=v1.8.6 and server=v1.8.5-gke.0.

$ helm upgrade bunny ./app --debug
[debug] Created tunnel using local port: '54274'

[debug] SERVER: "127.0.0.1:54274"

Error: UPGRADE FAILED: no ConfigMap with the name "bunny-proxy-config" found

But the configmap shows up in $ kubectl get configmap. If I manually delete the configmap, the upgrade works, but the next time it fails again.

Here is the configmap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ template "proxy.fullname" . }}-config
  # namespace: {{ .Release.Namespace }} # I've tried adding and removing it
  labels: # labels are the same as labels from $ kubectl describe configmap bunny-proxy-config
    app: {{ template "proxy.name" . }}
    chart: {{ template "proxy.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
data:
  asd: qwe

I ran into the same issue. @fbcbarbosa it looks like it was merged 2 weeks ago. It should hopefully be a part of the next release 2.14.0.

Note that this has not been fixed for existing charts in this state; you’ll still need to remove the old releases that are in state DEPLOYED for things to work again. @balboah just prevented the case where you can get into the “multiple releases marked as DEPLOYED” state. 😃

After tailing the tiller logs, I found that tiller was trying to update an old release at the same time:

[storage] 2018/02/14 18:25:40 updating release "s2osf.v10"
[storage] 2018/02/14 18:25:40 updating release "s2osf.v44"

Deleting the old configmap for s2osf.v10 and then upgrading worked.

Client: &version.Version{SemVer:"v2.7.2", GitCommit:"8478fb4fc723885b155c924d1c8c410b7a9444e6", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.7.2", GitCommit:"8478fb4fc723885b155c924d1c8c410b7a9444e6", GitTreeState:"clean"}

This is really easy to reproduce; it happens if there is an error in a manifest.

Say we have resource1 and resource2, where resource2 depends on the first. When we upgrade the release, resource1 is created (e.g. a PV and PVC), but resource2 fails. After this, only deleting resource1 helps, as helm always reports a problem on upgrade (PersistentVolume with name … not found).

Adding another datapoint: I appear to be having the exact same issue as @awwithro. We’re using a jinja loop to create multiple cronjobs via a template, and when a new upgrade caused this loop to fill in an additional cronjob, we ran into the bug. Seemed to trigger #2941 as well (or possibly one bug causes the other), and deleting the zombie configmaps fixes it.

We ran into this issue in PROD, when a requirement of our umbrella helm chart added a configmap based on a conditional. For us the workaround was to:

helm rollback <some revision that's acceptable>
helm upgrade <desired version>

I am hitting this issue too. I tried adding a subchart with a deployment to my chart; the first helm upgrade chart chart-1.0.1.tgz succeeded, but when I ran helm upgrade chart chart-1.0.1.tgz again it failed with Error: UPGRADE FAILED: no Deployment with name "subchart-deployment" found

Client: &version.Version{SemVer:"v2.12.0", GitCommit:"d325d2a9c179b33af1a024cdb5a4472b6288016a", GitTreeState:"clean"} Server: &version.Version{SemVer:"v2.12.0", GitCommit:"d325d2a9c179b33af1a024cdb5a4472b6288016a", GitTreeState:"clean"}

The tiller logs just show the same error. Is anyone else experiencing this?

It seems very easy to reproduce this manually, without even trying to upgrade a chart with harmful changes (for example, modifying immutable Job objects); see the sketch after the steps:

  1. Take some chart and deploy it (but omit one resource, let’s say a Service)
  2. Add the omitted resource manually (for example, with “kubectl create”), but with the name corresponding to the release
  3. Add the omitted resource back to the chart and then try to upgrade; helm should report “UPGRADE FAILED: no <kind> with the name <name> found”
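A sketch of those steps with placeholder chart, release and resource names; it assumes the omitted resource is available as plain YAML (or that you substitute the values by hand), much like the maintainer’s transcript further down:

# 1. Install the chart with one resource (say a Service) omitted from the templates
helm install ./my-chart --name my-release --namespace my-namespace
# 2. Create that resource by hand, using the name the release would give it
#    (service.yaml here is the plain manifest the chart would render)
kubectl create -f service.yaml -n my-namespace
# 3. Add the Service template back to the chart and upgrade; Tiller has no record
#    of the object in the last DEPLOYED revision and fails
helm upgrade my-release ./my-chart --namespace my-namespace
# Error: UPGRADE FAILED: no Service with the name "my-release-svc" found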

The steps are different, but the root cause seems to be the same. Correct me if I’m wrong, but it looks like the last DEPLOYED revision has no record of the particular resource, either because it was added outside Helm (manually, for example) or because the latest upgrade failed part-way through (say, on an immutable Job) after already deploying other objects, which then only get recorded in the FAILED revision (the DEPLOYED revision is never rewritten, since that would mean changing history). On the next run, Tiller’s kube client sees the resources in the cluster, meaning they should already be deployed and therefore recorded; it checks the latest DEPLOYED revision (the FAILED revision apparently isn’t consulted at all), doesn’t find them listed there, and reports the error.

Hm, I still get this issue on Helm 2.8.2 (not the latest, but I tried 2.9.0 and it gives the same error). Usually deleting the offending resource manually can fix it, though it often cascades into multiple resources that all need deleting before the upgrade succeeds.

I have a bit of a large helm chart with nested dependencies; might that be it?

I also needed to delete affected resources. Not good for a production environment =_(

I’ve been using helm for one week and already faced everything outlined here https://medium.com/@7mind_dev/the-problems-with-helm-72a48c50cb45

A lot needs fixing here.

On Fri, Mar 15, 2019, 10:49 PM Tom Davis notifications@github.com wrote:

Same, @thedumbtechguy. I run into this issue routinely. It’s especially fun when Helm decides you need to delete all your secrets, configmaps, roles, etc. Upgrading becomes a game of whack-a-mole with an ever-increasing list of arguments to kubectl delete. I should have thrown in the towel on this sisyphean task months ago, but it’s too late for that now. Sure hope this and the dozens of similar issues can be fixed!


I experience this issue if I have multiple resources with hooks in one file.

If your last resort is deleting the old release, there might be a less destructive workaround; see my comment https://github.com/kubernetes/helm/issues/3513#issuecomment-366918019

Basically, find that old revision in the logs and manually edit the configmap where tiller stores the deployed status. There should not be two revisions with DEPLOYED status, afaik.

We had the same issue (the resource that got us was Secrets). Removing the new secrets and re-deploying fixed it.

Do note that because of the failures, we now have 11 different releases when we do helm list, 10 FAILED ones and 1 DEPLOYED. That’s not expected, right? Same issue as here it seems: https://github.com/kubernetes/helm/issues/2941
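A sketch (with a placeholder release name) of the commands that expose this state:

helm history my-release   # one row per revision, with its STATUS (DEPLOYED/FAILED/SUPERSEDED)
helm list --all           # includes FAILED releases, not only DEPLOYED ones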

Oh, and something I failed to mention: because the cluster’s in an inconsistent state, this can easily be worked around by manually intervening and deleting the resource that the error reports as “not found”. Following the example I demonstrated in https://github.com/kubernetes/helm/pull/4223#issuecomment-397413568:

><> helm fetch --untar https://github.com/kubernetes/helm/files/2103643/foo-0.1.0.tar.gz
><> helm install ./foo/
...
><> vim foo/templates/service.yaml
><> kubectl create -f foo/templates/service.yaml
service "foo-bar" created
><> helm upgrade $(helm last) ./foo/
Error: UPGRADE FAILED: no Service with the name "foo-bar" found
><> kubectl delete svc foo-bar
service "foo-bar" deleted
><> helm upgrade $(helm last) ./foo/
Release "riotous-echidna" has been upgraded. Happy Helming!
...
><> kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
foo-bar      ClusterIP   10.104.143.52   <none>        80/TCP    3s
kubernetes   ClusterIP   10.96.0.1       <none>        443/TCP   1h

Thank you @ramyala for testing the fix! I’ll mention it in the dev call tomorrow and see if any of the other core maintainers spot edge cases that may come up with the patch. If not, let’s merge.

We believe this has been fixed (moving forward) via #3539. Please re-open if we happen to be wrong. 😃

Found a new solution to this problem.

kubectl -n kube-system edit cm name_of_your_release.v2, where v2 is the latest revision number marked as FAILED in helm list. You might also want to edit one of the DEPLOYED revisions and change its status to SUPERSEDED, so that you don’t end up with two revisions marked DEPLOYED at the same time.
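A sketch of that configmap surgery, assuming Helm 2’s default ConfigMap storage backend; "my-release" is a placeholder, and as far as I know Tiller filters releases by the STATUS label on each <release>.v<revision> ConfigMap:

# List all stored revisions of the release with their status labels
kubectl -n kube-system get configmap -l OWNER=TILLER,NAME=my-release -L STATUS,VERSION
# Relabel the stale DEPLOYED revision as SUPERSEDED so only one revision stays DEPLOYED
kubectl -n kube-system label configmap my-release.v2 STATUS=SUPERSEDED --overwrite
# Or open the object and change the STATUS label by hand, as in the comment above
kubectl -n kube-system edit configmap my-release.v2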

Having the same issue as @binoculars:

[storage] 2018/02/15 10:20:50 updating release "control.v136"
[storage] 2018/02/15 10:20:50 updating release "control.v226"

This causes weird problems with UPGRADE FAILED: no Secret with the name "foobar" found. I even tried deleting this secret, which then caused errors on some configmap instead, and on the third run it once again complained about the previous secret.

This might have been triggered after upgrading from helm 2.7.x to 2.8.1.


Client: &version.Version{SemVer:"v2.8.1", GitCommit:"6af75a8fd72e2aa18a2b278cfe5c7a1c5feca7f2", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.8.1", GitCommit:"6af75a8fd72e2aa18a2b278cfe5c7a1c5feca7f2", GitTreeState:"clean"}

Like @amritb, after I manually deleted the object that helm initially failed on, the next upgrade succeeded. I did not experience #2941.

I’ve faced the issue as well when upgrading a release that has multiple revisions in DEPLOYED status. I had to fix it by deleting the corresponding configmaps.

I had this problem - it was due to a PersistentVolume that I’d created. To resolve it, I deleted the PV and PVC, ran helm upgrade XXX XXX, and it worked fine. Probably still something that should be investigated, as the PV did exist.