longhorn: [BUG] Upgrade failed through Rancher (2.5.5) UI from 1.2.0 -> 1.2.2
Describe the bug
The upgrade of Longhorn from version 1.2.0 to 1.2.2 failed with the following error message:
Failed to install app longhorn-system. Error: UPGRADE FAILED: kind CustomResourceDefinition with the name "backups.longhorn.io" already exists in the cluster and wasn't defined in the previous release. Before upgrading, please either delete the resource from the cluster or remove it from the chart
To Reproduce
Steps to reproduce the behavior:
- Go to Apps
- Click on Longhorn Upgrade
- Review settings and start
- See error message above in the longhorn chart overview
Expected behavior
Expected that all workloads from longhorn will be updated to version 1.2.2
Environment
- Longhorn version: 1.2.0
- Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Rancher Catalog App
- Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version: RKE
- Number of management node in the cluster: 7
- Number of worker node in the cluster: 2
- Node config
- OS type and version: Ubuntu Server 20.04
- CPU per node: 4
- Memory per node: 64
- Disk type(e.g. SSD/NVMe): SSD/HDD/NVMe
- Network bandwidth between the nodes: 10G
- Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal): Baremetal/VM
- Number of Longhorn volumes in the cluster: 26
About this issue
- State: closed
- Created 3 years ago
- Reactions: 5
- Comments: 18 (6 by maintainers)
This allows you to upgrade to Helm 3. It's not actually tied to Rancher 2.5 or 2.6; I ended up doing this because Rancher 2.6 removed Helm 2, but ultimately it's a similar problem. I assumed this is your problem because it's the same error I got when I tried to switch my Helm 2 version of Longhorn to Helm 3.
My proposed fix does not delete the CRDs, and the commands do not delete any resources. They modify the CRDs so you can upgrade in place. Following this, I was able to continue using Longhorn without deleting any volumes, and after upgrading PVCs and restarting pods, it worked perfectly.
@ajacques Thank you so much!! With option 3 I was able to upgrade longhorn and after that successfully update rancher to the latest version and I can still manage everything correctly 🥳
If you run
Does it show a release named longhorn-system and longhorn-crds?
There are three options:
- longhorn-system to longhorn using something like this
- longhorn-system release by changing the annotate -n longhorn-system % meta.helm.sh/release-name=longhorn parts in my script to annotate -n longhorn-system % meta.helm.sh/release-name=longhorn-system

On further reading of your GitHub issue, it seems like your issue was that the CRDs were not created using Helm, but the Longhorn application itself was? If so, then option 3 should be your best bet and can be used to undo the part that is currently broken and should permit upgrades to continue:
Changing the annotation doesn’t do much other than tell Helm that yes it’s okay to touch those resources.
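To make the annotation change concrete, here is a minimal sketch of the kind of commands involved. As far as I know, Helm 3.2 and later adopts an existing resource only when it carries the meta.helm.sh/release-name and meta.helm.sh/release-namespace annotations plus the app.kubernetes.io/managed-by=Helm label. The helper name print_adopt_cmds and the default release name and namespace are assumptions for illustration, not taken from the original script. The function only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: print (not run) the kubectl commands that mark a resource as
# owned by a Helm release, so Helm 3 will adopt it instead of refusing.
# RELEASE_NAME and RELEASE_NAMESPACE are assumptions; set them to match
# the release that should own the resources.
RELEASE_NAME="${RELEASE_NAME:-longhorn}"
RELEASE_NAMESPACE="${RELEASE_NAMESPACE:-longhorn-system}"

# print_adopt_cmds is a hypothetical helper. $1 is a resource reference such as
# customresourcedefinition.apiextensions.k8s.io/backups.longhorn.io
print_adopt_cmds() {
  printf 'kubectl annotate --overwrite %s meta.helm.sh/release-name=%s meta.helm.sh/release-namespace=%s\n' \
    "$1" "$RELEASE_NAME" "$RELEASE_NAMESPACE"
  printf 'kubectl label --overwrite %s app.kubernetes.io/managed-by=Helm\n' "$1"
}
```

One way to use it: run kubectl get crd -o name, keep the lines containing longhorn.io, and pass each one to print_adopt_cmds. Once the printed commands look right, pipe them to sh. Nothing here deletes a resource; only metadata changes.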
If you delete the CRDs, doesn’t that mean you have to delete the K8s resources matching the CRDs too which would mean you lose PVCs?
Instead of deleting CRDs, I figured out a way to upgrade to Helm 3 in place and wrote a blog post on how I upgraded it here without deleting any data or CRDs. Ultimately, I used kubectl to change the labels and annotations, then used Helm to deploy over top of the existing resources, and it worked. I think this should be the same fix for your case.
Yes, you’re going to deploy it from the Marketplace since that’ll ensure it gets installed with Helm v3.
This can be fixed by running (and I’ve updated the doc):
However, this error is interesting. AFAIK, the release-name was supposed to be ‘longhorn’, not ‘longhorn-system’, as per 1, 2, and 3. It’s easy to fix, but maybe somebody else knows if this is correct or not?
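To see which release name a resource currently claims before changing anything, a small filter over the YAML output may help. This is only a sketch: show_helm_owner is a hypothetical helper name, and it simply greps a manifest on stdin for the Helm ownership fields.

```shell
#!/bin/sh
# Hypothetical helper: reduce a manifest read from stdin to the lines that
# carry Helm ownership metadata (release name, release namespace, managed-by).
show_helm_owner() {
  grep -E 'meta\.helm\.sh/release-(name|namespace)|app\.kubernetes\.io/managed-by'
}
```

For example: kubectl get crd backups.longhorn.io -o yaml | show_helm_owner. If nothing is printed, the CRD has no Helm ownership metadata at all, which matches the "wasn't defined in the previous release" error.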