rancher: [BUG] Rancher upgrade fail: Failed to find system chart "fleet"

Rancher Server Setup

  • Rancher version: v2.6.3
  • Installation option (Docker install/Helm Chart): Helm Chart, with useBundledSystemChart: true
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE2 v1.22.7-rke2r1
  • Proxy/Cert Details: Certificates from files

Information about the Cluster

  • Kubernetes version: v1.22.7-rke2r1
  • Cluster Type (Local/Downstream): Local
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider):

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin
    • If custom, define the set of permissions:

Describe the bug

When I tried to upgrade Rancher to version v2.6-99974f06038921d357c443cb4f17459d1377a51d-head, Rancher failed to upgrade “fleet” due to the following error:

[ERROR] available chart version (100.0.2+up0.3.8) for fleet is less than the min version (100.0.3+up0.3.9-rc1) 
[ERROR] Failed to find system chart fleet will try again in 5 seconds: no chart name found
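
To compare the two versions in the first error line, they can be extracted with a small shell helper. This is only a sketch: the function name parse_fleet_versions is hypothetical, and it assumes the log line keeps exactly the wording shown above.

```shell
# Hypothetical helper: read Rancher log lines on stdin and print
# "<available version> <min version>" for each matching fleet error line.
parse_fleet_versions() {
  sed -n 's/.*available chart version (\([^)]*\)) for fleet is less than the min version (\([^)]*\)).*/\1 \2/p'
}

# Sample line copied from the report above.
line='[ERROR] available chart version (100.0.2+up0.3.8) for fleet is less than the min version (100.0.3+up0.3.9-rc1)'
printf '%s\n' "$line" | parse_fleet_versions
# prints: 100.0.2+up0.3.8 100.0.3+up0.3.9-rc1
```
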

This happens because the rancher-charts resource (of type clusterrepo.catalog.cattle.io) is not updated even though its gitBranch has changed. Here spec.gitBranch is dev-v2.6, but status.commit still shows the old commit (5d21c199 is very old).
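
One way to check the rancher-charts resource for this mismatch is to print the two fields side by side. The sketch below assumes kubectl is pointed at the local cluster; the function name show_rancher_charts_sync is hypothetical.

```shell
# Hypothetical helper: print the configured branch (spec.gitBranch) and the
# last synced commit (status.commit) of the rancher-charts ClusterRepo,
# one per line.
show_rancher_charts_sync() {
  kubectl get clusterrepos.catalog.cattle.io rancher-charts \
    -o jsonpath='{.spec.gitBranch}{"\n"}{.status.commit}{"\n"}'
}
```

Running show_rancher_charts_sync before and after the upgrade shows whether status.commit moves once spec.gitBranch changes.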

To Reproduce

  1. Install Rancher v2.6.3 via Helm Chart on RKE2
  2. Retrieve chart for v2.6-99974f06038921d357c443cb4f17459d1377a51d-head: rancher-0.0.0-1647299882.commit-99974f060.release-v2-6.tgz
  3. Run helm get values -n cattle-system rancher -oyaml > values.yaml to dump the current values, then add rancherImageTag: v2.6-99974f06038921d357c443cb4f17459d1377a51d-head to values.yaml
  4. Upgrade Rancher with the values.yaml using helm: helm upgrade rancher ./*.tgz --namespace cattle-system -f values.yaml
  5. Check the Rancher pod logs; the error messages above are shown.

Result

fleet is not upgraded to the desired version v0.3.9-rc5:

kubectl get deploy -n cattle-fleet-system fleet-controller -o jsonpath='{.spec.template.spec.containers[0].image}'
rancher/fleet:v0.3.8

Expected Result

fleet should be upgraded to the desired version v0.3.9-rc5

Additional context

This issue seems to occur only when useBundledSystemChart is set to true during installation. When this value is absent, fleet is successfully upgraded to v0.3.9-rc5.

SURE-4063

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 17 (12 by maintainers)

Most upvoted comments

Based on the scenario in SURE-4063, which upgrades Rancher v2.5.x to Rancher v2.6.x, I also upgraded from Rancher v2.5.12 (fleet v0.3.5) to Rancher v2.6.6 (fleet v0.3.9). The scenario is similar to the one described here, but with an air-gapped RKE2 setup.

I checked the deployment after the upgrade and fleet was correctly upgraded to the expected v0.3.9 version.

kubectl get deploy -n cattle-fleet-system fleet-controller -o jsonpath='{.spec.template.spec.containers[0].image}'
<PRIVATE_REGISTRY>/rancher/fleet:v0.3.9

@HarrisonWAffel I repeated the scenario I described, this time on RKE2. I also skipped step 2 (load images before the upgrade).

If I don’t load the images into the private registry, I end up with ErrImagePull and ImagePullBackOff errors because the images don’t exist there. They must be in the registry before the upgrade.

I loaded the images to clear the image pull errors.

I did see the error message in the logs:

2022/07/12 02:48:18 [ERROR] Failed to find system chart fleet will try again in 5 seconds: no chart name found
2022/07/12 02:48:23 [ERROR] Failed to find system chart fleet will try again in 5 seconds: no chart name found
2022/07/12 02:48:28 [ERROR] Failed to find system chart fleet will try again in 5 seconds: no chart name found
2022/07/12 02:48:33 [ERROR] Failed to find system chart fleet will try again in 5 seconds: no chart name found

But it recovered after a while, and fleet was upgraded successfully from v0.3.8 to v0.3.9 (Rancher v2.6.3 to v2.6.6).
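
Since the error can clear on its own after some retries, it may help to poll the fleet-controller image rather than rely on the first log lines. The following is a minimal sketch that reuses the kubectl command shown earlier; the function name wait_for_fleet_image and its default attempt count and delay are assumptions, not anything Rancher provides.

```shell
# Hypothetical helper: poll the fleet-controller deployment until its image
# ends with the expected tag, or give up after a number of attempts.
# Usage: wait_for_fleet_image <tag> [attempts] [delay-seconds]
wait_for_fleet_image() {
  expected_tag="$1"
  attempts="${2:-60}"   # assumed default: 60 attempts
  delay="${3:-5}"       # assumed default: 5 seconds between attempts
  i=0
  while [ "$i" -lt "$attempts" ]; do
    image="$(kubectl get deploy -n cattle-fleet-system fleet-controller \
      -o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null)"
    case "$image" in
      *":$expected_tag")
        # Image tag matches: print it and report success.
        echo "$image"
        return 0
        ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  # Tag never appeared within the allotted attempts.
  return 1
}
```

For the upgrade described here, wait_for_fleet_image v0.3.9 would return once the deployment reports an image such as rancher/fleet:v0.3.9, and a non-zero exit status would indicate that fleet stayed on the old version.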