rancher: [BUG] RKE2 Downstream Clusters Not Coming Up After Rancher Migration
Rancher Server Setup
- Rancher version: v2.6.9 && v2.7.0
- Installation option (Docker install/Helm Chart): Helm
- If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc): RKE2
- Proxy/Cert Details: byo-valid
Information about the Cluster
- Kubernetes version: v1.24.8+rke2r1
- Cluster Type (Local/Downstream): Downstream
- If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): AWS
User Information
- What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin
Describe the bug
When migrating a Rancher server from one HA installation to another using Rancher Backups and restore, the RKE2 downstream cluster does not come back up. Its status is stuck at Updating with the following message: Configuring bootstrap node(s) <redacted>: waiting for plan to be applied.
The RKE2 version I used for both Rancher versions is v1.24.8+rke2r1. After changing the version to the default RKE2 version for each Rancher release (v1.24.4+rke2r1 and v1.24.6+rke2r1, respectively) and redeploying all workloads, the RKE2 cluster on the v2.6.9 instance came up and went Active. The v2.7.0 RKE2 cluster still shows the same status and message.
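For context, the RKE2 Kubernetes version of a provisioned downstream cluster is held in the provisioning.cattle.io Cluster object on the local cluster; the minimal sketch below only shows where that field lives. The cluster name is a placeholder, not a value from this report.

```yaml
# Minimal sketch of the provisioning Cluster object whose kubernetesVersion was
# changed as described above. The cluster name is hypothetical; the real object
# lives in the fleet-default namespace of the local (Rancher) cluster.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: aws-rke2-cluster               # placeholder name
  namespace: fleet-default
spec:
  kubernetesVersion: v1.24.4+rke2r1    # default for v2.6.9; originally v1.24.8+rke2r1
  rkeConfig: {}                        # machine pools, registries, etc. omitted for brevity
```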
To Reproduce
- Deploy a Rancher HA instance on v2.6.9 && v2.7.0
- Create an AWS RKE1 downstream cluster (3 workers, 1 control plane, 1 etcd) using v1.24.8 as the RKE version
- Create an AWS RKE2 downstream cluster (3 workers, 2 control plane, 3 etcd) using v1.24.8+rke2r1 as the RKE2 version
- Wait for the downstream clusters to become Active
- Install the Rancher Backups chart on your local cluster
- Create a backup in your preferred storage (I use an AWS S3 bucket); a sketch of the Backup and Restore resources follows this list
- Bring up a new HA and point the load balancer to the new HA
- Use the backup to restore onto the new HA
- Install Rancher with the same version as the original HA
- Go to Cluster Management
- Check the cluster statuses
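For the backup and restore steps above, the sketch below shows what the Backup and Restore custom resources handled by the rancher-backup operator might look like. The resource names, bucket, folder, region, endpoint, and credential Secret are all placeholders, not values from the original report; the backupFilename is whatever archive the Backup actually produced.

```yaml
# Sketch of a Backup created on the original HA and a Restore applied on the new HA.
# All names and S3 details are placeholders.
apiVersion: resources.cattle.io/v1
kind: Backup
metadata:
  name: migration-backup
spec:
  resourceSetName: rancher-resource-set
  storageLocation:
    s3:
      credentialSecretName: s3-creds                  # Secret holding AWS access keys
      credentialSecretNamespace: cattle-resources-system
      bucketName: rancher-backups
      folder: rancher
      region: us-east-1
      endpoint: s3.us-east-1.amazonaws.com
---
apiVersion: resources.cattle.io/v1
kind: Restore
metadata:
  name: migration-restore
spec:
  backupFilename: migration-backup-<timestamp>.tar.gz  # placeholder; use the file the Backup produced
  prune: false                                         # commonly set to false for a migration restore
  storageLocation:
    s3:
      credentialSecretName: s3-creds
      credentialSecretNamespace: cattle-resources-system
      bucketName: rancher-backups
      folder: rancher
      region: us-east-1
      endpoint: s3.us-east-1.amazonaws.com
```

The Restore is applied on the new HA before Rancher is installed there, matching the order of the steps above.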
Result
The RKE2 downstream cluster did not come back to the Active status after the restore/migration.
Expected Result
The RKE2 downstream cluster comes back to the Active status after the restore/migration.
Screenshots
Showing the status and message: (screenshot)
Machine pool: (screenshot)
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 19 (6 by maintainers)
@snasovich reverted in the most recent RCs for 2.6 and 2.7