rancher: [BUG] Etcd restore does not work on an RKE2 cluster
Rancher Server Setup
- Rancher version: 2.8-head, commit id: d101c27
- Installation option (Docker install/Helm Chart): Docker Install
Information about the Cluster
- Kubernetes version: 1.27.5+rke2r1 to v1.26.8+rke2r1 (RKE2)
- Cluster Type (Local/Downstream): AWS Node driver cluster
User Information
- What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom) Standard
Describe the bug
Etcd restore does not work on an RKE2 cluster.
To Reproduce
- Deploy a downstream RKE2 node driver cluster on 1.26 RKE2 version
- Take an etcd snapshot
- Upgrade to 1.27 RKE2 version
- Restore using the "All options - config, k8s and etcd" option to the snapshot taken previously
- Cluster is stuck in Updating state, error: [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd
- rancher prov logs:
[INFO ] provisioning done
4:52:48 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for plan to be applied
4:52:54 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for probes: kubelet
4:53:30 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for probes: etcd
4:53:36 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for kubelet to update
4:53:46 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-q89sr,rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm
4:54:28 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for probes: kubelet
4:54:46 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for probes: etcd, kubelet
4:54:50 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for probes: etcd
4:54:56 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for kubelet to update
4:55:04 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-b4cb4,rke2-backup-restore-cp-8675c69865x58z9h-zstx8
4:56:10 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for plan to be applied
4:56:16 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-b4cb4,rke2-backup-restore-cp-8675c69865x58z9h-zstx8
4:56:20 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kubelet
4:56:44 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kubelet
4:56:54 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kube-controller-manager, kube-scheduler, kubelet
4:56:58 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kube-controller-manager, kube-scheduler
4:57:12 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kube-controller-manager
4:57:18 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-4dhrj,rke2-backup-restore-wk-56df7d58b5xb6ffp-5kcbn,rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld
4:57:34 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-5kcbn,rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld
4:57:54 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld: waiting for plan to be applied
4:57:58 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld: waiting for probes: kubelet
4:58:08 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld: waiting for kubelet to update
4:58:46 pm | [INFO ] rke2-backup-restore-wk-56df7d58b5xb6ffp-4dhrj,rke2-backup-restore-wk-56df7d58b5xb6ffp-5kcbn,rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld
4:58:48 pm | [INFO ] provisioning done
5:01:26 pm | [INFO ] refreshing etcd restore state
5:01:28 pm | [INFO ] waiting to stop rke2 services on node [rke2-backup-restore-cp-bfd6beba-nz8l2]
5:01:30 pm | [INFO ] waiting to stop rke2 services on node [rke2-backup-restore-wk-e548aa18-h5k2c]
5:01:32 pm | [INFO ] waiting for etcd restore
5:02:16 pm | [INFO ] waiting for etcd restore probes
5:02:54 pm | [INFO ] waiting for etcd restore
5:05:02 pm | [INFO ] waiting for etcd restore probes
5:05:16 pm | [INFO ] refreshing etcd restore state
5:05:18 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw,rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm
5:05:34 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
5:05:36 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd, kubelet
5:05:46 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-b4cb4,rke2-backup-restore-cp-8675c69865x58z9h-zstx8
5:05:56 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
5:05:58 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd
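To make it easier to see which nodes are still blocked at the end of a log like the one above, here is a minimal Python sketch that keeps the last reported status per node. It assumes the `time | [INFO ] configuring <role> node(s) <nodes>: <detail>` layout seen in this paste; the names `summarize`, `LINE_RE`, `PROBE_RE` and `SAMPLE` are made up for illustration and are not part of Rancher.

```python
import re

# Assumed line layout, taken from the log pasted in this issue:
#   "5:05:58 pm | [INFO ] configuring etcd node(s) <node>[,<node>]: <detail>"
LINE_RE = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2} [ap]m) \| \[INFO \] "
    r"configuring (?P<role>.+?) node\(s\) (?P<nodes>[^:]+?)(?:: (?P<detail>.*))?$"
)
PROBE_RE = re.compile(r"waiting for probes: (?P<probes>[\w\-, ]+)$")


def summarize(log_text):
    """Return the last status seen for each node that has a detail message."""
    status = {}
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None or match.group("detail") is None:
            continue  # not a per-node status line
        detail = match.group("detail")
        probe_match = PROBE_RE.search(detail)
        probes = []
        if probe_match:
            probes = [p.strip() for p in probe_match.group("probes").split(",")]
        for node in match.group("nodes").split(","):
            status[node.strip()] = {
                "time": match.group("time"),
                "role": match.group("role"),
                "ready_unknown": "Node condition Ready is Unknown" in detail,
                "pending_probes": probes,
            }
    return status


# Two lines copied from the end of the log in this issue.
SAMPLE = """\
5:05:56 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
5:05:58 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd
"""

if __name__ == "__main__":
    for node, info in summarize(SAMPLE).items():
        print(node, info)
```

Run against the full paste, this would show the last reported state of each node, which for the etcd node `...-rwkwm` is Ready Unknown and still waiting on the etcd probe.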
Note:
- On an RKE1 cluster, this use case works. No issues seen.
- On an RKE2 cluster, the upgrade from 1.26.8+rke2r1 to 1.27.5+rke2r1 works, but the restore to the snapshot taken on 1.26 fails.
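The scenario in the note restores a snapshot from an older minor version than the one the cluster currently runs. The Python sketch below detects that situation from the two version strings. It assumes the `<major>.<minor>.<patch>+rke2r<N>` format used in this issue. The helper names are hypothetical, and this only describes the scenario; it does not diagnose the failure.

```python
import re


def parse_rke2_version(version):
    """Parse e.g. '1.27.5+rke2r1' or 'v1.26.8+rke2r1' into a tuple of ints."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)\+rke2r(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized RKE2 version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def restore_is_minor_downgrade(current, snapshot):
    """True when the snapshot was taken on an older major.minor than the current version."""
    return parse_rke2_version(snapshot)[:2] < parse_rke2_version(current)[:2]


if __name__ == "__main__":
    # Versions taken from this issue: cluster upgraded to 1.27.5, snapshot taken on 1.26.8.
    print(restore_is_minor_downgrade("1.27.5+rke2r1", "1.26.8+rke2r1"))
```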
About this issue
- Original URL
- State: closed
- Created 9 months ago
- Comments: 17 (16 by maintainers)
@felipe-colussi I tested this on v2.7.8 on a v1.25.13+rke2r1 RKE2 cluster and got the same error. I also ran the same test on k3s v1.25.13+k3s1 and it worked fine.