harvester: [BUG] reinstall 1st node
Describe the bug
I tried to reinstall the 1st node (installed with mode: create), and after reinstallation the node didn't join the existing cluster.
To Reproduce
Steps to reproduce the behavior:
- Install a 3-node cluster.
- Turn off the 1st node (installed with mode: create), remove it from the cluster via the GUI (Hosts → Delete host), wipe all its disks (remove the partitions with fdisk and run mkfs.ext4 on /dev/sda; see the sketch after this list), then reinstall the node (install mode: join) with a new hostname so it joins the existing cluster of the remaining 2 nodes.
- The reinstalled node is not able to join the existing cluster (the Rancher bootstrap finishes successfully, but the node does not join).
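For the disk-wipe step above, a minimal shell sketch (assuming the install disk is /dev/sda, as in the report; wipefs and sgdisk are non-interactive alternatives to fdisk):

```bash
# Run from a live/rescue shell on the node being wiped.
# /dev/sda is taken from the report; substitute the actual install disk.
sudo wipefs -a /dev/sda          # clear filesystem and partition-table signatures
sudo sgdisk --zap-all /dev/sda   # non-interactive alternative to deleting partitions in fdisk
sudo mkfs.ext4 /dev/sda          # reformat, as in the original repro steps
```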
Expected behavior
The reinstalled node should join the existing cluster with the remaining 2 nodes.
A support bundle can be provided if needed.
Environment
- Harvester ISO version: v1.0.3
- Underlying Infrastructure (e.g. Baremetal with Dell PowerEdge R630): tried on 2 different environments with different hardware
Additional context
When I tried to reinstall the 2nd or 3rd node (installation mode: join), they joined the cluster successfully after reinstallation. The only problem is with the 1st node.
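To see where the failed join stalls, a hedged sketch of basic checks (run kubectl from one of the remaining management nodes; Harvester management nodes run RKE2, so rke2-server is the relevant unit there, but the unit name is an assumption to verify on your image):

```bash
# From a surviving management node: the reinstalled node should appear and go Ready.
kubectl get nodes -o wide
# On the reinstalled node itself, the RKE2 logs usually show why a join stalls
# (management nodes run rke2-server; a pure agent node would use rke2-agent):
journalctl -u rke2-server --no-pager | tail -n 50
```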
Test Plan 2: Reinstall management node and agent node in a new Harvester cluster
Result
Verified after upgrading from v1.0.3 to v1.1.0-rc3, we can rejoin the management node and agent node back correctly.
Successfully re-joined the management node after upgrade
Successfully re-joined the agent node after upgrade
Test Information
v1.1.0-rc3
Verify Steps
Test Plan 1: Reinstall management node and agent node in an upgraded cluster
Result
Verified after upgrading from v1.0.3 to the v1.1.0 master release, we can rejoin the management node and agent node back correctly.
Successfully re-joined the management node after upgrade
Successfully re-joined the agent node after upgrade
Test Information
master-0a9538a1-head (10/14)
Verify Steps
provisioning.cattle.io/v1/clusters -> fleet-local
helm.cattle.io/v1/helmchartconfigs -> rke2-canal

@FrankYang0529 Great!
Please help check if #2470 is also caused by this bug, thanks.
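To inspect the two objects this mapping refers to, a hedged sketch (fleet-local holds Rancher's provisioning object for the local cluster; kube-system as the rke2-canal HelmChartConfig namespace is an assumption based on RKE2 defaults):

```bash
# Rancher's provisioning-level cluster object for the local cluster:
kubectl -n fleet-local get clusters.provisioning.cattle.io -o yaml
# The HelmChartConfig for rke2-canal (namespace assumed from RKE2 defaults):
kubectl -n kube-system get helmchartconfigs.helm.cattle.io rke2-canal -o yaml
```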
This issue can be fixed by https://github.com/harvester/harvester-installer/pull/344 and https://github.com/rancher/rancherd/pull/25. The following are my test steps with vagrant-pxe-harvester.
Case 1: Remove the second node
- `kubectl delete node harvester-node-1` to remove the node CR.
- `vagrant destroy harvester-node-1` to remove the node VM.
- `vagrant up harvester-node-1` to start the node again. It should join the cluster and the status will be `Ready`.

Case 2: Remove the first node
- `kubectl delete node harvester-node-0` to remove the node CR.
- `vagrant destroy harvester-node-0` to remove the node VM.
- `vagrant ssh pxe_server` to update some content.
- `vagrant up harvester-node-0` to start the node again. It should join the cluster and the status will be `Ready`. (A consolidated sketch of Case 2 follows below.)

@tgazik we are still working on it, and it seems it's easy to reproduce. So if you need the cluster, you could reinstall it. Thanks a lot.
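For reference, a consolidated sketch of Case 2 (names follow the steps above; what exactly gets updated on pxe_server is not spelled out in the comment, so that step is left as-is):

```bash
kubectl delete node harvester-node-0   # remove the node CR
vagrant destroy -f harvester-node-0    # remove the node VM (-f skips confirmation)
vagrant ssh pxe_server                 # update some content (details not given above)
vagrant up harvester-node-0            # reinstall; the node should rejoin
kubectl get nodes -w                   # watch until harvester-node-0 reports Ready
```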
Sure, I will keep it, no problem. Thanks for the update.