k3s: Removing node doesn't remove node password
I’m not sure this is the right place for a bug report, because the error message I got has only one Google result, and it points to the commit message that added the password validation below, so here it is.
I have a few Raspberry Pis: a 1 and a Zero W. I installed Hypriot on them and, after installing k3s, changed some of their hostnames. I changed the hostname of black-pearl to rpi1, removed the black-pearl node from the k3s server, and created another black-pearl on the RPi Zero W. Here comes the problem: k3s on the RPi Zero (black-pearl) couldn’t join the cluster because the password didn’t match:
k3s-agent:
level=info msg="Running load balancer 127.0.0.1:41241 ->[k3s.local:6443]"
level=error msg="Node password rejected, contents of '/var/lib/rancher/k3s/agent/node-password.txt' may not match server passwd entry"
level=error msg="Node password rejected, contents of '/var/lib/rancher/k3s/agent/node-password.txt' may not match server passwd entry"
level=error msg="Node password rejected, contents of '/var/lib/rancher/k3s/agent/node-password.txt' may not match server passwd entry"
level=error msg="Node password rejected, contents of '/var/lib/rancher/k3s/agent/node-password.txt' may not match server passwd entry"
level=error msg="Node password rejected, contents of '/var/lib/rancher/k3s/agent/node-password.txt' may not match server passwd entry"
I spent some time trying to fix it and noticed that the old password for black-pearl (which is now rpi1) is still in
/var/lib/rancher/k3s/server/cred/node-passwd despite running kubectl delete node black-pearl.
It seems that removing a node should also remove the password for that node, in case another node with the same hostname (e.g. after an OS reinstall) re-joins the cluster.
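For anyone else hitting this, a minimal cleanup sketch, assuming the default data dir, that node-passwd keeps one line per node containing the hostname, and the k3s-agent service name created by the install script:
# on the server: remove the node object from the cluster
kubectl delete node black-pearl
# on the server: drop the stale line for that hostname from the cred file
sudo sed -i '/black-pearl/d' /var/lib/rancher/k3s/server/cred/node-passwd
# on the agent: restart so it retries registration
sudo systemctl restart k3s-agent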
About this issue
- State: closed
- Created 5 years ago
- Reactions: 12
- Comments: 36 (13 by maintainers)
I experienced a very similar issue:
kubectl -n kube-system delete secrets <agent-node-name>.node-password.k3s

In my case, I uninstalled (via script) and removed the node via kubectl. Then upon reinstall this issue popped up. Uninstalling again and then removing the entry from {data-dir}/server/cred/node-passwd (default /var/lib/rancher/k3s/server/cred/node-passwd) worked for me.

It probably should; we are cleaning up the CoreDNS hosts entry here: https://github.com/rancher/k3s/blob/36ca6060733725953b7a4cd2b53a295d11aea684/pkg/node/controller.go#L36
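On k3s releases that store node passwords as Kubernetes secrets (the <agent-node-name>.node-password.k3s name above), a hedged check-then-delete sketch, assuming the node is named black-pearl:
# confirm the stale secret exists
kubectl -n kube-system get secret black-pearl.node-password.k3s
# remove it so the re-registering agent can store a fresh password
kubectl -n kube-system delete secret black-pearl.node-password.k3s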
The issue isn’t with cleaning up the node, it is with cleaning up node-passwd on the server.

Many thanks, this solved the problem for me 👍🏻
To improve the user experience, shouldn’t kubectl remove the hostname:password entry from /var/lib/rancher/k3s/server/cred/node-passwd when a node is deleted? As it was my first time with k3s, it took me a while to figure out where the password is stored and why it’s not removed. I’m happy to close this if you disagree; at least it will be of some help to other users.

Just to add my solution to this issue: make sure that you don’t have the same hostname on different machines. That was my case, so changing the hostname and reinstalling the agent fixed the problem.
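A sketch of that hostname fix, assuming hostnamectl is available and the agent was installed with the standard script (the hostname, server URL, and token below are placeholders):
# give the second machine a unique name
sudo hostnamectl set-hostname rpi0w
# reinstall the agent so it registers under the new hostname
curl -sfL https://get.k3s.io | K3S_URL=https://k3s.local:6443 K3S_TOKEN=<server-node-token> sh -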
The node password is not the same as the registration token. I think it’s linked above, but please take a look at https://rancher.com/docs/k3s/latest/en/architecture/#how-agent-node-registration-works
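To make the distinction concrete (a hedged sketch; paths assume the default data dir): the registration token is shared and lives on the server, while the node password is generated per agent on first join:
# server: the shared token agents present to join the cluster
sudo cat /var/lib/rancher/k3s/server/node-token
# agent: the per-node password the server verifies on every registration
sudo cat /var/lib/rancher/k3s/agent/node-password.txt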
@erikwilson I don’t see a backport PR for this into the 1.19 branch. Is this just for 1.20? We’re already working on shipping 1.19.5, so I’m bumping this out.
@ibuildthecloud I ran into this issue too, and it was really confusing.
Uninstalling k3s-agent and reinstalling had no effect. Eventually the logs of k3s-agent on the node led me to this error and then here.
I think I have the same issue:
"Failed to retrieve agent config: Node password rejected, duplicate hostname or contents of '/etc/rancher/node/password' may not match server node-passwd entry, try enabling..."
Fresh k3s installation on both nodes.
If all goes as planned, the upcoming Rancher 2.5.6 will support 1.20.