k3s: unable to connect agent to master

Version: k3s version v1.17.3+k3s1 (5b17a175)

Describe the bug: unable to join workers to the cluster.

To Reproduce:

  • Install k3s with default options on nodeA.
  • Install the k3s agent on nodeB using: sudo /usr/local/bin/k3s agent -s https://{my_server_ip}:6443 -t <token from "/var/lib/rancher/k3s/server/node-token" on master node> --with-node-id 1

Expected behavior: nodeB joins the cluster.

Actual behavior: the node is not added; the agent cannot reach the master API through its local proxy.

Additional context: there are no firewalls on the systems or between the two nodes (virtualized nodes, same subnet, directly exposed to the LAN).

INFO[2020-03-12T17:11:30.931254304Z] Starting k3s agent v1.17.3+k3s1 (5b17a175)
INFO[2020-03-12T17:11:30.935231883Z] module overlay was already loaded
INFO[2020-03-12T17:11:30.935291956Z] module nf_conntrack was already loaded
INFO[2020-03-12T17:11:30.935301374Z] module br_netfilter was already loaded
INFO[2020-03-12T17:11:30.935629307Z] Running load balancer 127.0.0.1:36635 -> [192.168.33.10:6443]
INFO[2020-03-12T17:11:31.011412790Z] Logging containerd to /var/lib/rancher/k3s/agent/containerd/containerd.log
INFO[2020-03-12T17:11:31.011560067Z] Running containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd
INFO[2020-03-12T17:11:31.012088165Z] Waiting for containerd startup: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /run/k3s/containerd/containerd.sock: connect: no such file or directory"
INFO[2020-03-12T17:11:32.125317412Z] Updating load balancer server addresses -> [10.0.2.15:6443 192.168.33.10:6443]
INFO[2020-03-12T17:11:32.125686191Z] Connecting to proxy                           url="wss://10.0.2.15:6443/v1-k3s/connect"
ERRO[2020-03-12T17:11:32.125917574Z] Failed to connect to proxy                    error="dial tcp 10.0.2.15:6443: connect: connection refused"
ERRO[2020-03-12T17:11:32.125936712Z] Remotedialer proxy error                      error="dial tcp 10.0.2.15:6443: connect: connection refused"
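
In the log above, the agent starts with 192.168.33.10:6443, the load balancer list is then updated to include 10.0.2.15:6443, and the connection to that second address is refused. A minimal sketch for checking which of the advertised addresses the agent can actually reach is shown here. The function names `extract_lb_addresses` and `probe` are hypothetical helpers, not part of k3s, and `probe` assumes bash and coreutils `timeout` are available on the agent node.

```shell
# Print the host:port entries from k3s agent
# "Updating load balancer server addresses" log lines, one per line.
extract_lb_addresses() {
  sed -n 's/.*Updating load balancer server addresses -> \[\(.*\)].*/\1/p' | tr ' ' '\n'
}

# Attempt a TCP connect to host:port with a 3 second timeout and report the result.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
probe() {
  host=${1%:*}
  port=${1##*:}
  if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$1 reachable"
  else
    echo "$1 NOT reachable"
  fi
}
```

Piping the agent's log output through `extract_lb_addresses | sort -u | while read -r addr; do probe "$addr"; done` would list each advertised address with its status. An address that the agent cannot reach points to the server advertising an IP from the wrong interface.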

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 23 (6 by maintainers)

Most upvoted comments

Ran into the same issue. This is how I solved it:

Create the cluster master node:

export K3S_NODE_NAME=${HOSTNAME//_/-}
export K3S_EXTERNAL_IP=xx.xx.xx.xx
export INSTALL_K3S_EXEC="--docker --write-kubeconfig ~/.kube/config --write-kubeconfig-mode 666 --tls-san $K3S_EXTERNAL_IP --kube-apiserver-arg service-node-port-range=1-65000 --kube-apiserver-arg advertise-address=$K3S_EXTERNAL_IP --kube-apiserver-arg external-hostname=$K3S_EXTERNAL_IP"
curl -sfL https://docs.rancher.cn/k3s/k3s-install.sh | sh -

Get the token on the master node:

echo -e "export K3S_TOKEN=$(cat /var/lib/rancher/k3s/server/node-token)\nexport K3S_URL=https://$K3S_EXTERNAL_IP:6443\nexport INSTALL_K3S_EXEC=\"--docker --token \$K3S_TOKEN --server \$K3S_URL\""

Join workers to the cluster:

export K3S_TOKEN=xxxx
export K3S_URL=https://xx.xx.xx.xx:6443
export INSTALL_K3S_EXEC="--docker --token $K3S_TOKEN --server $K3S_URL"
export K3S_NODE_NAME=${HOSTNAME//_/-}
curl -sfL https://docs.rancher.cn/k3s/k3s-install.sh | sh -
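
The `${HOSTNAME//_/-}` expansion used for K3S_NODE_NAME replaces underscores with hyphens, which matters because Kubernetes node names must be valid DNS subdomain names (lowercase alphanumerics, `-` and `.`). A small sketch of the same idea in portable shell follows; it additionally lowercases the name. `sanitize_node_name` is a hypothetical helper name, and the sketch only handles underscores and uppercase letters, not every invalid character.

```shell
# Turn a hostname into something usable as a Kubernetes node name:
# underscores become hyphens and uppercase letters become lowercase.
sanitize_node_name() {
  printf '%s' "$1" | tr '_' '-' | tr '[:upper:]' '[:lower:]'
}

# Example: derive the node name from the current hostname.
K3S_NODE_NAME=$(sanitize_node_name "$(hostname)")
export K3S_NODE_NAME
```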

@md2119 don’t use --kube-apiserver-arg advertise-address=<<public-ip>>.

Use --node-external-ip=<<public-ip>> instead.
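
As a sketch of that advice, the server install could pass `--node-external-ip` in place of the `--kube-apiserver-arg advertise-address=...` argument. The address 203.0.113.10 is a documentation placeholder and `build_server_exec` is a hypothetical helper. Keeping `--tls-san` is an assumption carried over from the commands posted in this thread.

```shell
# Build the server arguments: add the public IP as a TLS SAN and
# register it as the node's external IP (no advertise-address override).
build_server_exec() {
  printf '%s %s %s %s' --tls-san "$1" --node-external-ip "$1"
}

# Placeholder address; substitute the server's real public IP.
INSTALL_K3S_EXEC=$(build_server_exec 203.0.113.10)
echo "$INSTALL_K3S_EXEC"

# Then install as usual (left commented out so the sketch has no side effects):
# curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="$INSTALL_K3S_EXEC" sh -
```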

@lkj4 what do the logs on your agents say?

I am facing the same issue. On the master node:

curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--write-kubeconfig ~/.kube/config --write-kubeconfig-mode 666 --tls-san <<public-ip>> --kube-apiserver-arg advertise-address=<<public-ip>>" sh -

On the agent node, the contents of /etc/systemd/system/k3s-agent.service.env, followed by the install command:

K3S_TOKEN=<<node-token>>
K3S_URL=https://<<master-public-ip>>:6443
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="agent" sh -

Anyone who has Raspberry Pis, jumped straight into installing k3s, and is now having this issue: please learn from me, a long-time Linux user who didn't run

sudo apt update
sudo apt upgrade

before proceeding. Save yourselves a day of googling.

@tcurdt how would you suggest we resolve this issue? It is a support request, not a defect in the software.

Nodes need to be properly configured to support the environment they are deployed in. Absent any cloud-provider-specific integrations, this includes telling them what their public IP address is, if it differs from the IP assigned to the interface. I provided an example of how to do this in the post right above yours.
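
One rough way to check whether a node falls into that case is to compare its public address with the addresses assigned to its interfaces. The sketch below assumes the public IP is already known; `needs_external_ip` is a hypothetical helper, and `hostname -I` in the usage line is assumed to be available, as it is on most Linux distributions.

```shell
# Exit 0 when the public address is NOT among the interface addresses,
# i.e. when the node has to be told its external IP explicitly.
# Usage: needs_external_ip PUBLIC_IP IFACE_IP...
needs_external_ip() {
  public_ip=$1
  shift
  for ip in "$@"; do
    if [ "$ip" = "$public_ip" ]; then
      return 1
    fi
  done
  return 0
}
```

For example, `needs_external_ip "$PUBLIC_IP" $(hostname -I) && echo "pass --node-external-ip $PUBLIC_IP"` prints a reminder only when the public address is not bound to any local interface.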

Are you just here to me-too, or would you care to provide any information on how your environment is configured, what errors you’ve encountered, and what you’ve tried so far?