k3s: Worker agent: "failed to get CA certs: EOF"
Version:
k3s version v1.17.2+k3s1 (cdab19b0)
- Server was installed with:
  curl -sfL https://get.k3s.io | sh -
- Agent was installed with:
  curl -sfL https://get.k3s.io | K3S_URL=https://controllerpi.local:6443 K3S_TOKEN=... sh -
Simple build here: two brand new RPi4’s running Raspbian Buster Lite, apt update/upgrade were run before I installed k3s. I also updated the RPi’s eeprom (with rpi-eeprom-update). Server has hostname controllerpi and worker has hostname workerpi1.
Worker won’t connect to server. Log is full of:
level=info msg="Starting k3s agent v1.17.2+k3s1 (cdab19b0)"
level=info msg="module overlay was already loaded"
level=info msg="module br_netfilter was already loaded"
level=info msg="Running load balancer 127.0.0.1:41461 -> [controllerpi.local:6443]"
level=error msg="failed to get CA certs at https://127.0.0.1:41461/cacerts: Get https://127.0.0.1:41461/cacerts: read tcp 127.0.0.1:48874->127.0.0.1:41461: read: co
level=error msg="failed to get CA certs at https://127.0.0.1:41461/cacerts: Get https://127.0.0.1:41461/cacerts: EOF"
level=error msg="failed to get CA certs at https://127.0.0.1:41461/cacerts: Get https://127.0.0.1:41461/cacerts: EOF"
level=error msg="failed to get CA certs at https://127.0.0.1:41461/cacerts: Get https://127.0.0.1:41461/cacerts: EOF"
level=error msg="failed to get CA certs at https://127.0.0.1:41461/cacerts: Get https://127.0.0.1:41461/cacerts: EOF"
level=error msg="failed to get CA certs at https://127.0.0.1:41461/cacerts: Get https://127.0.0.1:41461/cacerts: EOF"
I’ve tried using curl against the load balancer URL (https://127.0.0.1:41461/cacerts), the server’s hostname (https://controllerpi.local:6443/cacerts), and the server’s IP (https://192.168.1.X:6443/cacerts). They all fail with:
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.
If I add the -k flag to curl it works in all cases (load balancer, hostname, and IP). I’m wondering if that has something to do with it.
Thanks!
About this issue
- State: closed
- Created 4 years ago
- Comments: 29 (11 by maintainers)
After trying many things, I eventually figured out the issue: when I installed the agent, I used
K3S_URL=https://controllerpi.local:6443. This seemed to work fine - I can resolve controllerpi.local everywhere on my network. But it appears the k3s agent load balancer cannot resolve that hostname. When I edited the /etc/systemd/system/k3s-agent.service.env file to change the K3S_URL to https://192.168.1.168:6443 and then restarted the k3s-agent service, it worked fine.

One thing to note about the .local domain is that it is usually served by mDNS (aka Bonjour). While you might be able to resolve .local addresses from the host command line, k3s uses the Go native resolver, which does not support mDNS. You can use hosts file entries, a local DNS server, or IP addresses instead.
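In env-file form, the fix described above amounts to something like the following (the IP is the example from the comment, the token is elided, and the file path is the one the agent installer writes on this setup):

```
# /etc/systemd/system/k3s-agent.service.env
# Use the server's IP (or a name the Go resolver can handle)
# instead of an mDNS-only .local name:
K3S_URL=https://192.168.1.168:6443
K3S_TOKEN=...
```

After editing, a `sudo systemctl restart k3s-agent` picks up the change.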
@bmatcuk do your pi’s have the correct time or were the certs possibly created when they didn’t have the correct time.
I came across this when hitting the same issue relying on mDNS for host resolution. I was able to keep using DHCP and work around the issue as follows.
#K3S_URL=https://raspi-k3-controller.local:6443
#K3S_URL=https://192.168.2.70:6443
K3S_URL=https://$(getent hosts raspi-k3-controller.local | awk '{print $1}' | head -1):6443
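The pipeline in that workaround extracts the first address column from `getent hosts` output (so the .local name is resolved once, by the host's NSS/mDNS stack, before k3s ever sees it). A quick sketch with simulated getent output, since the real command depends on mDNS being live:

```shell
# Simulate "getent hosts raspi-k3-controller.local" output to show
# what the awk/head pipeline extracts (the first address field):
echo "192.168.2.70    raspi-k3-controller.local" | awk '{print $1}' | head -1
# prints 192.168.2.70
```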
@bmatcuk - Thanks for posting your solution here; I had a similar problem when setting up a Pi cluster today. I had changed the hostname of the master node after installing K3s, but forgot that the hostname is embedded in the agent nodes’ k3s service file config. So I had to update all those configs and make sure they pointed at the right DNS name/IP of the master, and they all connected pretty much immediately after a service restart!
If you’re running a traefik reverse proxy as your external load balancer in a HA config, this is what did the trick for me:
add a loadbalancer server port label:
traefik.http.services.k3s.loadbalancer.server.port=6443
add an entrypoint:
add a TCP router and service:
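The entrypoint and TCP router/service snippets did not survive in this copy of the thread. As a sketch only (not the commenter's original config), a traefik v2 file-provider setup for this might look like the following; the entrypoint name, router name, and backend address are assumptions, and TLS passthrough is used because the k3s API serves its own certificates:

```
# Static config (e.g. traefik.yml): an entrypoint listening on 6443
entryPoints:
  k3s:
    address: ":6443"

# Dynamic config: TCP router + service forwarding raw TLS to the k3s server
tcp:
  routers:
    k3s:
      entryPoints:
        - k3s
      rule: HostSNI(`*`)
      tls:
        passthrough: true
      service: k3s
  services:
    k3s:
      loadBalancer:
        servers:
          - address: "192.168.1.10:6443"
```

The equivalent can be expressed as container labels (as the `traefik.http.services.k3s.loadbalancer.server.port` label above suggests the commenter did).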
working now!! I guess it does not automatically label nodes with the worker role?
kubectl get nodes    centos8-vm1: Tue Mar 23 14:51:29 2021

NAME          STATUS   ROLES                  AGE    VERSION
centos8-vm1   Ready    control-plane,master   44h    v1.20.4+k3s1
rpi5          Ready    <none>                 19s    v1.20.4+k3s1
rpi4          Ready    <none>                 19s    v1.20.4+k3s1
rpi3          Ready    <none>                 18s    v1.20.4+k3s1
rpi2          Ready    <none>                 18s    v1.20.4+k3s1
rpi0          Ready    <none>                 6h5m   v1.20.4+k3s1
rpi1          Ready    <none>                 18s    v1.20.4+k3s1
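Correct - k3s agents join with no role label, which is why ROLES shows `<none>`. If you want the column populated, you can label the nodes yourself; node names below are examples from the output above, and the label key follows the upstream `node-role.kubernetes.io/<role>` convention:

```shell
# Label an agent node so "kubectl get nodes" shows ROLES = worker
# (requires a reachable cluster; node name is an example)
kubectl label node rpi4 node-role.kubernetes.io/worker=worker
```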
oh, should there be :6443 at end of the server URL?
That’s not the correct port.
Thanks for putting up your solution, @bmatcuk . There was an important hint in here, namely that the error could be caused by a node not being able to talk to the master. In my case I had an incorrect firewall policy that prevented some of the nodes from reaching the master, but the error thrown left me clueless. You likely saved me a lot of time today.