k3s: GET nodes.metrics.k8s.io fails
Hardware: Raspberry Pi 4, 8GB RAM (Raspbian Buster Lite)
Version: v1.18.4+k3s1
K3s arguments:
- Server: --docker --no-deploy=traefik
- Agent: --docker
Describe the bug
Fresh installation of k3s; running kubectl top nodes returns a 503 Service Unavailable error from the API.
To Reproduce
Install k3s using k3s-ansible with the specified version
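For reference, a minimal k3s-ansible inventory along these lines reproduces the environment - the variable names follow the k3s-ansible sample inventory, and the values here are illustrative, not taken from the report:

```yaml
# inventory/my-cluster/group_vars/all.yml (illustrative values)
k3s_version: v1.18.4+k3s1
ansible_user: pi
systemd_dir: /etc/systemd/system
extra_server_args: "--docker --no-deploy=traefik"
extra_agent_args: "--docker"
```

Then run the playbook against it: `ansible-playbook site.yml -i inventory/my-cluster/hosts.ini`.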
Expected behavior
I expected to see node metrics.
Actual behavior
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
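A 503 from an aggregated API generally means the registered APIService is unavailable. A quick check (not part of the original report, but a standard diagnostic; the label selector is assumed from the upstream metrics-server manifests):

```sh
# Available=False with a FailedDiscoveryCheck or MissingEndpoints reason
# points at the metrics-server pod or its Service rather than the apiserver.
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
kubectl -n kube-system get pods -l k8s-app=metrics-server
```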
Additional context / logs
eden@eden ~> env KUBECONFIG=/home/eden/.kube/production kubectl -v5 top nodes
I0628 23:15:25.094137 17258 helpers.go:216] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "the server is currently unable to handle the request (get nodes.metrics.k8s.io)",
  "reason": "ServiceUnavailable",
  "details": {
    "group": "metrics.k8s.io",
    "kind": "nodes",
    "causes": [
      {
        "reason": "UnexpectedServerResponse",
        "message": "service unavailable"
      }
    ]
  },
  "code": 503
}]
F0628 23:15:25.094215 17258 helpers.go:115] Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
Systemd Server:
k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2020-06-28 12:09:19 BST; 10h ago
Docs: https://k3s.io
Process: 614 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 621 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 622 (k3s-server)
Tasks: 28
Memory: 420.3M
CGroup: /system.slice/k3s.service
└─622 /usr/local/bin/k3s server --docker --no-deploy traefik
Jun 28 22:51:10 k8s-master k3s[622]: E0628 22:51:10.173721 622 resource_quota_controller.go:408] unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
Jun 28 22:51:10 k8s-master k3s[622]: time="2020-06-28T22:51:10.276193535+01:00" level=error msg="node password not set"
Jun 28 22:51:10 k8s-master k3s[622]: time="2020-06-28T22:51:10.276651752+01:00" level=error msg="https://127.0.0.1:6443/v1-k3s/serving-kubelet.crt: 500 Internal Server Error"
Jun 28 22:51:10 k8s-master k3s[622]: E0628 22:51:10.759934 622 available_controller.go:420] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.37.24:443/apis/metrics.k8s.io/v1beta1: Get https://10.43.37.24:443/apis/metrics.k8s.io/v1beta1: net/htt
Jun 28 22:51:10 k8s-master k3s[622]: time="2020-06-28T22:51:10.765030642+01:00" level=info msg="Waiting for master node startup: resource name may not be empty"
Jun 28 22:51:11 k8s-master k3s[622]: time="2020-06-28T22:51:11.765333066+01:00" level=info msg="Waiting for master node startup: resource name may not be empty"
Jun 28 22:51:12 k8s-master k3s[622]: time="2020-06-28T22:51:12.765718119+01:00" level=info msg="Waiting for master node startup: resource name may not be empty"
Jun 28 22:51:13 k8s-master k3s[622]: time="2020-06-28T22:51:13.766206430+01:00" level=info msg="Waiting for master node startup: resource name may not be empty"
Jun 28 22:51:14 k8s-master k3s[622]: time="2020-06-28T22:51:14.768944604+01:00" level=info msg="Waiting for master node startup: resource name may not be empty"
Jun 28 22:51:15 k8s-master k3s[622]: http: TLS handshake error from 127.0.0.1:43918: remote error: tls: bad certificate
About this issue
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 17 (3 by maintainers)
I’m not seeing IP addresses. I’m seeing repeated errors trying to connect to FQDNs, like v1beta1.metrics.k8s.io
Why use a subdomain of a registered internet domain for RFC1918 traffic? That doesn’t telegraph ‘local address lookup’, let alone local/vlan traffic.
I had disabled the metrics-server a long time ago because it just didn't work. I gave it another try tonight and finally made it work! My configuration: v1.18.8+k3s1.

First of all, after a careful reading of the Kubernetes documentation, I ended up adding the enable-aggregator-routing=true flag on the api-server - here's my master's configuration (beware, I've also enabled pod security policy, you might not want it :p). I disabled the provided metrics-server in order to use the "official" one. So, starting from the official deployment, I added the following args: v=2 seems important, because once I added it I got interesting logs from the pod. And finally, I added hostNetwork: true on the deployment, and after 2 minutes I had kubectl top pods working! (A sketch of these pieces follows.)
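The configuration files embedded in that comment were lost in extraction; a minimal sketch of the pieces it names - the flag spellings and the extra kubelet flags below are typical choices, not taken from the comment - would be:

```sh
# k3s server: skip the bundled metrics-server and pass the aggregator-routing
# flag through to the embedded kube-apiserver.
k3s server --docker --no-deploy traefik --no-deploy metrics-server \
  --kube-apiserver-arg=enable-aggregator-routing=true
```

```yaml
# Patch against the upstream metrics-server Deployment in kube-system.
spec:
  template:
    spec:
      hostNetwork: true            # the comment's final change
      containers:
        - name: metrics-server
          args:
            - --v=2                                           # the flag the comment calls out
            - --kubelet-insecure-tls                          # assumption: common on k3s
            - --kubelet-preferred-address-types=InternalIP    # assumption: common on k3s
```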
I don't understand why k3s is contacting random internet servers by default. When apps phone home, it tends to make people unpleasant to talk to. If you're lucky, they just uninstall it and move on.
I just got here and “leave?” is already on my TODO list.
Please avoid using --disable-agent; it will probably cause more problems than it will fix. The order in which network interfaces come up may be important, especially since k8s uses iptables. If you have multiple network interfaces, please ensure that --flannel-iface points to the interface where nodes have shared networking. For something like IPsec, there may be a lower-level networking issue that needs to be resolved.

Just wanted to add that I finally managed to fix this. It was a host network issue, where the floating IP that was set conflicted for some reason with the host IP of the node. Using Ubuntu 20.04 and Netplan, I had to set the host IP BEFORE the floating IP to avoid some kind of internal routing issue within Kubernetes/k3s. I never managed to figure out why, but this simple solution fixed all my problems. (A netplan sketch follows.)
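The comment doesn't include the actual netplan file; a minimal sketch of the described ordering, with the interface name and addresses invented for illustration, would be:

```yaml
# /etc/netplan/01-netcfg.yaml (illustrative)
network:
  version: 2
  ethernets:
    eth0:
      addresses:
        - 192.168.1.10/24    # host IP listed first, as the fix requires
        - 192.168.1.200/24   # floating IP second
```

Relatedly, with multiple interfaces the maintainer's advice translates to something like `k3s server --flannel-iface=eth0` (interface name assumed).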
@jdmarshall It sounds like you're under the impression that https://10.43.37.24:443/apis/metrics.k8s.io/v1beta1 is a server on the internet. All 10.x.x.x addresses, like 192.168.x.x and 172.16.x.x-172.31.x.x, are reserved for private networks that you will not (or at least should not) find on the internet at large. See: https://tools.ietf.org/html/rfc1918

In this case, 10.43.x.x is used for Kubernetes services running within your cluster, while 10.42.x.x is used for Kubernetes pods. None of this is k3s specific; it's core to how Kubernetes works. If you're having an issue with your k3s cluster, please open a new issue - but perhaps try to avoid jumping to any conclusions about what k3s is or is not doing.
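Those ranges are the k3s defaults, equivalent to starting the server with --cluster-cidr=10.42.0.0/16 --service-cidr=10.43.0.0/16; a quick way to see them in use, assuming kubectl access:

```sh
kubectl get svc -A -o wide   # CLUSTER-IP column falls inside 10.43.0.0/16
kubectl get pods -A -o wide  # pod IPs fall inside 10.42.0.0/16
```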
@ViBiOh awesome, your solution works, I can finally get some pods and nodes output 😃 Thanks!!
Ping on the issue - the question remains: why does this not work out of the box when installing the latest k3s version? Any ideas? I think this deserves further investigation… Perhaps we could provide these flags to the default metrics-server? (See the sketch below for where its manifest lives.)
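For anyone who wants to experiment with that, k3s materializes its packaged components as plain manifests on the server node, and its deploy controller picks up edits made there; the directory is the standard location, though the exact filenames may vary by release:

```sh
ls /var/lib/rancher/k3s/server/manifests/metrics-server/
sudo vi /var/lib/rancher/k3s/server/manifests/metrics-server/metrics-server-deployment.yaml
```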