metrics-server: unable to fetch pod metrics for pod - x509: certificate signed by unknown authority
Current setup
I’m running Kubernetes 1.9, set up with Kops 1.9. I’m currently running Heapster and wish to migrate to the new metrics-server in order to expose CPU/memory/storage metrics for use by the HPA.
What version am I running?
```
kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.1", GitCommit:"d4ab47518836c750f9949b9e0d387f20fb92260b", GitTreeState:"clean", BuildDate:"2018-04-13T22:29:03Z", GoVersion:"go1.9.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T20:55:30Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
```
What’s the issue?
After installing metrics-server using the instructions at https://github.com/kubernetes-incubator/metrics-server (`kubectl create -f deploy/1.8+/`), I’m seeing the following errors in the metrics-server pod logs:
```
E0903 15:56:05.116049 1 reststorage.go:98] unable to fetch pod metrics for pod x/geodrive-server-1562483932-xb2kq: no metrics known for pod "x/geodrive-server-1562483932-xb2kq"
E0903 15:56:05.116075 1 reststorage.go:98] unable to fetch pod metrics for pod x/geodrive-server-1562483932-ljxn2: no metrics known for pod "x/geodrive-server-1562483932-ljxn2"
E0903 15:56:07.506634 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-172-31-104-76.us-west-2.compute.internal: unable to fetch metrics from Kubelet ip-172-31-104-76.us-west-2.compute.internal (ip-172-31-104-76.us-west-2.compute.internal): Get https://ip-172-31-104-76.us-west-2.compute.internal:10250/stats/summary/: x509: certificate signed by unknown authority, unable to fully scrape metrics from source kubelet_summary:ip-172-31-109-239.us-west-2.compute.internal: unable to fetch metrics from Kubelet ip-172-31-109-239.us-west-2.compute.internal (ip-172-31-109-239.us-west-2.compute.internal): Get https://ip-172-31-109-239.us-west-2.compute.internal:10250/stats/summary/: x509: certificate signed by unknown authority]
```
```
kubectl describe hpa
Name:               x-nexpresso
Namespace:          x
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Mon, 03 Sep 2018 17:17:51 +0300
Reference:          Deployment/x-nexpresso
Metrics:            ( current / target )
  resource cpu on pods (as a percentage of request):  <unknown> / 80%
Min replicas:       1
Max replicas:       64
Conditions:
  Type           Status  Reason                   Message
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from heapster
Events:
  Type     Reason                        Age                From                       Message
  Warning  FailedGetResourceMetric       7m (x215 over 2h)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from heapster
  Warning  FailedComputeMetricsReplicas  2m (x225 over 2h)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from heapster
```
What do I expect?
I was under the impression that since 1.9, `--horizontal-pod-autoscaler-use-rest-clients=true` is the default, so once metrics-server is installed it should take over and Heapster would no longer be used. But now neither of them is working. What am I missing?
About this issue
- State: closed
- Created 6 years ago
- Comments: 64 (22 by maintainers)
Closing per Kubernetes issue triage policy
GitHub is not the right place for support requests. If you’re looking for help, check Stack Overflow and the troubleshooting guide. You can also post your question on the Kubernetes Slack or the Discuss Kubernetes forum. If the matter is security related, please disclose it privately via https://kubernetes.io/security/.
@DirectXMan12 I’m using a default setup. My certs were generated, I assume, during cluster setup with Kops (version 1.9), so whatever happened behind the scenes is completely transparent to me. I think the default behavior should be to trust those certs unless stated otherwise; launching metrics-server should be easier than this, in my opinion. For now, what would be the best approach to fix this? Where and what should I modify? Since we didn’t configure the kubelet explicitly, is this manageable directly with Kops?
Thanks, I added `--kubelet-insecure-tls` and it fixed it. The thing I don’t understand is the difference between `insecureSkipTLSVerify: true` in the `metrics-apiservice.yaml` file and the `--kubelet-insecure-tls` flag. My guess is that `insecureSkipTLSVerify` applies to the APIService resource created on the API (master node) side, while `--kubelet-insecure-tls` applies to metrics-server acting as a client of the kubelets? Anyway, for anyone out there struggling with this, just modify the `metrics-server-deployment.yaml` file with the following:

Any clue about how to solve it in a Kubernetes cluster deployed with Kops v1.10.0 and metrics-server v0.3.1?
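The deployment snippet referenced a few comments up did not survive the scrape. As a hedged sketch (based on the stock manifest from the metrics-server repo, not on the commenter’s exact file), the relevant container arguments in `metrics-server-deployment.yaml` would look something like this:

```yaml
# metrics-server-deployment.yaml (sketch; only the relevant container
# fields are shown, assuming the stock deploy/1.8+ manifest)
containers:
  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.1
    args:
      # Skip verification of the kubelet serving certificate.
      # Insecure; a workaround, not a fix for the underlying PKI issue.
      - --kubelet-insecure-tls
      # Scrape kubelets by IP first, useful when node names do not
      # resolve from inside the cluster
      - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS
```

Note that `--kubelet-insecure-tls` disables certificate validation entirely, which is why several maintainers in threads like this one discourage it outside of test clusters.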
I have the same issue:

```
E1126 15:44:34.653842 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-172-20-68-29.ec2.internal: unable to fetch metrics from Kubelet ip-172-20-68-29.ec2.internal (172.20.68.29): Get https://172.20.68.29:10250/stats/summary/: x509: cannot validate certificate for 172.20.68.29 because it doesn't contain any IP SANs
```
I am using `--kubelet-insecure-tls` and v0.3.1 but still see the same error. Our kubelets are set up with kubeadm’s `kubeadm join` command, which - as far as I understand - uses the CSR API. We have not generated any certificates manually; everything was set up by kubeadm.

I’m seeing the same issue when running metrics-server 0.3.0 from the Helm chart anyway. The Kubernetes/kubeadm version is v1.10.6. If there is anything I can contribute to dig deeper into this issue, I would love to provide the necessary data.

@serathius why closed?
Did you enable `authenticationTokenWebhook`? We had to do this on some clusters. In kops you do this by adding the following to the spec:
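The spec snippet itself was lost from the thread. As a sketch of the commonly cited kops kubelet settings for this (verify the field names against the kops documentation for your version):

```yaml
# Cluster spec fragment (edit via `kops edit cluster`) -- hedged sketch
spec:
  kubelet:
    # Have the kubelet authenticate API requests (such as metrics-server's
    # /stats/summary scrapes) via the TokenReview webhook
    authenticationTokenWebhook: true
    # Authorize kubelet API requests via SubjectAccessReview
    authorizationMode: Webhook
```

After changing the spec, a rolling update of the nodes is needed for the kubelet flags to take effect.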
I also got the same issue and fixed it by creating another ClusterRole for the pods and nodes resources with the get, list, and watch verbs, plus a ClusterRoleBinding to the system:anonymous user.
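The manifests were not quoted in the comment; a minimal sketch of what such a ClusterRole and ClusterRoleBinding could look like follows (names are illustrative, and binding to `system:anonymous` grants unauthenticated read access, so treat this as a test-cluster workaround only):

```yaml
# Hedged sketch: read access to pod/node resources for anonymous requests
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader          # illustrative name
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "nodes/stats"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader-anonymous  # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
  # system:anonymous represents unauthenticated requests --
  # this is why the binding is insecure outside of testing
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: system:anonymous
```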
At the moment I’m using:

- the `--kubelet-insecure-tls` flag
- the `k8s.gcr.io/metrics-server-amd64:v0.3.1` version

And it is working, but I want to use it with TLS enabled.
@DirectXMan12 Can you maybe show the steps for signing and configuring the kubelet certificates for minikube?

Update: I tried the following to get a serving certificate for the kubelet. There is no certificate error anymore, but something else:

Update 2: That issue was already discussed in https://github.com/kubernetes-incubator/metrics-server/issues/95. So, here is the final solution: \o/
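The exact steps this commenter used were lost from the thread. For reference, the generally documented way to get kubelet serving certificates signed by the cluster CA (instead of the self-signed defaults that trigger the x509 errors above) is kubelet serving-certificate bootstrapping; a hedged sketch:

```yaml
# KubeletConfiguration fragment (e.g. /var/lib/kubelet/config.yaml) -- sketch
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Ask the cluster CA for a serving certificate via the CSR API
# instead of generating a self-signed one
serverTLSBootstrap: true
```

Kubelet serving CSRs are not auto-approved, so after restarting the kubelet you would still need to approve them manually with `kubectl get csr` and `kubectl certificate approve <csr-name>`.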
@DirectXMan12 I believe this is the answer:
@PierluigiLenociAkelius port 10255 is disabled in EKS, so no, it is not working.
Adding `--extra-config=kubelet.authentication-token-webhook=true` to my `minikube start` command worked.

@olemarkus I can confirm that your advice helps for a kops cluster (running 1.12.7 k8s).
For getting `metrics-server` in minikube, minikube supports an addon for it! You can simply run the following command: `minikube addons enable metrics-server`

The metrics-server was only able to collect metrics once I added the ClusterRoles and ClusterRoleBindings mentioned at https://github.com/kubernetes-incubator/metrics-server/issues/133#issuecomment-478525227

For those on Kops, I have a PR with what you need to get it working. See https://github.com/kubernetes/kops/pull/6201.
Hi @DirectXMan12, I’m a bit puzzled by the conclusions here. Do you mind summarizing the next steps? Many thanks.