metrics-server: unable to fetch pod metrics for pod - x509: certificate signed by unknown authority

Current setup: I’m running Kubernetes 1.9, which was set up with Kops 1.9. I’m currently running heapster, and I wish to migrate to the new metrics-server in order to expose CPU/memory/storage metrics to be used by the HPA.

What version am I running?

kubectl version

Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.1", GitCommit:"d4ab47518836c750f9949b9e0d387f20fb92260b", GitTreeState:"clean", BuildDate:"2018-04-13T22:29:03Z", GoVersion:"go1.9.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T20:55:30Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}

What’s the issue? After installing metrics-server using the instructions at https://github.com/kubernetes-incubator/metrics-server (kubectl create -f deploy/1.8+/), I’m seeing the following errors in the metrics-server pod logs:

E0903 15:56:05.116049 1 reststorage.go:98] unable to fetch pod metrics for pod x/geodrive-server-1562483932-xb2kq: no metrics known for pod "x/geodrive-server-1562483932-xb2kq"
E0903 15:56:05.116075 1 reststorage.go:98] unable to fetch pod metrics for pod x/geodrive-server-1562483932-ljxn2: no metrics known for pod "x/geodrive-server-1562483932-ljxn2"
E0903 15:56:07.506634 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-172-31-104-76.us-west-2.compute.internal: unable to fetch metrics from Kubelet ip-172-31-104-76.us-west-2.compute.internal (ip-172-31-104-76.us-west-2.compute.internal): Get https://ip-172-31-104-76.us-west-2.compute.internal:10250/stats/summary/: x509: certificate signed by unknown authority, unable to fully scrape metrics from source kubelet_summary:ip-172-31-109-239.us-west-2.compute.internal: unable to fetch metrics from Kubelet ip-172-31-109-239.us-west-2.compute.internal (ip-172-31-109-239.us-west-2.compute.internal): Get https://ip-172-31-109-239.us-west-2.compute.internal:10250/stats/summary/: x509: certificate signed by unknown authority]

kubectl describe hpa

Name:                 x-nexpresso
Namespace:            x
Labels:               <none>
Annotations:          <none>
CreationTimestamp:    Mon, 03 Sep 2018 17:17:51 +0300
Reference:            Deployment/x-nexpresso
Metrics:              ( current / target )
  resource cpu on pods (as a percentage of request):  <unknown> / 80%
Min replicas:         1
Max replicas:         64
Conditions:
  Type           Status  Reason                   Message
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from heapster
Events:
  Type     Reason                        Age                From                       Message
  Warning  FailedGetResourceMetric       7m (x215 over 2h)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from heapster
  Warning  FailedComputeMetricsReplicas  2m (x225 over 2h)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from heapster

What do I expect? I was under the impression that since 1.9, --horizontal-pod-autoscaler-use-rest-clients=true is the default, so once metrics-server is installed it should take over and heapster would no longer be used. But now neither of them is working. What am I missing?
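
One way to confirm what the controller manager is actually running with is to inspect its pod spec. A rough sketch, assuming the control plane runs kube-controller-manager as a (static) pod visible in kube-system; the pod name below is a placeholder:

# List the controller-manager pod, then grep its args for the HPA flag;
# if the flag is absent from the output, the version default applies.
kubectl -n kube-system get pods -o name | grep kube-controller-manager
kubectl -n kube-system get pod <kube-controller-manager-pod> -o yaml | grep horizontal-pod-autoscaler-use-rest-clients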

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 64 (22 by maintainers)

Most upvoted comments

Closing per Kubernetes issue triage policy

GitHub is not the right place for support requests. If you’re looking for help, check Stack Overflow and the troubleshooting guide. You can also post your question on the Kubernetes Slack or the Discuss Kubernetes forum. If the matter is security related, please disclose it privately via https://kubernetes.io/security/.

@DirectXMan12 I’m using a default setup. My certs were generated, I assume, during my cluster setup with Kops (version 1.9), so whatever happened behind the scenes is completely transparent to me. I think the default behavior should be to trust the certs if not stated otherwise; launching metrics-server should be easier than this, in my opinion. At the moment, what would be the best approach to fix this? Where and what should I modify? Since we didn’t configure the kubelet explicitly, is this manageable directly with Kops?

Thanks,

I added --kubelet-insecure-tls and it fixed it. The thing I don’t understand is what’s the difference between insecureSkipTLSVerify: true in the metrics-apiservice.yaml file and the --kubelet-insecure-tls flag. My guess is that insecureSkipTLSVerify has to do with the APIService resource created on the API side (master node), while --kubelet-insecure-tls applies to the client in metrics-server that scrapes the kubelets. Anyway, for anyone out there struggling with this, just modify the metrics-server-deployment.yaml file with the following:

      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        # Skip verification of the kubelets' serving certificates
        args: ["--kubelet-insecure-tls"]
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
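
For reference, insecureSkipTLSVerify lives on the APIService object instead, and controls whether the aggregation layer (the main API server) verifies metrics-server’s own serving certificate; it has no effect on the metrics-server-to-kubelet connection. Roughly what deploy/1.8+/metrics-apiservice.yaml contains:

apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  # apiserver -> metrics-server TLS only, not metrics-server -> kubelet
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100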

Any clue about how to solve it in a Kubernetes cluster deployed with KOPS v1.10.0 and metrics-server v0.3.1?

I have the same issue:

E1126 15:44:34.653842 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-172-20-68-29.ec2.internal: unable to fetch metrics from Kubelet ip-172-20-68-29.ec2.internal (172.20.68.29): Get https://172.20.68.29:10250/stats/summary/: x509: cannot validate certificate for 172.20.68.29 because it doesn't contain any IP SANs
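
The “doesn’t contain any IP SANs” variant means metrics-server is connecting to the kubelet by IP while the kubelet’s certificate only names the host. Besides disabling verification outright, one common workaround is to make metrics-server prefer DNS names when contacting kubelets, via its --kubelet-preferred-address-types flag; a sketch of the deployment args (the ordering here is just an example):

        args:
          # Try host names first so the name in the kubelet's serving cert can match
          - --kubelet-preferred-address-types=Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP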

I am using --kubelet-insecure-tls and v0.3.1 but still see the same error:

E0228 09:42:47.908628       1 reststorage.go:144] unable to fetch pod metrics for pod default/iconverse-connector-3: no metrics known for pod
E0228 09:43:03.038755       1 reststorage.go:129] unable to fetch node metrics for node "ip-192-168-44-83.ap-southeast-1.compute.internal": no metrics known for node
E0228 09:43:12.543807       1 manager.go:102] unable to fully collect metrics: unable to extract connection information for node "ip-192-168-44-83.ap-southeast-1.compute.internal": node ip-192-168-44-83.ap-southeast-1.compute.internal is not ready
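
That last error is a different problem: metrics-server reports the node itself as not ready, so there is nothing to scrape. Worth checking the node status first:

# If the node shows NotReady, fix that before debugging metrics-server further
kubectl get nodes
kubectl describe node ip-192-168-44-83.ap-southeast-1.compute.internal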

It depends on how you’ve set up your kubelet. If you’re using the CSR API, I believe it should just work out of the box. Otherwise, when you generate your certs, you’ll have to sign them with the same CA used to sign the API server serving certs, etc. Alternatively, if you’re explicitly using a different CA on purpose, we should probably just have an option to trust that instead.
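
To see which CA actually signed a kubelet’s serving certificate, you can inspect what the kubelet presents on its serving port; a quick check (the node name is just an example taken from the logs above):

# Print the issuer and subject of the kubelet's serving certificate
openssl s_client -connect ip-172-31-104-76.us-west-2.compute.internal:10250 </dev/null 2>/dev/null \
  | openssl x509 -noout -issuer -subject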

Our kubelets are set up with kubeadm’s kubeadm join command, which, as far as I understand, uses the CSR API. We have not generated any certificates manually; everything was set up by kubeadm.
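
Note that the TLS bootstrap covers the kubelet’s client certificate; whether any serving-cert CSRs ever went through can be checked directly:

# List certificate signing requests; kubelet serving-cert requests (if any)
# stay Pending until approved with "kubectl certificate approve <name>"
kubectl get csr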

I’m seeing the same issue when running metrics-server 0.3.0 from the Helm chart as well. The Kubernetes/kubeadm version is v1.10.6. If there is anything I can contribute to dig deeper into this issue, I would love to provide the necessary data.

Did you enable authenticationTokenWebhook?

We had to do this on some clusters. In kops you do this by adding the following to the spec:

  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
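
If it helps, the usual way to roll that out (standard kops workflow; cluster name and state store flags omitted):

# Add the kubelet block above under spec, then apply and roll the nodes
kops edit cluster
kops update cluster --yes
kops rolling-update cluster --yes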

I have my Kubernetes setup done via KOPS, and to implement the HPA I started testing in my local minikube setup. Unfortunately, I couldn’t succeed; I have gone through many issues and figured out to use --kubelet-insecure-tls, but no luck.

kubectl logs -f metrics-server-6f866897cf-xjl9t -n kube-system

I0401 10:02:59.571797 1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
[restful] 2019/04/01 10:02:59 log.go:33: [restful/swagger] listing is available at https://:443/swaggerapi
[restful] 2019/04/01 10:02:59 log.go:33: [restful/swagger] https://:443/swaggerui/ is mapped to folder /swagger-ui/
I0401 10:02:59.914600 1 serve.go:96] Serving securely on [::]:443
E0401 10:03:59.917615 1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:minikube: unable to fetch metrics from Kubelet minikube (10.0.2.15): request failed - "403 Forbidden", response: "Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=stats)"
E0401 10:04:59.898478 1 manager.go:102] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:minikube: unable to fetch metrics from Kubelet minikube (10.0.2.15): request failed - "403 Forbidden", response: "Forbidden (user=system:anonymous, verb=get, resource=nodes, subresource=stats)"
E0401 10:05:04.293202 1 reststorage.go:144] unable to fetch pod metrics for pod default/php-apache-84cc7f889b-g65p6: no metrics known for pod
E0401 10:05:19.333587 1 reststorage.go:144] unable to fetch pod metrics for pod default/php-

Now I also tried doing the same in my cluster, which is set up via KOPS, and I still see the same issue 😦

Let me know if I’m going wrong somewhere, thanks!

I also got the same issue and fixed it by creating another ClusterRole for the pods and nodes resources with the get, list, and watch verbs, and a ClusterRoleBinding for the system:anonymous user:

cat << EOF > view-metrics.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: view-metrics
rules:
- apiGroups:
    - metrics.k8s.io
  resources:
    - pods
    - nodes
  verbs:
    - get
    - list
    - watch
EOF
kubectl apply -f view-metrics.yaml
cat <<EOF > view-metrics-cluster-role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view-metrics
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: system:anonymous
EOF
kubectl apply -f view-metrics-cluster-role-binding.yaml
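
Once that is in place, you can check whether the metrics API is actually being served:

# The APIService should report Available=True, and the raw endpoint should return JSON
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes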

@richstokes @luarx did you guys install metrics-server as follows:

At the moment I’m using:

  • --kubelet-insecure-tls flag
  • k8s.gcr.io/metrics-server-amd64:v0.3.1 version

And it is working, but I want to use it with TLS verification enabled.

@DirectXMan12 Can you maybe show the steps for signing and configuring the kubelet certificates for minikube?

Update: I tried the following to get a serving certificate for the kubelet:

minikube delete ; minikube start --vm-driver kvm2 \
  --bootstrapper=kubeadm  \
  --kubernetes-version=v1.12.2 \
  --extra-config=kubelet.rotate-server-certificates=true
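
The CSR name in the approve command below came from this particular run; list pending requests to find yours:

# Find the pending serving-cert CSR created by the kubelet
kubectl get csr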

kubectl certificate approve csr-t6szc

kubectl apply -f ./deploy/1.8+

There is no certificate error anymore, but something else comes up:

E1110 23:49:47.914499       1 manager.go:102] unable to fully collect metrics: 
  unable to fully scrape metrics from source kubelet_summary:minikube: 
  unable to fetch metrics from Kubelet minikube (minikube): 
  request failed - "403 Forbidden", response: 
  "Forbidden (user=system:anonymous, 
              verb=get, 
              resource=nodes, 
              subresource=stats)"

Update 2: That issue was already discussed in https://github.com/kubernetes-incubator/metrics-server/issues/95. So, here is the final solution: \o/

minikube delete ; minikube start --vm-driver kvm2 --bootstrapper=kubeadm --kubernetes-version=v1.12.2 \
  --extra-config=kubelet.authentication-token-webhook=true \
  --extra-config=kubelet.rotate-server-certificates=true

kubectl certificate approve csr-r4nd0m

kubectl apply -f ./deploy/1.8+
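
Once metrics-server has had a minute or two to scrape, these should return data instead of errors:

# Both are served by the metrics.k8s.io API that metrics-server provides
kubectl top nodes
kubectl top pods --all-namespaces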

@DirectXMan12 I believe this is the answer:

TLS bootstrapping only sets up api client certificates for the kubelet currently. If you want a serving cert for the kubelet that is signed by the apiserver’s --kubelet-certificate-authority you must provide it. Otherwise the kubelet generates a self-signed serving cert.
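
In kubelet config-file terms, the rotate-server-certificates flag used above corresponds to the serverTLSBootstrap field; a minimal sketch of the KubeletConfiguration fragment:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Request a serving cert from the cluster CA via the CSR API instead of
# self-signing one; the resulting CSRs still need to be approved
serverTLSBootstrap: true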

@PierluigiLenociAkelius port 10255 is disabled in EKS, so no, it is not working.

Adding --extra-config=kubelet.authentication-token-webhook=true to my minikube start command worked.

@olemarkus I can confirm that your advice helps for a kops cluster (running k8s 1.12.7).

For getting metrics-server in minikube, there is an addon for it! You can simply run: minikube addons enable metrics-server. However, metrics-server was only able to collect metrics once I added the ClusterRoles & ClusterRoleBindings mentioned at https://github.com/kubernetes-incubator/metrics-server/issues/133#issuecomment-478525227

For those on Kops, I have a PR with what you need to get it working. See https://github.com/kubernetes/kops/pull/6201.

Hi @DirectXMan12, I’m a bit puzzled by the conclusions of this. Do you mind summarizing the next steps? Many thanks.