prometheus-operator: All kubelet targets down - 401 Unauthorized?

What did you do?
./contrib/kube-prometheus/hack/cluster-monitoring/deploy

What did you expect to see?
Everything working fine.

What did you see instead? Under which circumstances?
Everything is fine except kubelet: on the Prometheus targets page, all kubelet targets are DOWN with the error "server returned HTTP status 401 Unauthorized".

Environment
GKE / Ubuntu 17.10

Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.6", GitCommit:"6260bb08c46c31eea6cb538b34a9ceb3e406689c", GitTreeState:"clean", BuildDate:"2017-12-21T06:34:11Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.6-gke.0", GitCommit:"ee9a97661f14ee0b1ca31d6edd30480c89347c79", GitTreeState:"clean", BuildDate:"2018-01-05T03:36:42Z", GoVersion:"go1.8.3b4", Compiler:"gc", Platform:"linux/amd64"}

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 3
  • Comments: 19 (5 by maintainers)

Most upvoted comments

For future Googlers: changing the kubelet ServiceMonitor to look for the HTTP endpoints on port 10255 worked for me.

(prometheus-k8s-service-monitor-kubelet.yaml for the hack/ example)

port: https-metrics changes to port: http-metrics, as in the manifest below:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet
  labels:
    k8s-app: kubelet
spec:
  jobLabel: k8s-app
  endpoints:
  - port: http-metrics
    scheme: http
    interval: 30s
    tlsConfig:
      insecureSkipVerify: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  - port: http-metrics
    scheme: http
    path: /metrics/cadvisor
    interval: 30s
    honorLabels: true
    tlsConfig:
      insecureSkipVerify: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  selector:
    matchLabels:
      k8s-app: kubelet
  namespaceSelector:
    matchNames:
    - kube-system
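
If you want to sanity-check that the read-only port is actually reachable before switching the ServiceMonitor over, a quick probe from a throwaway pod could look roughly like this (the busybox image and the <node-ip> placeholder are mine, not from the thread; substitute a node's InternalIP):

# Hypothetical check: confirm a node's kubelet answers on the read-only port 10255.
NODE_IP=<node-ip>
kubectl run kubelet-probe --rm -it --restart=Never --image=busybox -- \
  wget -qO- http://${NODE_IP}:10255/metrics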

In the latest helm chart, set kubelet.serviceMonitor.https=false:

kubelet:
  enabled: true
  serviceMonitor:
    https: false

This forces the kubelet exporter to scrape the http-metrics endpoint, which should solve the problem.

EDIT: s/http/https/
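
For reference, the same override can be passed on the command line; a rough sketch, assuming a stable/prometheus-operator install with the release name prometheus-operator (adjust both to your setup):

helm upgrade --install prometheus-operator stable/prometheus-operator \
  --namespace monitoring \
  --set kubelet.serviceMonitor.https=false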

I had the same problem on GKE. I solved it by updating the kube-prometheus-exporter-kubelets ServiceMonitor resource definition from HTTPS to HTTP, as @vsinha suggested, with this one-liner (update the namespace accordingly):

$ kubectl -n monitoring get servicemonitor kube-prometheus-exporter-kubelets -o yaml | sed 's/https/http/' | kubectl replace -f -
servicemonitor.monitoring.coreos.com "kube-prometheus-exporter-kubelets" replaced

After this, the Prometheus server was able to scrape the kubelet target as expected.

Running on GKE, I solved a similar issue by adding --set exporter-kubelets.https=false to my helm install command. See the comment in helm/exporter-kubelets/values.yaml:

# Set to false for GKE
https: true
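
Put together, and assuming the coreos/kube-prometheus chart with a Helm 2 style install (the release name and namespace below are placeholders of mine), the command might look like:

helm install coreos/kube-prometheus --name kube-prometheus \
  --namespace monitoring \
  --set exporter-kubelets.https=false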

In case anyone comes across this issue again with AKS and k8s 1.18.4: with the chart stable/prometheus-operator at version 9.2.0, removing the suggested change

- kubelet:
-   serviceMonitor:
-     https: false

fixes the issue.

This also applies to AKS; I switched the ServiceMonitor to HTTP as a workaround.

Please note the hack/ directory you're executing this from. The kube-prometheus stack expects you to have a properly secured setup, one that allows authenticating with ServiceAccount tokens and authorizing against RBAC roles. What this concretely means for a minikube cluster, for example, is documented here: https://github.com/coreos/prometheus-operator/blob/master/contrib/kube-prometheus/hack/cluster-monitoring/minikube-deploy#L3-L12

Basically, your cluster needs RBAC enabled and these two kubelet settings turned on (a sketch of the equivalent kubelet config file follows the list):

  • --authentication-token-webhook=true
  • --extra-config=kubelet.authorization-mode=Webhook
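
The second flag above is minikube's --extra-config syntax; on clusters where you manage the kubelet configuration file yourself, the equivalent settings would look roughly like this (a sketch, not taken from the linked script):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Let the kubelet validate bearer tokens against the API server
# (equivalent to --authentication-token-webhook=true).
authentication:
  webhook:
    enabled: true
# Authorize requests to the kubelet's secure port via SubjectAccessReview
# (equivalent to authorization-mode=Webhook).
authorization:
  mode: Webhook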

Feel free to ask any further questions, but this is not an issue with the Prometheus operator or kube-prometheus stack, so I’m closing this here.

Not sure if this is the right place to leave this, but I'm adding it here as I hit a similar issue and this was the first result. I was using AWS/EKS, but I think this has more to do with k8s v1.11: it seems the read-only port is now disabled by default. I had to re-enable it on all my nodes.

I am using a launch configuration and ASGs, so that would look something like this:

#!/bin/bash -xe

# Bootstrap and join the cluster
/etc/eks/bootstrap.sh --b64-cluster-ca  'CERT_HERE' --apiserver-endpoint 'ENDPOINT_HERE' --kubelet-extra-args '--read-only-port=10255' 'CLUSTERNAME_HERE'

After this change everything worked as expected.

Another thing that made the issue more obvious was that in Prometheus, under /targets, you could see the connection being refused, along with data missing in Grafana.

It took way too long for me to find this, so hopefully it helps someone else out.

I just confirmed that on AKS, --authentication-token-webhook is set to false, which is the default for the kubelet.

https://github.com/Azure/AKS/issues/1087
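
One way to confirm what the kubelet on a given node is actually running with is to read its configz endpoint through the API server and inspect the authentication/authorization sections; a sketch, assuming jq is available and just picking the first node:

# Illustrative check: dump the live kubelet config for one node.
NODE=$(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
kubectl get --raw "/api/v1/nodes/${NODE}/proxy/configz" \
  | jq '.kubeletconfig | {authentication, authorization}'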

@gb-ckedzierski, since you're modifying the kubelet config anyway, you should leave the --read-only-port disabled and use the secure port with these flags (see the RBAC sketch after the flags):

--authentication-token-webhook=true
--extra-config=kubelet.authorization-mode=Webhook
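
One caveat worth noting: with Webhook authorization the kubelet asks the API server whether the scraping ServiceAccount may access its endpoints, so that account needs RBAC rules along these lines, bound to Prometheus's ServiceAccount with a ClusterRoleBinding (kube-prometheus ships an equivalent role; the name below is illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # Illustrative name; kube-prometheus bundles its own role for this.
  name: prometheus-kubelet-scrape
rules:
# Covers the kubelet's /metrics and /metrics/cadvisor on the secure port.
- apiGroups: [""]
  resources:
  - nodes/metrics
  - nodes/proxy
  verbs: ["get"]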