prometheus-operator: [kube-prometheus] cAdvisor metrics are unavailable with Kubeadm default deploy at v1.7.3+

What did you do?

Successfully installed kube-prometheus in a Kubeadm cluster v1.7.5.

What did you expect to see?

The cAdvisor endpoints in the Prometheus kubelet job working correctly.

What did you see instead? Under which circumstances?

Several metrics are gathered correctly, but not the cAdvisor ones in the kubelet job.

Environment

  • Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.4", GitCommit:"793658f2d7ca7f064d2bdf606519f9fe1229c381", GitTreeState:"clean", BuildDate:"2017-08-17T08:48:23Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T08:56:23Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:

Kubeadm at v1.7.5.

  • Manifests:

kube-prometheus/manifests/prometheus/prometheus-k8s-service-monitor-kubelet.yaml

[...]
spec:
  jobLabel: k8s-app
  endpoints:
  - port: http-metrics
    interval: 30s
  - port: cadvisor
    interval: 30s
    honorLabels: true

Basically, since the cAdvisor metrics have been moved to a dedicated endpoint, that configuration no longer works. The official Prometheus Kubernetes configuration example has already been updated with the change.

A similar configuration should be applied to the prometheus-k8s-service-monitor-kubelet.yaml manifest too, e.g.

[...]
spec:
  jobLabel: k8s-app
  endpoints:
  - port: http-metrics
    interval: 30s
  # This is for cAdvisor in K8s 1.7.3+
  - path: /metrics/cadvisor
    port: http-metrics
    interval: 30s
    honorLabels: true

It worked in my environment, but of course it might not be backward-compatible. Perhaps changing it in the kube-prometheus Helm chart (I'm assuming it works the same way, but I haven't tested it yet) and adding a configuration property to the chart to switch the behavior might be a better option.
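As a sketch, such a switch could be a simple boolean in the chart's values; the property name below is hypothetical, not an existing chart option:

```yaml
# Hypothetical values.yaml property (name is an assumption, not an actual chart option):
kubelet:
  # true  = scrape the dedicated "cadvisor" port (pre-1.7.3 behavior)
  # false = scrape /metrics/cadvisor on the http-metrics port (1.7.3+)
  cadvisorAsSeparatePort: false
```

The chart template would then render either the old `port: cadvisor` endpoint or the new `path: /metrics/cadvisor` endpoint in the kubelet ServiceMonitor.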

However, I couldn't manage to find a way to express the cAdvisor metrics endpoint through the API server proxy, as done in the official Prometheus example (https://kubernetes.default.svc:443/api/v1/nodes/<NODE_NAME>/proxy/metrics/cadvisor instead of http://<NODE_IP>:10255/metrics/cadvisor, which is what this configuration change produces). Since it would also be useful to access the metrics endpoint of the kube-scheduler pod (which is only reachable through the master proxy in my setup), I was wondering whether it is possible to build such an endpoint.
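For reference, the official Prometheus example achieves the proxied URL with relabeling in a plain scrape config (not a ServiceMonitor): it rewrites the target address to the API server and the metrics path to the per-node proxy URL. Roughly:

```yaml
# Sketch based on the official prometheus-kubernetes configuration example;
# a ServiceMonitor equivalent would need matching relabeling support in the operator.
- job_name: 'kubernetes-cadvisor'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  # Send every scrape through the API server proxy...
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  # ...rewriting the path to /api/v1/nodes/<node>/proxy/metrics/cadvisor
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
```

The same `__address__`/`__metrics_path__` rewrite pattern would in principle also cover the kube-scheduler case, by proxying through the API server instead of scraping the pod directly.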

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 5
  • Comments: 15 (5 by maintainers)

Commits related to this issue

Most upvoted comments

@ghostflame To use the 10250/metrics/cadvisor endpoint, it should be enough to edit the kube-prometheus/manifests/prometheus/prometheus-k8s-service-monitor-kubelet.yaml manifest the way I reported it in the issue description:

change

- port: cadvisor
  interval: 30s
  honorLabels: true

to

- path: /metrics/cadvisor
  port: http-metrics
  interval: 30s
  honorLabels: true

and redeploy.

Hope this helps…

To resolve the issue in my case above, just run the following on all nodes, per https://github.com/coreos/prometheus-operator/blob/master/contrib/kube-prometheus/docs/kube-prometheus-on-kubeadm.md:

sed -e "s/--authorization-mode=Webhook/--authentication-token-webhook=true --authorization-mode=Webhook/" -i /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload; systemctl restart kubelet

@lorenzo-biava I tried the ServiceMonitor configs you mentioned, either

- path: /metrics/cadvisor
  port: http-metrics
  interval: 30s
  honorLabels: true

or

spec:
  jobLabel: k8s-app
  endpoints:
  - port: http-metrics
    interval: 30s
  # This is for cAdvisor in K8s 1.7.3+
  - path: /metrics/cadvisor
    port: http-metrics
    interval: 30s
    honorLabels: true

but I did not get it to work; Alertmanager still reports K8SKubeletDown. Could you give some hints?

My setup is k8s v1.9.2 and master branch of prometheus-operator as of today.

[root@master1 kube-prometheus]# k describe svc kubelet
Name:              kubelet
Namespace:         kube-system
Labels:            k8s-app=kubelet
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP:                None
Port:              https-metrics  10250/TCP
TargetPort:        10250/TCP
Endpoints:         192.168.50.57:10250,192.168.50.58:10250,192.168.50.59:10250 + 8 more...
Session Affinity:  None
Events:            <none>

[root@master1 kube-prometheus]# k get svc kubelet -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: 2018-02-09T10:09:39Z
  labels:
    k8s-app: kubelet
  name: kubelet
  namespace: kube-system
  resourceVersion: "446305"
  selfLink: /api/v1/namespaces/kube-system/services/kubelet
  uid: 57005c64-0d81-11e8-91f9-005056a3367f
spec:
  clusterIP: None
  ports:
  - name: https-metrics
    port: 10250
    protocol: TCP
    targetPort: 10250
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

[root@master1 kube-prometheus]# cat ./manifests/prometheus/prometheus-k8s-service-monitor-kubelet.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet
  labels:
    k8s-app: kubelet
spec:
  jobLabel: k8s-app
  endpoints:
  #- port: https-metrics
  #  scheme: https
  #  interval: 30s
  #  tlsConfig:
  #    insecureSkipVerify: true
  #  bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  - port: https-metrics
    scheme: https
    path: /metrics/cadvisor
    interval: 30s
    honorLabels: true
    tlsConfig:
      insecureSkipVerify: true
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  selector:
    matchLabels:
      k8s-app: kubelet
  namespaceSelector:
    matchNames:
    - kube-system




In my case, I just kept port 4194 listening, and it worked.

--cadvisor-port=0 disables cAdvisor from listening on 0.0.0.0:4194 by default. cAdvisor will still run inside the kubelet, and its API can be accessed at https://{node-ip}:10250/stats/. If you want to enable cAdvisor to listen on a wide-open port, run:

sed -e "/cadvisor-port=0/d" -i /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
systemctl daemon-reload
systemctl restart kubelet

We’ll probably move to the /metrics/cadvisor endpoint when we upgrade everything to be target 1.8.0+.

Actually, CIS requires cadvisor-port=0, so I recommend everyone use the /metrics/cadvisor path. I'll leave this open, as we still need to fix it for kube-prometheus.