prometheus-operator: Kube-prometheus etcd ServiceMonitor not working
What did you do? Deployed kube-prometheus with the kube-etcd exporter (Service and ServiceMonitor).
What did you expect to see? Green alerts in the Prometheus dashboard for kube-etcd-related metrics.
What did you see instead? Under which circumstances?
InsufficientMembers is not passing.
up{job="kube-etcd"} query on Prometheus returns 0 for all kube-etcd servers.
- Prometheus Operator version: quay.io/coreos/prometheus-operator@sha256:88cd66e273db8f96cfcce2eec03c04b04f0821f3f8d440396af2b5510667472d
- Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.5", GitCommit:"32ac1c9073b132b8ba18aa830f46b77dcceb0723", GitTreeState:"clean", BuildDate:"2018-06-21T11:46:00Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.6", GitCommit:"a21fdbd78dde8f5447f5f6c331f7eb6f80bd684e", GitTreeState:"clean", BuildDate:"2018-07-26T10:04:08Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes cluster kind: Kops cluster, Version 1.10.0 (git-8b52ea6d1)
- Manifests:
Service Monitor (created by kube-prometheus Chart)
Name:         kube-prometheus-exporter-kube-etcd
Namespace:    monitoring
Labels:       app=exporter-kube-etcd
              chart=exporter-kube-etcd-0.1.15
              component=kube-etcd
              heritage=Tiller
              prometheus=kube-prometheus
              release=kube-prometheus
Annotations:  <none>
API Version:  monitoring.coreos.com/v1
Kind:         ServiceMonitor
Metadata:
  Cluster Name:        
  Creation Timestamp:  2018-12-01T18:19:18Z
  Generation:          1
  Resource Version:    25504678
  Self Link:           /apis/monitoring.coreos.com/v1/namespaces/monitoring/servicemonitors/kube-prometheus-exporter-kube-etcd
  UID:                 9e4eaaed-f595-11e8-b63f-027656b86196
Spec:
  Endpoints:
    Bearer Token File:  /var/run/secrets/kubernetes.io/serviceaccount/token
    Interval:           15s
    Port:               http-metrics
  Job Label:            component
  Namespace Selector:
    Match Names:
      kube-system
  Selector:
    Match Labels:
      App:        exporter-kube-etcd
      Component:  kube-etcd
Events:           <none>
Service (created by kube-prometheus Chart)
Name:              kube-prometheus-exporter-kube-etcd
Namespace:         kube-system
Labels:            app=exporter-kube-etcd
                   chart=exporter-kube-etcd-0.1.15
                   component=kube-etcd
                   heritage=Tiller
                   release=kube-prometheus
Annotations:       <none>
Selector:          k8s-app=etcd-server
Type:              ClusterIP
IP:                None
Port:              http-metrics  4001/TCP
TargetPort:        4001/TCP
Endpoints:         <redacted>:4001,<redacted>:4001,<redacted>:4001
Session Affinity:  None
Events:            <none>
Etcd server pod labels (Created by Kops)
Labels:       k8s-app=etcd-server
- Prometheus Operator Logs: Not sure if these are relevant. I restarted the Pod to get logs other than “sync alertmanager” and “sync prometheus”, which are the only log messages after a few hours of runtime.
level=info ts=2018-12-09T06:41:48.028636972Z caller=operator.go:292 component=prometheusoperator msg="connection established" cluster-version=v1.10.6
level=info ts=2018-12-09T06:41:48.031017419Z caller=operator.go:172 component=alertmanageroperator msg="connection established" cluster-version=v1.10.6
level=info ts=2018-12-09T06:41:48.16309026Z caller=operator.go:560 component=alertmanageroperator msg="CRD updated" crd=Alertmanager
level=info ts=2018-12-09T06:41:48.165255387Z caller=operator.go:1132 component=prometheusoperator msg="CRD updated" crd=Prometheus
level=info ts=2018-12-09T06:41:48.174046849Z caller=operator.go:1132 component=prometheusoperator msg="CRD updated" crd=ServiceMonitor
level=info ts=2018-12-09T06:41:48.183440145Z caller=operator.go:1132 component=prometheusoperator msg="CRD updated" crd=PrometheusRule
level=info ts=2018-12-09T06:41:51.170636266Z caller=operator.go:186 component=alertmanageroperator msg="CRD API endpoints ready"
level=info ts=2018-12-09T06:41:51.173120069Z caller=operator.go:396 component=alertmanageroperator msg="sync alertmanager" key=monitoring/kube-prometheus
E1209 06:41:51.204133       1 operator.go:272] Sync "monitoring/kube-prometheus" failed: creating statefulset failed: statefulsets.apps "alertmanager-kube-prometheus" already exists
level=info ts=2018-12-09T06:41:51.204934663Z caller=operator.go:396 component=alertmanageroperator msg="sync alertmanager" key=monitoring/kube-prometheus
level=info ts=2018-12-09T06:41:51.220686111Z caller=operator.go:396 component=alertmanageroperator msg="sync alertmanager" key=monitoring/kube-prometheus
level=info ts=2018-12-09T06:41:57.208163155Z caller=operator.go:306 component=prometheusoperator msg="CRD API endpoints ready"
level=info ts=2018-12-09T06:41:57.216490069Z caller=operator.go:731 component=prometheusoperator msg="sync prometheus" key=monitoring/kube-prometheus
level=info ts=2018-12-09T06:41:57.457037411Z caller=operator.go:731 component=prometheusoperator msg="sync prometheus" key=monitoring/kube-prometheus
level=info ts=2018-12-09T06:41:57.682665364Z caller=operator.go:731 component=prometheusoperator msg="sync prometheus" key=monitoring/kube-prometheus
level=info ts=2018-12-09T06:41:57.73801164Z caller=operator.go:731 component=prometheusoperator msg="sync prometheus" key=monitoring/kube-prometheus
About this issue
- State: closed
- Created 6 years ago
- Reactions: 3
- Comments: 27 (2 by maintainers)
Ok, I managed to get etcd scraping working with etcd-manager and kops 1.12.
First, create the Secret needed for the operator to start properly.
Once you have the Secret, install the operator with a values.yaml snippet that configures etcd (only the etcd-relevant part is needed).
Last but not least, you have to update the service prometheus-prometheus-oper-kube-etcd, which the operator creates to monitor etcd, removing the selector component: etcd from its spec.
This ends up with etcd scraping working for me.
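The Secret and the edited Service from the steps above did not survive in this thread, so what follows is only a sketch of what they might look like, not the commenter's exact manifests. The Secret name etcd-certs comes from a later comment. The monitoring namespace, the key names (ca.crt, client.crt, client.key) and the placeholder certificate bodies are assumptions. The Service part assumes the k8s-app=etcd-server pod label and port 4001 shown in the issue description.

```yaml
# Sketch only: client certificates Prometheus could use to scrape etcd over TLS.
apiVersion: v1
kind: Secret
metadata:
  name: etcd-certs          # name taken from a later comment in this thread
  namespace: monitoring     # assumption: the namespace where Prometheus runs
type: Opaque
stringData:
  # Hypothetical key names; replace the placeholders with real PEM content.
  ca.crt: "<etcd CA certificate (PEM)>"
  client.crt: "<etcd client certificate (PEM)>"
  client.key: "<etcd client private key (PEM)>"
---
# Sketch only: the chart-created Service after removing `component: etcd`
# from its selector, leaving just the label that kops puts on the etcd pods.
# Chart-managed metadata labels are omitted here and should be left unchanged.
apiVersion: v1
kind: Service
metadata:
  name: prometheus-prometheus-oper-kube-etcd
  namespace: kube-system
spec:
  type: ClusterIP
  clusterIP: None           # headless, like the Service shown in the issue
  ports:
    - name: http-metrics
      port: 4001
      targetPort: 4001
      protocol: TCP
  selector:
    k8s-app: etcd-server
```

With only k8s-app: etcd-server in the selector, the Service should pick up the kops etcd pods as endpoints, which is what the ServiceMonitor needs in order to find scrape targets.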
Thank you @irizzant, worked for me too on kops 1.12 and with the prometheus-operator-5.12.4 helm chart. I would like to extend it a bit. We also need to add the generated etcd-certs secret to the prometheus-operator's values.yaml, and we can remove component: etcd from the generated service in advance by adding component: null.
@stale Still interested in a better solution for this.
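For reference, the values.yaml changes discussed in the comments above (mounting the etcd-certs secret, scraping over TLS, and dropping component: etcd with component: null) might look roughly like the sketch below. The key names follow my understanding of the prometheus-operator Helm chart's values layout and should be checked against the chart version in use. The certificate file names are hypothetical and must match the keys in your Secret.

```yaml
# values.yaml (sketch) for the prometheus-operator Helm chart
kubeEtcd:
  enabled: true
  service:
    port: 4001
    targetPort: 4001
    selector:
      k8s-app: etcd-server   # label on the kops etcd pods (from the issue)
      component: null        # null deletes the chart's default `component: etcd`
  serviceMonitor:
    scheme: https
    # Secrets listed under prometheusSpec.secrets are mounted at
    # /etc/prometheus/secrets/<secret-name>/ inside the Prometheus pod.
    caFile: /etc/prometheus/secrets/etcd-certs/ca.crt       # hypothetical key name
    certFile: /etc/prometheus/secrets/etcd-certs/client.crt # hypothetical key name
    keyFile: /etc/prometheus/secrets/etcd-certs/client.key  # hypothetical key name

prometheus:
  prometheusSpec:
    secrets:
      - etcd-certs           # Secret must exist in the Prometheus namespace
```

Setting a key to null in an override file is the standard Helm way to remove a default value, so the generated Service would no longer need to be edited by hand after installation.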
@mediaimprove @jesse-welch kops doesn’t allow nodes to access the masters on port 4001 (the etcd client port) for security reasons. Check the EC2 security group for your masters.