node_exporter: Systemd service metrics missing when loaded but disabled

Host operating system: output of uname -a

3.10.0-862.11.6.el7.x86_64 Red Hat Enterprise Linux Server release 7.5 (Maipo)

node_exporter version: output of node_exporter --version

node_exporter, version 0.16.0 (branch: HEAD, revision: d42bd70f4363dced6b77d8fc311ea57b63387e4f) build user: root@a67a9bc13a69 build date: 20180515-15:52:42 go version: go1.9.6

node_exporter command line flags

ExecStart=/home/prometheus/node_exporter/node_exporter --collector.systemd

Are you running node_exporter in Docker?

No

What did you do that produced an error?

I have a custom systemd service defined in /etc/systemd/system for the Keepalived daemon. Running the following query returns the expected results, with all five defined states:

node_systemd_unit_state{instance="x.x.x.x:9100",name="keepalived.service"}

node_systemd_unit_state{instance="192.245.221.215:9100",job="node2",name="keepalived.service",state="activating"} 0
node_systemd_unit_state{instance="192.245.221.215:9100",job="node2",name="keepalived.service",state="active"} 1
node_systemd_unit_state{instance="192.245.221.215:9100",job="node2",name="keepalived.service",state="deactivating"} 0
node_systemd_unit_state{instance="192.245.221.215:9100",job="node2",name="keepalived.service",state="failed"} 0
node_systemd_unit_state{instance="192.245.221.215:9100",job="node2",name="keepalived.service",state="inactive"} 0

Once I run sudo systemctl stop keepalived.service and repeat the query, Prometheus returns nothing, as if the service had never been defined. When I run the query without filtering on job name, every other service is returned. Once I start the service again, the metrics come back.

What did you expect to see?

I expected the states to still be returned, but with active=0 and inactive=1.

What did you see instead?

No metrics for the service were returned, period. Nothing. Blank screen. I have a secondary server, which I thought was an exact mirror image of the server exhibiting the issue, and it does not have this problem. Thank you for everyone’s time.
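The difference between "reported as inactive" and "not reported at all" can be checked directly against the exporter's text output. The following is a minimal sketch, not part of node_exporter: the function name unit_state and the sample data are hypothetical, and it assumes the raw /metrics lines carry only the name and state labels (Prometheus attaches instance and job at scrape time).

```python
import re

# One sample line of the Prometheus text exposition format, e.g.
#   node_systemd_unit_state{name="keepalived.service",state="active"} 1
SAMPLE_RE = re.compile(
    r'^node_systemd_unit_state\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def unit_state(exposition_text, unit_name):
    """Return the state whose sample value is 1 for the given unit.

    Returns "unknown" if the unit has series but none is set to 1,
    and None if the unit has no series at all (the case in this report).
    """
    seen = False
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comments, blank lines, and other metrics
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if labels.get("name") != unit_name:
            continue
        seen = True
        if float(match.group("value")) == 1.0:
            return labels.get("state")
    return "unknown" if seen else None


# Hypothetical sample data modelled on the output quoted in this report.
WHILE_RUNNING = """\
node_systemd_unit_state{name="keepalived.service",state="activating"} 0
node_systemd_unit_state{name="keepalived.service",state="active"} 1
node_systemd_unit_state{name="keepalived.service",state="deactivating"} 0
node_systemd_unit_state{name="keepalived.service",state="failed"} 0
node_systemd_unit_state{name="keepalived.service",state="inactive"} 0
"""

# After the stop, only other units remain (sshd.service is a placeholder).
AFTER_STOP = """\
node_systemd_unit_state{name="sshd.service",state="active"} 1
node_systemd_unit_state{name="sshd.service",state="inactive"} 0
"""
```

With this sample data, unit_state(WHILE_RUNNING, "keepalived.service") returns "active", while unit_state(AFTER_STOP, "keepalived.service") returns None rather than "inactive", which is the behaviour described above.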

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 3
  • Comments: 40 (18 by maintainers)

Most upvoted comments

Yes, I haven’t found a way to get systemd to “keep a reference to the unit” without enabling foo.service or creating a target that wants foo.service. In both cases foo.service will get started when I don’t want it to.

@mlushpenko The best option is to have a Prometheus /metrics endpoint on the service. This provides both the blackbox check and service status, eliminating the need for watching systemd at all. 😄
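As a rough illustration of that suggestion, here is a minimal sketch of a service exposing its own /metrics endpoint using only the Python standard library. The metric name myservice_healthy, the port, and the helper names are all hypothetical; a real service would more likely use an official Prometheus client library. The point is that when the process is stopped the scrape fails, and Prometheus records up as 0 for that target instead of the series vanishing.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(healthy):
    """Render one gauge in the Prometheus text exposition format."""
    value = 1 if healthy else 0
    return (
        "# HELP myservice_healthy Whether the service considers itself healthy.\n"
        "# TYPE myservice_healthy gauge\n"
        f"myservice_healthy {value}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the metrics text on /metrics and 404 on every other path."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        # A real service would compute its health here; this sketch
        # always reports healthy while the process is running.
        body = render_metrics(healthy=True).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given (hypothetical) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() from the service's startup code would make the endpoint available for a Prometheus scrape job pointed at that port.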