blackbox_exporter: ICMP probes fail continuously after several target hosts go down and come back up, until blackbox-exporter is manually restarted.
Host operating system: output of uname -a
Linux prometheus 4.4.0-134-generic #160-Ubuntu SMP Wed Aug 15 14:58:00 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
blackbox_exporter version: output of blackbox_exporter -version
blackbox_exporter, version 0.12.0 (branch: HEAD, revision: 4a22506cf0cf139d9b2f9cde099f0012d9fcabde) build user: root@634195974c8e build date: 20180227-11:50:29 go version: go1.10
What is the blackbox.yml module config.
modules:
  icmp:
    prober: icmp
    timeout: 2s
    icmp:
      preferred_ip_protocol: ip4
What is the prometheus.yml scrape config.
scrape_configs:
  - job_name: 'icmp-ping'
    metrics_path: /probe
    params:
      module: [icmp]
    scrape_interval: 5s
    scrape_timeout: 2s
    file_sd_configs:
      - files:
          - '/etc/prometheus/targets/ping-hosts.yml'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 'prometheus.domain.zz:9115'
What logging output did you get from adding &debug=true to the probe URL?
Logs for the probe:
ts=2018-09-19T10:54:04.147594552Z caller=main.go:116 module=icmp target=probed-host.domain.zz level=info msg="Beginning probe" probe=icmp timeout_seconds=1.5
ts=2018-09-19T10:54:04.147696813Z caller=utils.go:42 module=icmp target=probed-host.domain.zz level=info msg="Resolving target address" preferred_ip_protocol=ip4
ts=2018-09-19T10:54:04.148567095Z caller=utils.go:65 module=icmp target=probed-host.domain.zz level=info msg="Resolved target address" ip=192.168.100.49
ts=2018-09-19T10:54:04.148667279Z caller=icmp.go:71 module=icmp target=probed-host.domain.zz level=info msg="Creating socket"
ts=2018-09-19T10:54:04.14885651Z caller=icmp.go:117 module=icmp target=probed-host.domain.zz level=info msg="Creating ICMP packet" seq=61478 id=7522
ts=2018-09-19T10:54:04.148950165Z caller=icmp.go:129 module=icmp target=probed-host.domain.zz level=info msg="Writing out packet"
ts=2018-09-19T10:54:04.149184806Z caller=icmp.go:157 module=icmp target=probed-host.domain.zz level=info msg="Waiting for reply packets"
ts=2018-09-19T10:54:05.647899261Z caller=icmp.go:162 module=icmp target=probed-host.domain.zz level=warn msg="Timeout reading from socket" err="read ip 0.0.0.0: raw-read ip4 0.0.0.0: i/o timeout"
ts=2018-09-19T10:54:05.648033921Z caller=main.go:129 module=icmp target=probed-host.domain.zz level=error msg="Probe failed" duration_seconds=1.5003776850000001
Metrics that would have been returned:
# HELP probe_dns_lookup_time_seconds Returns the time taken for probe dns lookup in seconds
# TYPE probe_dns_lookup_time_seconds gauge
probe_dns_lookup_time_seconds 0.000923902
# HELP probe_duration_seconds Returns how long the probe took to complete in seconds
# TYPE probe_duration_seconds gauge
probe_duration_seconds 1.5003776850000001
# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 4
# HELP probe_success Displays whether or not the probe was a success
# TYPE probe_success gauge
probe_success 0
Module configuration:
prober: icmp
timeout: 2s
icmp:
  preferred_ip_protocol: ip4
What did you do that produced an error?
I restarted the OpenVPN client on the hypervisor host that runs the virtual machine with Prometheus and blackbox-exporter. The blackbox-exporter target file has around 70 entries, more than 50 of which are behind that VPN connection.
What did you expect to see?
Some failed probes during the VPN restart on the hypervisor, and then successful probes again.
What did you see instead?
Probes failed continuously for tens of minutes, until I manually restarted blackbox-exporter.
About this issue
- State: closed
- Created 6 years ago
- Comments: 21 (8 by maintainers)
Update for us here in case it helps other people googling for it:
This was caused by the payload of the blackbox ICMP probe being 36 bytes. When we increased it to 64 bytes (using the payload_size parameter), our probes were successful.
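For anyone wanting to try the same change, a minimal sketch of the module from the original report with the payload size raised to 64 bytes might look like the following. It assumes a blackbox_exporter release that supports payload_size under the icmp section, which may be newer than the 0.12.0 shown in the original report; all other values are copied from that report.

```yaml
modules:
  icmp:
    prober: icmp
    timeout: 2s
    icmp:
      preferred_ip_protocol: ip4
      # payload_size is the parameter named in the comment above;
      # 64 is the value reported to work there, not a general recommendation.
      payload_size: 64
```

Whether this helps depends on the network path. It addresses the small-payload case described in this comment, not necessarily the VPN restart scenario from the original report.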
Well, I don't think we have much control over fixing Amazon networking. If you compare the packets emitted by the standard Linux ping utility and by blackbox_exporter, you will surely see a difference in the id header field.
Whether the ping utility is RFC compliant or not remains an open question :)
I have a similar error, though it may be caused by something else. For me, the only way I can get the ICMP probe to succeed is by running it against a target of 127.0.0.1 or localhost. Any other IP address, whether inside or outside the local LAN, seems to fail.
I’ve also added the capability with:
sudo setcap cap_net_raw+ep /usr/local/bin/blackbox_exporter
And I’ve instructed systemd to run it as root.
uname -a
Linux nuc 4.13.0-1024-oem #27-Ubuntu SMP Fri Apr 13 08:27:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
blackbox_exporter version: output of blackbox_exporter -version
What is the blackbox.yml module config.
What is the prometheus.yml scrape config.
What logging output did you get from adding &debug=true to the probe URL?
That is my systemd service: