node-problem-detector: health-checker posting wrong status for kubelet - `KubeletUnhealthy`
I am trying to use the health-check-monitor to monitor kubelet and docker. I built the binaries from the source and created a docker image.
Although the pod is running fine, it consistently posts this node condition:
Conditions:
  Type              Status  LastHeartbeatTime                LastTransitionTime               Reason            Message
  ----              ------  -----------------                ------------------               ------            -------
  KubeletUnhealthy  True    Fri, 03 Jul 2020 17:04:55 +0100  Fri, 03 Jul 2020 16:54:53 +0100  KubeletUnhealthy  kubelet:kubelet was found unhealthy; repair flag : false
Pod logs:
I0703 16:16:01.211066 1 plugin.go:86] Start to run custom plugins
I0703 16:16:01.222034 1 plugin.go:110] Add check result {Rule:0xc00003b1f0 ExitStatus:1 Message:kubelet:kubelet was found unhealthy; repair flag : false} for rule &{Type:permanent Condition:KubeletUnhealthy Reason:KubeletUnhealthy Path:/home/kubernetes/bin/health-checker Args:[--component=kubelet --enable-repair=false --cooldown-time=1m --health-check-timeout=10s] TimeoutString:0xc0000a8a50 Timeout:3m0s}
I0703 16:16:01.222104 1 plugin.go:115] Finish running custom plugins
I0703 16:16:01.222139 1 custom_plugin_monitor.go:138] New status generated: &{Source:health-checker Events:[] Conditions:[{Type:KubeletUnhealthy Status:True Transition:2020-07-03 15:54:31.225162591 +0000 UTC m=+0.065349002 Reason:KubeletUnhealthy Message:kubelet:kubelet was found unhealthy; repair flag : false}]}
I exec'd into the pod and ran the health checker by hand:
$ kubectl exec -it node-problem-detector-kl9kj -- /bin/sh
# /home/kubernetes/bin/health-checker
I0703 16:17:36.720049 1019 health_checker.go:136] health-checker: component is unhealthy, proceeding to repair
I0703 16:17:36.720251 1019 health_checker.go:156] health-checker: executing command : &{systemctl [systemctl show kubelet --property=ActiveEnterTimestamp] [] <nil> <nil> <nil> [] <nil> <nil> <nil> 0xc00008c840 0xc000096460 false [] [] [] [] <nil> <nil>}
I0703 16:17:36.720355 1019 health_checker.go:159] health-checker: command failed : exec: "systemctl": executable file not found in $PATH, []
I0703 16:17:36.720407 1019 health_checker.go:140] health-checker: exec: "systemctl": executable file not found in $PATH
I0703 16:17:36.720453 1019 health_checker.go:142] health-checker: component uptime: 0s
kubelet:kubelet was found unhealthy; repair flag : true
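The `exec: "systemctl": executable file not found in $PATH` message above is the error Go's `os/exec` package returns when a binary cannot be resolved. As a rough sketch (not part of node-problem-detector; the helper name `missingTools` is made up for illustration), something like this could be run inside the image to confirm which tools are missing before blaming the kubelet itself:

```go
package main

import (
	"fmt"
	"os/exec"
)

// missingTools returns the names from tools that cannot be resolved on $PATH.
// It relies on exec.LookPath, the same lookup os/exec performs before
// starting a command.
func missingTools(tools ...string) []string {
	var missing []string
	for _, t := range tools {
		if _, err := exec.LookPath(t); err != nil {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// systemctl is the binary the health-checker log above failed to find.
	if m := missingTools("systemctl"); len(m) > 0 {
		fmt.Printf("not found in $PATH: %v\n", m)
		return
	}
	fmt.Println("systemctl found")
}
```

If `systemctl` shows up as missing, the uptime lookup in the log above cannot succeed regardless of the kubelet's real state.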
On the node:
m02:/usr/local/bin$ curl -m 100 -f -s -S http://127.0.0.1:10248/healthz
ok
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 20 (3 by maintainers)
I was able to get further by mounting `/var/run/dbus/system_bus_socket` and doing a `clean-install systemd` on the image. I then had to enable host networking, since the check seems to just call `127.0.0.1:10248/healthz` at https://github.com/kubernetes/node-problem-detector/blob/f42281ee2658900bdb0571e1159a43f6ab712a19/pkg/healthchecker/health_checker.go#L110.