node-problem-detector: CrashLoopBackOff on v0.6.1 with *-counter.json

Env:

  • AWS + Kops
  • Ubuntu-16.04 hosts
  • Deployed with helm upgrade --install node-problem-detector -f node-problem-detector/helm/values/dev/values.yaml --version '1.0' --namespace=kube-system stable/node-problem-detector
  • values:
---
settings:
  log_monitors:
    - /config/abrt-adaptor.json
    - /config/docker-monitor.json
    - /config/docker-monitor-filelog.json
    - /config/docker-monitor-counter.json
    - /config/kernel-monitor.json
    - /config/kernel-monitor-filelog.json
    - /config/kernel-monitor-counter.json
    - /config/network-problem-monitor.json
    - /config/systemd-monitor-counter.json

image:
  repository: k8s.gcr.io/node-problem-detector
  tag: v0.6.1
  pullPolicy: IfNotPresent

nameOverride: "node-problem-detector"
fullnameOverride: "node-problem-detector"

rbac:
  create: true

resources: {}

annotations:
  sidecar.istio.io/inject: "false"

tolerations:
  - effect: NoSchedule
    operator: Exists
  - key: CriticalAddonsOnly
    operator: Exists

serviceAccount:
  # Specifies whether a ServiceAccount should be created
  create: true
  # The name of the ServiceAccount to use.
  # If not set and create is true, a name is generated using the fullname template
  name: "node-problem-detector"

affinity: {}

Logs:

I1221 12:13:10.771074       1 log_monitor.go:64] Finish parsing log monitor config file: {WatcherConfig:{Plugin:journald PluginConfig:map[source:abrt-notification] LogPath:/var/log/journal Lookback:5m Delay:} BufferSize:10 Source:abrt-adaptor DefaultConditions:[] Rules:[{Type:temporary Condition: Reason:CCPPCrash Pattern:Process \d+ \(\S+\) crashed in .*} {Type:temporary Condition: Reason:UncaughtException Pattern:Process \d+ \(\S+\) of user \d+ encountered an uncaught \S+ exception} {Type:temporary Condition: Reason:XorgCrash Pattern:Display server \S+ crash in \S+} {Type:temporary Condition: Reason:VMcore Pattern:System encountered a fatal error in \S+} {Type:temporary Condition: Reason:Kerneloops Pattern:System encountered a non-fatal error in \S+}]}
I1221 12:13:10.771124       1 log_watchers.go:40] Use log watcher of plugin "journald"
I1221 12:13:10.771202       1 log_monitor.go:64] Finish parsing log monitor config file: {WatcherConfig:{Plugin:journald PluginConfig:map[source:dockerd] LogPath:/var/log/journal Lookback:5m Delay:} BufferSize:10 Source:docker-monitor DefaultConditions:[] Rules:[{Type:temporary Condition: Reason:CorruptDockerImage Pattern:Error trying v2 registry: failed to register layer: rename /var/lib/docker/image/(.+) /var/lib/docker/image/(.+): directory not empty.*}]}
I1221 12:13:10.771215       1 log_watchers.go:40] Use log watcher of plugin "journald"
I1221 12:13:10.771281       1 log_monitor.go:64] Finish parsing log monitor config file: {WatcherConfig:{Plugin:filelog PluginConfig:map[message:msg="([^
]*)" timestampFormat:2006-01-02T15:04:05.999999999-07:00 timestamp:^time="(\S*)"] LogPath:/var/log/docker.log Lookback:5m Delay:} BufferSize:10 Source:docker-monitor DefaultConditions:[] Rules:[{Type:temporary Condition: Reason:CorruptDockerImage Pattern:Error trying v2 registry: failed to register layer: rename /var/lib/docker/image/(.+) /var/lib/docker/image/(.+): directory not empty.*}]}
I1221 12:13:10.771294       1 log_watchers.go:40] Use log watcher of plugin "filelog"
F1221 12:13:10.771364       1 log_monitor.go:56] Failed to unmarshal configuration file "/config/docker-monitor-counter.json": json: cannot unmarshal number into Go struct field MonitorConfig.pluginConfig of type string
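The fatal line above says a JSON number was found where the decoder expected a string inside pluginConfig. A minimal Go sketch of that failure mode follows. It assumes pluginConfig is decoded into a map[string]string, which is what the error text and the parsed output above suggest; the monitorConfig type and both sample configs are hypothetical stand-ins, not copied from the node-problem-detector source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// monitorConfig is a hypothetical, cut-down stand-in for the log monitor
// config type. Assumption: every pluginConfig value must be a JSON string.
type monitorConfig struct {
	Plugin       string            `json:"plugin"`
	PluginConfig map[string]string `json:"pluginConfig"`
}

// parseAsLogMonitor decodes raw into monitorConfig and returns any decode error.
func parseAsLogMonitor(raw []byte) error {
	var cfg monitorConfig
	return json.Unmarshal(raw, &cfg)
}

// Illustrative configs only; the keys and values are made up for this sketch.
const stringOnly = `{"plugin": "journald", "pluginConfig": {"source": "dockerd"}}`
const withNumber = `{"plugin": "custom", "pluginConfig": {"timeout": "1m", "concurrency": 1}}`

func main() {
	// All pluginConfig values are strings, so decoding succeeds.
	fmt.Println("string-only:", parseAsLogMonitor([]byte(stringOnly)))
	// "concurrency": 1 is a JSON number, so decoding returns a type error.
	fmt.Println("with number:", parseAsLogMonitor([]byte(withNumber)))
}
```

If the *-counter.json files carry numeric pluginConfig values, loading them through the log monitor path would fail in this way regardless of the image tag.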

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 22 (9 by maintainers)

Most upvoted comments

Hope this helps. It’s working for me in v0.6.3 (my node-problem-detector.values.yaml Helm chart override file uses image v0.6.3).

When I had

  log_monitors:
    - /config/kernel-monitor-counter.json
    - /config/docker-monitor-counter.json
  custom_plugin_monitors: []

I’d get CrashLoopBackOff, with the pod log saying:

  Error: "Failed to unmarshal configuration file "/config/kernel-monitor-counter.json""
  Error: "Failed to unmarshal configuration file "/config/docker-monitor-counter.json""

I switched it to

  log_monitors: []
  custom_plugin_monitors:
    - /config/kernel-monitor-counter.json
    - /config/docker-monitor-counter.json

and it worked as expected.
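To check which list a given config file belongs in before deploying, something like the Go sketch below might help. It rests on an assumption not confirmed in this thread: that the counter configs declare "plugin": "custom" at the top level, while log monitor configs declare a log watcher plugin such as journald or filelog. The listFor helper and the sample inputs are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listFor returns the Helm values key a monitor config should be listed under,
// judging only by its top-level "plugin" field.
// Assumption: "custom" marks a custom plugin monitor; anything else is a log monitor.
func listFor(raw []byte) (string, error) {
	var head struct {
		Plugin string `json:"plugin"`
	}
	// Only "plugin" is decoded, so numeric pluginConfig values are ignored here.
	if err := json.Unmarshal(raw, &head); err != nil {
		return "", err
	}
	if head.Plugin == "custom" {
		return "custom_plugin_monitors", nil
	}
	return "log_monitors", nil
}

func main() {
	// Made-up sample configs, for illustration only.
	samples := []string{
		`{"plugin": "journald", "pluginConfig": {"source": "dockerd"}}`,
		`{"plugin": "custom", "pluginConfig": {"concurrency": 1}}`,
	}
	for _, s := range samples {
		key, err := listFor([]byte(s))
		fmt.Println(key, err)
	}
}
```

In practice the input would be the contents of each /config/*.json file from the image rather than inline strings.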