node_exporter: ethtool collector memory leak

It does not seem to depend on the number of interfaces: nodes without the issue have a similar number of interfaces, and some have about twice as many.

Host operating system: output of uname -a

Linux ip-10-149-70-38 5.4.0-1056-aws #59-Ubuntu SMP Fri Aug 20 17:35:37 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

node_exporter version: output of node_exporter --version

% node_exporter --version
node_exporter, version 1.2.2 (branch: HEAD, revision: 26645363b486e12be40af7ce4fc91e731a33104e)
  build user:       root@b9cb4aa2eb17
  build date:       20210806-13:44:18
  go version:       go1.16.7
  platform:         linux/amd64

node_exporter command line flags

      args:
        - --collector.ethtool
        - --collector.processes
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host
        - --collector.filesystem.mount-points-exclude=^/(dev|proc|run|sys|host|var/lib/lxcfs|opt/podruntime/docker/.+|opt/podruntime/kubelet/.+)($|/)
        - --collector.netdev.device-exclude=^veth.*$
        - --collector.netclass.ignored-devices=^veth.*$

Are you running node_exporter in Docker?

yes, but as privileged container

What did you do that produced an error?

enable ethtool collector

What did you expect to see?

no memory leak

What did you see instead?

On our Kubernetes control plane we see a memory leak when the only change is enabling the ethtool collector.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (7 by maintainers)

Most upvoted comments

Yeah, I also feel that the internal notification I got about a correlation between an OOM kill and the memory drop no longer holds true. I also tested that the process does not always get OOM-killed if “it creates” too many dentry/inode cache entries. As far as I understand the kernel docs, the kernel would first try to reclaim the caches and only then OOM-kill, but I am not completely sure my understanding is correct. What I can say is that we do not always get an OOM kill when the “container memory” graph drops.
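To see how much kernel cache is actually reclaimable at a given moment, one option is to look at the slab counters in /proc/meminfo. The sketch below is only an illustration and not something from this thread: it assumes that dentry and inode caches are counted under `SReclaimable` (which is typically the case on Linux) and that the "container memory" graph includes kernel slab memory, which depends on the cgroup setup. The function names are hypothetical.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_summary(fields):
    """Return total, reclaimable and unreclaimable slab sizes in kB."""
    return {
        "slab_kb": fields.get("Slab", 0),
        "reclaimable_kb": fields.get("SReclaimable", 0),
        "unreclaimable_kb": fields.get("SUnreclaim", 0),
    }


# Made-up sample values, used only when /proc/meminfo is not available.
SAMPLE = """\
MemTotal:       16000000 kB
Slab:            2000000 kB
SReclaimable:    1800000 kB
SUnreclaim:       200000 kB
HugePages_Total:       0
"""

if __name__ == "__main__":
    path = "/proc/meminfo"
    if os.path.exists(path):
        with open(path) as fh:
            text = fh.read()
    else:
        text = SAMPLE
    print(slab_summary(parse_meminfo(text)))
```

If `reclaimable_kb` grows together with the container memory graph and falls back without an OOM kill, that would be consistent with the kernel reclaiming caches rather than killing the process, but it does not prove it.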

@SuperQ OK, I added this metric to our collection; let’s see what HEAD says either late today or on Monday.

SVG is not supported by GH, but here is a png: heap

I will check how I can build HEAD in our case.
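Alongside a heap profile, a rough way to tell whether the growth is in the Go heap or elsewhere is to compare two self-metrics in the exporter's own /metrics output. This is a minimal sketch, assuming the standard Prometheus Go client metrics `process_resident_memory_bytes` and `go_memstats_heap_inuse_bytes` are exposed; those names come from the Go client library, not from this thread, and the helper names and sample values are hypothetical.

```python
def read_gauge(text, name):
    """Return the value of an unlabeled metric from Prometheus text format, or None."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None


def non_heap_bytes(text):
    """Resident memory minus in-use Go heap; None if either metric is missing."""
    rss = read_gauge(text, "process_resident_memory_bytes")
    heap = read_gauge(text, "go_memstats_heap_inuse_bytes")
    if rss is None or heap is None:
        return None
    return rss - heap


# Made-up scrape output for illustration.
SAMPLE_METRICS = """\
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 2.097152e+07
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 5.24288e+07
"""

if __name__ == "__main__":
    print(non_heap_bytes(SAMPLE_METRICS))
```

If the heap stays flat across scrapes while resident memory keeps growing, the leak is probably outside the Go heap, and a heap profile alone may not show it.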