telegraf: Memory Leak with procstat

Relevant telegraf.conf:

[global_tags]
  sc = "daf"
  p = "32"
  custom_version = "1.x"
  os_type = "win2016s"

[agent]
  interval = "1s"
  round_interval = false
  metric_batch_size = 1000
  metric_buffer_limit = 5000
  collection_jitter = "1s"
  flush_interval = "1s"
  flush_jitter = "1s"
  precision = "s"
  debug = true
  quiet = false
  logfile = "/Program Files/Telegraf/telegraf.log"
  hostname = "win2016s"
  omit_hostname = false

[[outputs.influxdb]]
  urls = [ "http://1.1.1.1:8086" ]
  database = "telegraf"
  retention_policy = "24hours"
  precision = "m"

[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.mem]]  

[[inputs.procstat]]
  interval = "1s"
  exe = ".*"
  pid_finder = "native"

[[inputs.internal]]


System info:

[Two screenshots in the original issue; images not preserved]

Steps to reproduce:

No special steps are required. The memory leak appears to be related to the procstat input plugin (the original issue illustrates this with two screenshots, which are not preserved).

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 31 (15 by maintainers)

Most upvoted comments

I have been able to reproduce this on a Windows 2016 VM running in Azure. Will update if I can find a way to reduce or eliminate the leaked memory.

Any update, @danielnelson?

@danielnelson It looks like Datadog had the same issue with their WMI sampler: https://github.com/DataDog/integrations-core/pull/3987. That PR clearly indicates that the problem is a memory leak on Windows 2016 that occurs when CoInitialize is called for each WMI query.

After reviewing the Telegraf code, it seems that you rely on the win_pdh library to make the actual Win32 calls. I couldn't find a call to CoInitialize there, so I'm not sure how to help.