dogstatsd-ruby: Memory leak in 5.0.1
We are using a Datadog::Statsd object in a sidekiq worker. When the worker executes, we basically do this:
statsd = Datadog::Statsd.new("localhost", 8125)
statsd.increment(....) # specific params not included here
After upgrading from 4.8.3 to 5.0.1, we are seeing memory usage on the instance climb linearly until it exhausts all memory and we get a `ThreadError: can't create Thread: Resource temporarily unavailable`. We have pinpointed this problem to the 5.0.1 upgrade: the only change made was upgrading the dogstatsd-ruby gem.
You can see the mem usage problem in the graph below:
(each little dip is a deploy where we changed just one gem version. The last one is where we upgraded dogstatsd-ruby from 4.8.3 to 5.0.1).
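
One plausible reading of this report, consistent with the maintainers' comments further down about a companion flush thread, is that every client created without being closed leaves a background thread behind. The sketch below illustrates that pattern in plain Ruby. `CompanionThreadClient` is a hypothetical stand-in written for this example; it is not the dogstatsd-ruby implementation, and the metric name is made up.

```ruby
# Hypothetical stand-in for a client that starts one background
# ("companion") thread per instance. This is NOT the dogstatsd-ruby code.
class CompanionThreadClient
  def initialize
    @queue = Queue.new
    @worker = Thread.new do
      loop do
        message = @queue.pop # blocks until something is queued
        break if message == :stop
        # a real client would send `message` over the network here
      end
    end
  end

  def increment(name)
    @queue << name
  end

  # Stops the background thread and waits for it to exit.
  def close
    @queue << :stop
    @worker.join
  end
end

JOBS = 50

# Pattern from the report: a new client per job, never closed.
before_leak = Thread.list.size
JOBS.times { CompanionThreadClient.new.increment('jobs.processed') }
leaked_threads = Thread.list.size - before_leak

# Same pattern, but each client is closed once the job is done.
before_close = Thread.list.size
JOBS.times do
  client = CompanionThreadClient.new
  client.increment('jobs.processed')
  client.close
end
threads_after_close = Thread.list.size - before_close

puts "threads left behind without close: #{leaked_threads}"
puts "threads left behind with close:    #{threads_after_close}"
```

In a long-running worker process the first pattern keeps adding threads for as long as jobs run, which would match both the linear memory growth and the eventual `ThreadError`.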
About this issue
- State: closed
- Created 3 years ago
- Reactions: 4
- Comments: 24 (13 by maintainers)
Hello everyone,
People having this issue should consider using the `single_thread` mode recently added in v5.2.0: release note. Like the v4.x versions, this single-thread mode does not create a companion thread to do the flush, which avoids issues for processes using `fork`.
@remeh I will try this and let you know soon. Thanks!
UPDATE: Actually going to wait until 5.0.2. Sorry! Priorities… 😦
@remeh Hello - so yes, the error was the same: `ThreadError: can't create Thread: Resource temporarily unavailable`.
We ended up changing the code to use the `single_thread: true` mode as well as to call `#close` (and additionally to not keep an instance of Statsd around longer than needed), and that seems to have fixed the problem.
This issue should be addressed by the latest release of `ddtrace` (0.51.0), as StatsD threads are no longer initialized by `ddtrace`.
If anyone is seeing this issue in their environment, please upgrade to `ddtrace` >= 0.51.0 and `dogstatsd-ruby` >= 5.2.0.
`ddtrace` 0.51.0 also has new safeguards that prevent the internal initialization of affected versions of `dogstatsd-ruby` (5.0.0 <= version < 5.2.0). Internal Statsd usage will be disabled with these affected versions, to prevent resource leaks. Tracing will continue as usual.
Hello @mobilutz, yes, it is still present (mentioned in the CHANGELOG). We released 5.1.0 because the flush on close will solve missing metrics for some users. For the current issue, I’m working on adding a single-thread mode for when users can’t create and destroy the instance during the lifecycle of a forked process, which often happens when using job libraries or other libraries that rely heavily on forks, and which is most likely part of the thread leak issue. I’ll post in this issue once it’s available.
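
The workaround reported earlier in this thread (`single_thread: true`, calling `#close`, and not keeping the instance around longer than needed) can be sketched roughly as follows. The helper name `with_statsd`, the factory lambda, and the metric name are hypothetical and chosen only for illustration; `single_thread: true` and `#close` are the options named in the comments above.

```ruby
# Hypothetical helper: builds a client, yields it, and always closes it,
# even if the block raises. `client_factory` is any callable that returns
# an object responding to #close.
def with_statsd(client_factory)
  client = client_factory.call
  yield client
ensure
  client&.close
end

# Intended usage inside a job (assumes dogstatsd-ruby >= 5.2.0 is installed):
#
#   require 'datadog/statsd'
#
#   factory = -> { Datadog::Statsd.new('localhost', 8125, single_thread: true) }
#   with_statsd(factory) { |statsd| statsd.increment('jobs.processed') }
```

Passing a factory rather than a shared instance keeps the client's lifetime tied to a single job, which is the "not keep an instance around" part of the workaround.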
Sorry @marcotc, since this was spotted on a critical piece of infrastructure, we cannot deploy test releases in prod. I may have some time next week to set up a stripped-down test case and try it there.
@deepfryed, I’ve tested my application with Puma, but still no leak.
Correct me if I’m wrong, but you still have an environment running the problematic gem versions currently. If so, and if this is feasible for you, would you be able to add some logging around the creation of the “sender” thread:
I’d recommend placing this before your `Datadog.configure` block. Feel free to modify the logging output mechanism. This logging will be verbose, so feel free to conditionally enable this in the relevant environment for your team.
@remeh I use the ddtrace gem, which uses `DogStatsD-ruby` internally.
The only config it has is:
Probably ddtrace does not call the `close` method correctly.