lifecycled: [Major Bug] Memory leak causing increased CPU/Memory usage over time on idle host
Environment: Two completely idle EC2 Amazon Linux 2 instances, both with Docker installed and no containers running. One instance runs lifecycled v3.0.1; the other, as a control, does not have lifecycled installed.
Behavior:
See attached metrics

htop snapshot from the affected instance, showing heavy CPU/memory usage by lifecycled

Command line used in the systemd unit:

```
/opt/lifecycled/lifecycled --cloudwatch-group=${LIFECYCLED_CLOUDWATCH_GROUP} --handler=${LIFECYCLED_HANDLER} --sns-topic=${LIFECYCLED_SNS_TOPIC} --json
```
The instance running lifecycled gradually consumes more memory and CPU over time. The control instance with only Docker installed remains idle/flat over the same period.
Expected behavior: CPU and memory usage on an idle instance should remain mostly flat over time.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 16 (11 by maintainers)
The new trace stuff is kind of amazing too: https://medium.com/@cep21/using-go-1-10-new-trace-features-to-debug-an-integration-test-1dc39e4e812d
I think I found one more thing in the cloudwatch code. Moving the NewTimer alone did not seem to fix the issue, but it does look like it slowed the increase down. I also added a patch to the cloudwatch code which seems to resolve the leak, but I'm not sure whether it has other side effects; it would be good to have that confirmed by someone more familiar with it. I'll send a PR for the cloudwatch change in a few minutes for review.
Here’s a screenshot of lifecycled with the patched SpotListener.
Here’s a screenshot of lifecycled with both the patched SpotListener and the patched cloudwatch code.