lifecycled: [Major Bug] Memory leak causing increased CPU/Memory usage over time on idle host

Environment: I have two completely idle EC2 Amazon Linux 2 instances with lifecycled v3.0.1 and Docker installed but no containers running. As a control, one instance does not have lifecycled installed but still has Docker installed with no containers running.

Behavior: See attached metrics image

htop snapshot from the affected instance, showing heavy CPU/memory usage by lifecycled: [image]

Command line used in systemd unit:

/opt/lifecycled/lifecycled --cloudwatch-group=${LIFECYCLED_CLOUDWATCH_GROUP} --handler=${LIFECYCLED_HANDLER} --sns-topic=${LIFECYCLED_SNS_TOPIC} --json

The instance with lifecycled running gradually consumes more memory and CPU over time. The control instance with just Docker installed remains idle/flat over time.

Expected behavior: Instance CPU and memory usage should remain mostly idle/flat over time.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 16 (11 by maintainers)

Most upvoted comments

I think I found one more thing in the cloudwatch code. Moving the NewTimer call alone did not seem to fix the issue, but it does look like it slowed the increase down. I also added a patch to the cloudwatch code which seems to resolve the leak, but I’m not sure if it has other side effects. It would be good to have that confirmed by someone more familiar with the code. I’ll send a PR for cloudwatch in a few minutes for review.

Here’s a screenshot of lifecycled with the patched SpotListener: [image]

Here’s a screenshot of lifecycled with both the patched SpotListener and cloudwatch code: [image]