amazon-ecs-agent: Too many open files
Yesterday we upgraded our cluster from amzn-ami-2016.03.c-amazon-ecs-optimized to the latest AMI, amzn-ami-2016.03.g-amazon-ecs-optimized. At some point overnight, two of the instances in our cluster (out of ~6 in the ASG) began emitting log lines like these at a rate of hundreds per second:
Aug 17 07:21:17 Seelog error: open /log/ecs-agent.log.2016-08-17-14: too many open files
Aug 17 07:21:17 2016-08-17T14:21:17Z [WARN] Error retrieving stats for container bcbd3d6d2a51f656ec2066e62296010f5432262e1a564678325a60f3e642a575: dial unix /var/run/docker.sock: socket: too many open files
Both instances terminated without human intervention (we're not sure whether that was just the auto-scaling group replacing them). Near the end, these log lines also appeared:
Aug 17 07:21:17 2016-08-17T14:21:17Z [CRITICAL] Error saving state before final shutdown module="TerminationHandler" err="Multiple error:
Aug 17 07:21:17 0: Timed out waiting for TaskEngine to settle
Aug 17 07:21:17 1: Timed out trying to save to disk"
We haven’t experienced this on previous AMIs.
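For context on what these errors mean: "too many open files" indicates the agent process hit its per-process file-descriptor limit (ulimit -n), at which point every new socket creation fails. The following is a minimal Go sketch, not the agent's actual code, showing how repeatedly dialing the Docker unix socket without closing the connections produces exactly the failure mode seen in the logs above:

```go
// Hypothetical reproduction sketch: leak unix-socket connections until the
// process's file-descriptor limit is exhausted. Once that happens, further
// dials fail with errors of the form
// "dial unix /var/run/docker.sock: socket: too many open files".
package main

import (
	"fmt"
	"net"
)

func main() {
	var conns []net.Conn // held open deliberately to simulate a descriptor leak
	for i := 0; ; i++ {
		c, err := net.Dial("unix", "/var/run/docker.sock")
		if err != nil {
			// After the fd limit (ulimit -n) is reached, every subsequent
			// stats/dial call by the agent would fail like this.
			fmt.Printf("after %d open connections: %v\n", i, err)
			return
		}
		conns = append(conns, c)
	}
}
```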
About this issue
- State: closed
- Created 8 years ago
- Reactions: 4
- Comments: 20 (6 by maintainers)
@ziggythehamster The new ECS AMI is amzn-ami-2016.03.h-amazon-ecs-optimized. We'll be updating our documentation shortly.

Any chance of an ETA on 1.12.1?
We’ve just released 1.12.1, which should fix this issue. Please let us know if you continue to run into problems.