amazon-ssm-agent: Failed to create channel: too many open files

We had been using SSH login over an SSM tunnel for a week or so, but all of a sudden our servers started returning the following error.

Region: us-east-1
OS: Ubuntu 14.04.1 LTS
Instance type: c4.large
amazon-ssm-agent version: 2.3.930.0

/var/log/amazon/ssm/amazon-ssm-agent.log
2020-03-27 03:05:10 ERROR [ssm-session-worker] [xxx-048848f4f625b03a0] filewatcher listener encountered error when start watcher: too many open files
2020-03-27 03:05:10 ERROR [ssm-session-worker] [xxx-048848f4f625b03a0] failed to create channel: too many open files

The server we are connecting to doesn’t have many connections running, and:

  1. ulimit -S and ulimit -H both report unlimited
  2. cat /etc/security/limits.conf
*               soft     nofile          500000
*               hard     nofile          500000
ubuntu          soft     nofile          500000
ubuntu          hard     nofile          500000
root            soft     nofile          500000
root            hard     nofile          500000
  3. cat /etc/sysctl.conf
fs.file-max = 500000
  4. service amazon-ssm-agent restart: the service restarted correctly, but we still cannot connect
  5. We have lots of lingering session/channel folders and files in
/var/lib/amazon/ssm/i-xxxxx/session
/var/lib/amazon/ssm/i-xxxxx/channels
/var/lib/amazon/ssm/i-xxxxx/documents
  6. Regardless of the ulimit we set on the system, the open-files limit of the agent process seems to be stuck at 1024:
[0] ✓ root@ip-10-0-0-x:/ [01:26:28]
---> start amazon-ssm-agent
amazon-ssm-agent start/running, process 22668

[0] ✓ root@ip-10-0-0-x:/ [01:26:35]
---> cat /proc/22668/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             30038                30038                processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       30038                30038                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
  7. Restarting or stopping amazon-ssm-agent doesn’t help: the files are not deleted, and we get the same error after the restart
  8. There is no disk-space or inode exhaustion on our system:
[0] ✓ root@ip-10-0-0-x:/var/lib/amazon/ssm/i-xxx [01:53:41]
---> df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       50G   29G   19G  62% /
none            4.0K     0  4.0K   0% /sys/fs/cgroup
udev            1.9G  8.0K  1.9G   1% /dev
tmpfs           377M  388K  377M   1% /run
none            5.0M     0  5.0M   0% /run/lock
none            1.9G     0  1.9G   0% /run/shm
none            100M     0  100M   0% /run/user

[0] ✓ root@ip-10-0-0-x:/var/lib/amazon/ssm/i-xxx [01:53:50]
---> df -ih
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/xvda1       3.2M  514K  2.7M   17% /
none             471K     2  471K    1% /sys/fs/cgroup
udev             470K   386  469K    1% /dev
tmpfs            471K   324  471K    1% /run
none             471K     1  471K    1% /run/lock
none             471K     1  471K    1% /run/shm
none             471K     2  471K    1% /run/user
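
For anyone trying to reproduce the check above, here is a minimal sketch of how the per-process limit can be compared with the number of descriptors the process actually holds. The function names nofile_limits and open_fd_count are made up for illustration, and the pid in the usage comment is simply the one from this report. As far as I understand, /etc/security/limits.conf is applied by pam_limits to login sessions, so a daemon started by init may not pick it up, which would be consistent with the 1024 seen in /proc/22668/limits.

```shell
#!/bin/sh
# Hypothetical helpers, not part of amazon-ssm-agent.

# Print the soft and hard "Max open files" limits from a
# /proc/<pid>/limits-style file.
nofile_limits() {
    # $1: path to a limits file, e.g. /proc/22668/limits
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Count the entries in a process's fd directory, i.e. the file
# descriptors it currently has open (Linux /proc only).
open_fd_count() {
    # $1: path to the fd directory, e.g. /proc/22668/fd
    ls -1 "$1" | wc -l | tr -d ' '
}

# Example usage (substitute the pid of your own agent or session worker):
#   nofile_limits /proc/22668/limits
#   open_fd_count /proc/22668/fd
```

If the count is nowhere near the soft limit, the "too many open files" message is probably coming from some other exhausted resource rather than from the nofile limit itself.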

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (6 by maintainers)

Most upvoted comments

"AssociationLogsRetentionDurationHours": 24,
"RunCommandLogsRetentionDurationHours": 336,
"SessionLogsRetentionDurationHours": 336

We keep the orchestration files for some time before removing them. The retention periods are configurable in the amazon-ssm-agent.json file.
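
To see which of the lingering entries are already past a given retention window, something like the following sketch could be used. The function name list_expired is hypothetical, the directory in the usage comment is the redacted path from the report, and comparing against modification time is an assumption on my part; the agent may decide what to delete differently.

```shell
#!/bin/sh
# Hypothetical helper, not part of amazon-ssm-agent.

# List entries directly under a directory whose modification time is
# older than a retention period given in hours.
list_expired() {
    # $1: directory, e.g. /var/lib/amazon/ssm/i-xxxxx/session
    # $2: retention in hours, e.g. 336 for SessionLogsRetentionDurationHours
    # find's -mmin takes minutes, so convert hours to minutes first.
    find "$1" -mindepth 1 -maxdepth 1 -mmin "+$(( $2 * 60 ))"
}

# Example usage:
#   list_expired /var/lib/amazon/ssm/i-xxxxx/session 336
```

An empty result would suggest the entries are still inside the retention window rather than stuck.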