amazon-eks-ami: Raise docker default ulimit for nofile to 65535

In the latest AMI version, v20190327, the default nofile ulimit in /etc/sysconfig/docker is set to 1024 (soft) / 4096 (hard):

OPTIONS="--default-ulimit nofile=1024:4096"

We’ve already hit this limit with some Java applications and have raised the limit to 65535 in user-data:

sed -i 's/^OPTIONS=.*/OPTIONS="--default-ulimit nofile=65535:65535"/' /etc/sysconfig/docker && systemctl restart docker
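
To confirm the new default is picked up after the restart, a quick sanity check from inside any container (the amazonlinux:2 image here is only an example):

docker run --rm amazonlinux:2 bash -c 'ulimit -n -H'   # should print 65535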

Question: Isn’t 4096 a little conservative for an EKS node? Is there anything wrong with just setting this to 65535 by default in the AMI?

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 7
  • Comments: 18 (7 by maintainers)

Most upvoted comments

@max-rocket-internet it is supposed to be 65535, and originally was, but a series of PR mishaps meant that several images shipped with changes that reduced it to 4096 or 8192.

The fun started in #186 where someone thought the setting was lower and added a PR to ‘raise’ it to 8192. This actually reduced it from 65535 to 8192, which immediately caused problems (#193). People tried to revert that change in #206 but that didn’t work. Meanwhile a fix in #205 got closed in favor of #206. But #206 didn’t work because the latest commits weren’t being included in the AMI builds. So fresh builds in #233 tried to restore the #206 reversion of #186, while the ongoing issue was tracked in #234.

In theory the current latest AMIs should be back to 65535. Any fixed versions were dated 31 March or later, as the problem still wasn’t fixed on 29 March. And even after that I heard GPU AMIs still had the issue. https://github.com/awslabs/amazon-eks-ami/issues/233#issuecomment-478392268

@max-rocket-internet sorry for reviving this discussion, but I am confused:

  • the docker “nofile” limits are indeed fixed:
    # docker run -it --rm ubuntu bash -c "ulimit -n -H"
    65536
    
  • but the “nofile” limits of the host (the EC2 based EKS worker) are not:
    # ulimit -n -H
    4096
    # ulimit -n
    1024
    

From reading the discussion history here, I think those two different values were also mixed up. As far as I understand, the host limits are ultimately the ones that count, even if the docker process defines higher limits.

So this is not yet fixed, is it?
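
For anyone who needs the host-side nofile limits raised in the meantime, a minimal user-data sketch, assuming pam_limits is what governs login sessions on the worker node (the file name 99-nofile.conf and the values are just examples):

cat <<'EOF' > /etc/security/limits.d/99-nofile.conf
# Raise soft and hard open-file limits for all users (illustrative values).
*    soft    nofile    65535
*    hard    nofile    65535
EOF
# Applies to subsequent login sessions, not to processes that are already running.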

We have been running into this error on EKS nodes:

Jun 14 17:03:58 ip-10-128-13-134.ec2.internal kubelet[4364]: E0614 17:03:58.836313    4364 manager.go:337] Registration of the raw container factory failed: inotify_init: too many open files
Jun 14 17:03:58 ip-10-128-13-134.ec2.internal kubelet[4364]: F0614 17:03:58.836922    4364 kubelet.go:1344] Failed to start cAdvisor inotify_init: too many open files

The fix for us was to apply these changes in the user-data scripts where we bootstrap our EKS nodes in Terraform:

echo 'fs.inotify.max_user_instances = 8192' > /etc/sysctl.d/98-inotifyfix.conf
echo 'fs.inotify.max_user_watches = 524288' >> /etc/sysctl.d/98-inotifyfix.conf
sysctl --system

This overrides 99-amazon.conf, and applying it resolved our issue immediately. I think this needs to be fixed in the AMI as well.
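
A quick way to confirm the new values took effect (not part of the original fix, just a sanity check):

sysctl fs.inotify.max_user_instances fs.inotify.max_user_watches
# Expected after the override:
# fs.inotify.max_user_instances = 8192
# fs.inotify.max_user_watches = 524288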

Hehe, thanks @whereisaaron for the comprehensive history write-up of this issue! 👍

Would it be possible to raise it again to 82920 for TiKV? We’re trying to run the TiDB database stack, but it requires higher ulimits. See pingcap/tidb-operator#299 for what I’m talking about. Considering I’ve never run into ulimits like this on any other K8s provider, it shouldn’t be set so low that it prevents us from running applications.

I’m hitting this too, trying to deploy TiKV. ulimit -n in a pod on EKS reports 65536, but TiKV won’t start, saying it expects >= 82920. For comparison, a local kind cluster and an Azure K8s cluster both have it set to 1048576.
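
To see the limit exactly as the TiKV process does, a check from inside the pod can help (the pod name tikv-0 is just a placeholder):

kubectl exec -it tikv-0 -- sh -c 'ulimit -n'   # soft open-files limit inside the container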

Looks resolved to me:

$ ssh ip-10-0-27-91.eu-west-1.compute.internal
The authenticity of host 'ip-10-0-27-91.eu-west-1.compute.internal (10.0.27.91)' can't be established.
ECDSA key fingerprint is SHA256:V5QfuYz9Nlw0IhA21gYOZYCiNoEF+KsH+KB9XxfQOdw.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ip-10-0-27-91.eu-west-1.compute.internal,10.0.27.91' (ECDSA) to the list of known hosts.

       __|  __|_  )
       _|  (     /   Amazon Linux 2 AMI
      ___|\___|___|

https://aws.amazon.com/amazon-linux-2/
No packages needed for security; 3 packages available
Run "sudo yum update" to apply all updates.
[max.williams@ip-10-0-27-91 ~]$ sudo -i
[root@ip-10-0-27-91 ~]# ulimit
unlimited

Let’s hope this issue doesn’t come back again 🙏
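
One caveat with the transcript above: in bash, ulimit with no arguments reports the file-size limit rather than open files, so a check scoped to the open-files limit would look like this:

ulimit -n      # soft open-files limit for the current shell
ulimit -n -H   # hard open-files limit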

Is this still an issue? The latest AMI version, currently v20190614, doesn’t have any additional ulimit configuration. The OPTIONS line is no longer in /etc/sysconfig/docker and I see no other ulimit tweaks in this repo.

The default systemd unit file for dockerd in /usr/lib/systemd/system/docker.service (elided for brevity) sets the defaults to infinity:

[Service]
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

Validated by looking at /proc/$(pidof dockerd)/limits:

Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             unlimited            unlimited            processes
Max open files            65536                65536                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       3830                 3830                 signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
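
A complementary check, not from the comment above, is to ask systemd directly which limits it configured for the docker unit:

systemctl show docker --property=LimitNOFILE,LimitNPROC,LimitCORE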

I plan to resolve this issue around July 10 if there is no confirmation that this is still an issue.

@echoboomer the inotify limit seems like it’s independent of this issue. Would you mind opening that in a new issue so we can track a fix for it outside of this one? Thanks.