containerd: 1.5.10 causes memory leak in mysql container; regression from 1.4.13
Description
I updated containerd.io on Fedora from 1.4.13 to 1.5.10. After this, all mysql docker containers across multiple projects began using 20+GB of memory where before they used <300MB. I restarted the computer and confirmed the problem persisted. I then rolled back to 1.4.13 and the problem was gone.
Steps to reproduce the issue
- Run a docker container with linux/amd64 arch with
ENV MYSQL_VERSION=5.7.36-0ubuntu0.18.04.1
- Start mysqld inside the container
- Observe massive memory usage in 1.5.10
Describe the results you received and expected
No regression
What version of containerd are you using?
1.5.10
Any other relevant information
runc version 1.0.3 commit: v1.0.3-0-gf46b6ba spec: 1.0.2-dev go: go1.16.15 libseccomp: 2.5.3
Linux localhost.localdomain 5.16.7-200.fc35.x86_64 #1 SMP PREEMPT Sun Feb 6 19:53:54 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Show configuration if it is related to CRI plugin.
No response
About this issue
- State: open
- Created 2 years ago
- Reactions: 8
- Comments: 28 (7 by maintainers)
So to summarize, this seemingly helped me on Fedora 37
I installed 1.5 and confirmed the problem still persisted. I then modified
/lib/systemd/system/containerd.service
to read LimitNOFILE=1048576. I then restarted the service and the daemon:
sudo systemctl restart service containerd
sudo systemctl daemon-reload
The problem still persisted. I then restarted the service once more, because I thought the daemon might have cached the service (so in total I restarted service, daemon, service), and the problem was gone. Docker stable repos don’t have 1.6 available, so I didn’t test that.
Just to give this topic a little push here. I had the same problem. In our docker-compose file we need 3 different mysql servers, and as they had no memory limits set they killed my whole system (Fedora 37). A simple docker-compose up led to a system freeze, and only a hard reset helped.
Changing /lib/systemd/system/containerd.service to set LimitNOFILE=1048576 helped!
What worked for me on Fedora 36: sudo vim /lib/systemd/system/containerd.service ->
Then:
run again: docker run --name=mysql1 -d mysql/mysql-server:5.7
After that it worked for me, whereas with the value set to unlimited the container crashed in seconds. This change does not seem to be needed on Ubuntu and derivatives. Is there anything else messing this up?
Seems there is a bug in systemd: if you need more than 65535 files, you need to change LimitNOFILE manually to a bigger number, such as 1048576.
https://github.com/systemd/systemd/issues/6559
Maybe we need to revert this PR 4475
I haven’t tried to reproduce the mysql one, but this has been a problem affecting a lot of software that expects the considerably lower limits found outside of containers.
Default is 1024 soft limit. docker.service and containerd.service have overridden that to infinity, which can be as high as over 1 billion. Usually that just stalls the software for a much longer duration (sometimes hours), but some software affected by this may also do some allocation, and as you can guess, 1e3 vs 1e9 is quite the delta, a million times more. So if your software would normally allocate 1MB in this scenario and that allocation code was affected by this much larger limit, it’d now use 1TB.
Normally it’s just CPU from tasks like iterating through the limit range and closing potentially (highly unlikely) open file descriptors.
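To make the scaling argument above concrete, here is a minimal shell sketch that compares a soft limit against the 1024 default mentioned in the comment. The helper name nofile_scale is hypothetical and only for illustration; it is not part of Docker, containerd, or systemd.

```shell
#!/bin/sh
# Hypothetical helper: how many times larger a soft nofile limit is than
# the conventional default of 1024 (integer division).
nofile_scale() {
    echo $(( $1 / 1024 ))
}

# Report the soft limit this shell is actually running with.
current=$(ulimit -Sn)
if [ "$current" = "unlimited" ]; then
    echo "soft nofile limit: unlimited"
else
    echo "soft nofile limit: $current ($(nofile_scale "$current")x the 1024 default)"
fi
```

For example, nofile_scale 1048576 prints 1024. Software that sizes an allocation or a close-all-descriptors loop by the limit would do roughly that many times more work than it would under the default.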
True. In some affected projects that I used, I helped track down the problem area and implement a workaround. However, outside of container environments, where the limits are sane and typically only raised when necessary, you’re not likely to encounter this sort of problem. The bug is more to do with the misconfiguration of LimitNOFILE=infinity being applied to the containers (and each process in those containers being run with a soft limit well above the expected 1024).
In the meantime, you can start the container with the extra option
docker run --ulimit "nofile=1024:524288"
or, for systemd services, set a drop-in override with
systemctl edit containerd.service
with the following:
The same applies to docker.service until a release is out with that updated too. Then run
systemctl restart docker
To verify, run
docker run --rm -it alpine ash -c 'ulimit -Sn && ulimit -Hn'
and you should get 1024 and 524288.
Just a heads-up that the
LimitNOFILE=infinity setting in both docker.service and containerd.service files has finally been removed from their respective projects.
Possibly those changes will be part of new releases before the end of the year:
20.10)
v1.7.8 v2.0 for containerd?
After that happens, this issue can be considered resolved?
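Until such releases are installed, the drop-in override suggested earlier in the thread could be sketched as below. The exact override contents were not preserved in this thread, so this assumes the same 1024:524288 soft:hard pair as the --ulimit example. The file name nofile.conf and the DROPIN_ROOT staging variable are hypothetical.

```shell
#!/bin/sh
# Assumption: the drop-in mirrors the soft:hard pair from the --ulimit example.
# DROPIN_ROOT and nofile.conf are hypothetical names. By default the files are
# written to a temporary directory; point DROPIN_ROOT at /etc/systemd/system
# (as root) to apply them for real.
DROPIN_ROOT="${DROPIN_ROOT:-$(mktemp -d)}"

write_nofile_dropin() {
    # $1 = unit name, e.g. containerd.service
    dir="$DROPIN_ROOT/$1.d"
    mkdir -p "$dir"
    printf '[Service]\nLimitNOFILE=1024:524288\n' > "$dir/nofile.conf"
    echo "$dir/nofile.conf"   # print the path that was written
}

write_nofile_dropin containerd.service
write_nofile_dropin docker.service
```

Writing a drop-in rather than editing /lib/systemd/system/containerd.service directly keeps the change from being overwritten when the package is updated.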
I’m having the same issue on Manjaro (kernel 6.1.12-1), and the LimitNOFILE=1048576 change made it work again.
MySQL has this bug: https://bugs.mysql.com/bug.php?id=96525.
I wanted to add my own experience with this issue: apparently I can only reproduce it if swap is enabled. Starting the container with no swap (a.k.a. running swapoff <device>) makes the container work just fine, and I can then reenable swap later.
@ThatCoffeeGuy thank you! This solved the same problem in my distro (Manjaro Linux)!
FYI this also breaks cups.
Hello. I’m experiencing the same issue.
Launching mysql:8.* works great, but mysql:5.7.* causes immediate 100% memory consumption (htop), and results in the following in /var/log/messages:
Versions:
Limits:
Creating the following custom limit for containerd does not resolve the issue. /etc/systemd/system/containerd.service.d/custom.conf
Note: Previously, systemctl show containerd | grep LimitNOFILE would report ‘unlimited’.
This change did not resolve the issue as suggested here: https://github.com/containerd/containerd/issues/3201
I think you should do daemon-reload first and then restart containerd. And it only makes sure that it works for new containers, not for existing ones.
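The ordering suggested above can be sketched as follows. This is a dry run by default: it only prints the commands. The APPLY switch and the helper names are hypothetical, and actually running the commands needs root.

```shell
#!/bin/sh
# Dry-run wrapper: print the command unless APPLY=1 is set (hypothetical switch).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

apply_limit_change() {
    run systemctl daemon-reload                            # re-read unit files and drop-ins first
    run systemctl restart containerd                       # then restart so the new limit takes effect
    run systemctl show containerd --property=LimitNOFILE   # confirm what systemd now reports
}

apply_limit_change
```

As the comment notes, the new limit only applies to containers created afterwards, so existing containers would need to be recreated.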
Yes, can confirm that this is a valid workaround. It seems very likely that the increase to infinity nofile size is the problem on some systems.
I have the same issue
I implemented a temporary solution using ulimit, as in this comment. It worked for me. https://github.com/docker-library/mysql/issues/579#issuecomment-1074069882