moby: Kernel not freeing memory cgroup causing no space left on device

I’m seeing errors that point to cgroups running out of space. When starting containers, I get this error:

"oci runtime error: process_linux.go:258: applying cgroup configuration for process caused "mkdir /sys/fs/cgroup/memory/docker/406cfca0c0a597091854c256a3bb2f09261ecbf86e98805414752150b11eb13a: no space left on device""

The servers have plenty of disk space and inodes. The containers’ cgroup mount is read-only, so nothing should be filling up that area of the disk.

Do cgroup limits exist? If so, what are they?

UPDATE:

$ docker info
Containers: 101
 Running: 60
 Paused: 0
 Stopped: 41
Images: 73
Server Version: 1.12.3
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host bridge null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.6.0-040600-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 40
Total Memory: 251.8 GiB
Name: sd-87633
ID: YDD7:FC5T:DCP3:ZDZO:UWP4:ZR5V:SENB:GK6N:NJGF:FB3J:T5G4:OJPZ
Docker Root Dir: /home/docker/data
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

$ uname -a
Linux sd-87633 4.6.0-040600-generic #201606100558 SMP Fri Jun 10 10:01:15 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ docker version
Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Wed Oct 26 22:01:48 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Wed Oct 26 22:01:48 2016
 OS/Arch:      linux/amd64

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 24
  • Comments: 83 (43 by maintainers)

Most upvoted comments

OK, a better workaround (than to reboot an affected node) is to do

echo 3 > /proc/sys/vm/drop_caches

periodically, e.g. from cron:

6 */12 * * * root echo 3 > /proc/sys/vm/drop_caches

I’ll be looking for something better, but it’s a kernel bug and there’s not much that we can do :-\

We are hitting the same bug. Is this issue resolved? As I understand it, the workaround for now is to keep restarting the node when we hit it, but that’s not feasible.

@martinlevesque the kernel keeps track of page cache entries used by processes inside a container (belonging to a particular memory cgroup). A cgroup ceases to exist not when there are zero processes in it, but rather when its usage drops to zero (i.e. when all of its memory is freed). Due to the shared nature of the page cache, and the way current kernels work, some page cache entries might still be charged to a particular memory cgroup when a container exits, leaving usage counters greater than zero and a “dangling” cgroup as a result.

Using drop_caches, we ask the kernel to shrink the page cache, forcing those entries to be removed. It might hurt overall performance in the short term (some blocks may need to be re-read from disk later rather than served from the page cache), but the result is fewer entries in the page cache, and thus a chance for those “dangling” cgroups’ usage counters to drop to zero so the cgroups can be released.

You might use drop_caches once the number of cgroups is dangerously close to the limit, or periodically from cron, or every N container starts. Yes, this is a dirty hack, not a good solution (and yet it is much better than a restart).
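
A minimal sketch of the “close to the limit” variant, assuming the usual 16-bit memcg ID ceiling (roughly 65k memory cgroups); the threshold below is an arbitrary choice, not a value from this thread:

#!/bin/sh
# Drop caches only when the number of memory cgroups approaches the limit.
LIMIT=65535        # assumed ceiling: memcg IDs are 16-bit
THRESHOLD=60000    # arbitrary "dangerously close" mark

current=$(awk '$1 == "memory" { print $3 }' /proc/cgroups)   # num_cgroups column

if [ "$current" -ge "$THRESHOLD" ]; then
    echo "memory cgroups: $current of $LIMIT, dropping caches"
    echo 3 > /proc/sys/vm/drop_caches
fi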

The real solution is to use a kernel with the above-mentioned patches (or to backport those patches to the kernel you use).

Another possible workaround might be to not enable kernel memory accounting for all containers (i.e. reverting https://github.com/opencontainers/runc/pull/1350).

It might also be possible to use drop_caches right from runc itself (which is an ugly hack, but might be marginally better than using drop_caches from cron).
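
To check whether kernel memory accounting is in effect for a given container, you can look at the kmem counters in its memory cgroup; a non-zero usage suggests kmem accounting was enabled for that cgroup. This is just a sketch: the container name is a placeholder, and the path assumes cgroup v1 with the cgroupfs driver.

CID=$(docker inspect --format '{{.Id}}' <container-name>)
cat /sys/fs/cgroup/memory/docker/$CID/memory.kmem.limit_in_bytes
cat /sys/fs/cgroup/memory/docker/$CID/memory.kmem.usage_in_bytes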

We came across the same issue on Docker 17.09, kernel 3.10.0. We tried to clear the memory with echo 1 > /proc/sys/vm/drop_caches, but that did not work. In the end we restarted the server, which solved the problem. However, we cannot restart the server every time, so does anyone have a solution that doesn’t require restarting the system?

After running Docker smoothly for over 2 years I’m getting this issue as well. Worse, I restarted and I still have 6 containers (out of 45) that are not starting for this reason!

Humm

I have been running this setup for over a year now. The 6 containers that are not starting are all caddy 0.10.14 containers. I have other caddy containers that run normally.

All the commands I ran

uname -a; echo; echo;
docker info; echo; echo;
docker version; echo; echo;
docker ps -a | wc -l; echo; echo;
ls -l -F /sys/fs/cgroup/memory/docker/ | grep / | wc -l; echo; echo;
mount | wc -l; echo; echo;
cat /proc/cgroups | grep memory; echo; echo;
cat /proc/self/mountinfo | wc -l; echo; echo;
ls -1 /sys/fs/cgroup/cpuset/docker | wc -l; echo; echo;
find /sys/fs/cgroup/memory -type d ! -path '/sys/fs/cgroup/memory/docker*' | wc -l

Results

root@my-vps:~/deploy-setup# uname -a; echo; echo;
Linux my-vps 4.10.0-24-generic #28-Ubuntu SMP Wed Jun 14 08:14:34 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

root@my-vps:~/deploy-setup# docker info; echo; echo;

Containers: 45
 Running: 44
 Paused: 0
 Stopped: 1
Images: 49
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: fmbi1a5nn9sp5o4qy3eyazeq5
 Is Manager: true
 ClusterID: lzc3rrzjgu41053qywhym8jdg
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 123.123.123.23
 Manager Addresses:
  123.123.123.23:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.10.0-24-generic
Operating System: Ubuntu 16.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.782GiB
Name: my-vps
ID: X5WW:PFNN:WZU7:OMCH:EXFN:N6TL:KMS4:GEHQ:WJLZ:J7DS:IHWX:I5JZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: devmtl
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

root@my-vps:~/deploy-setup# docker version; echo; echo;
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:56 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:23:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false

root@my-vps:~/deploy-setup# docker ps -a | wc -l; echo; echo;
46

root@my-vps:~/deploy-setup# ls -l -F /sys/fs/cgroup/memory/docker/ | grep / | wc -l; echo; echo;
44

root@my-vps:~/deploy-setup# mount | wc -l; echo; echo;
172

root@my-vps:~/deploy-setup# cat /proc/cgroups | grep memory; echo; echo;
memory	2	278	1

root@my-vps:~/deploy-setup# cat /proc/self/mountinfo | wc -l; echo; echo;
172

root@my-vps:~/deploy-setup# ls -1 /sys/fs/cgroup/cpuset/docker | wc -l; echo; echo;
61

root@my-vps:~/deploy-setup# find /sys/fs/cgroup/memory -type d ! -path '/sys/fs/cgroup/memory/docker*' | wc -l
201

There are no guarantees, but I’ve tested this for over half a year and have tried Linux versions 3.10, 4.4, 4.14, 4.18, and 5.0. It seems that updating the kernel to 4.18 or higher fixes this issue.

I tested this by creating a deployment that keeps launching containers that request more memory than their limit, so they are OOM-killed as soon as they start. On kernel 4.14 or lower, this eventually causes the docker command, or even ps, to hang; the only way I could recover from that hang was to reboot the whole node.

This never happened again after I upgraded to kernel 4.18.
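
A rough way to reproduce something along those lines (the stress image, loop count, and sizes below are placeholders, not the deployment described above) is to repeatedly start containers whose workload exceeds their memory limit and watch the memory-cgroup count in /proc/cgroups climb:

# Each run allocates more memory than its 50 MB limit, so the container is OOM-killed.
for i in $(seq 1 500); do
    docker run --rm -m 50m progrium/stress --vm 1 --vm-bytes 128M --timeout 5s
    awk '$1 == "memory" { print "memory cgroups:", $3 }' /proc/cgroups
done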

For background on how ps can hang forever, here is an article.

Yes, it’s exactly the kernel 3.10 bug. I tested on CentOS 7.4 (3.10.0-693.11.1.el7); the latest official stable kernel has the same problem. Many legacy systems in my production environment depend on the CentOS 7.x kernel, so we can’t upgrade to a 4.x kernel.

We are hitting the same error and found the reason through a lot of testing. The root cause is a kernel cgroup bug. (My operating system is CentOS 7.3 with the 3.10.0-514.10.2.el7.x86_64 kernel.)

If you create a container with the cgroup kernel memory option enabled, you will hit this bug. When you delete the container, you may see the cgroup memory count decrease as expected, but if you test carefully over a long period you will find that the cgroup kernel memory space actually leaks and is never released. I wrote a test case to reproduce this issue, related to: https://github.com/kubernetes/kubernetes/issues/61937

It can also be reproduced with Docker:

# docker run -d --name test001 --kernel-memory 100M sshdserver:v1 
WARNING: You specified a kernel memory limit on a kernel older than 4.0. Kernel memory limits are experimental on older kernels, it won't work as expected and can cause your system to be unstable.
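
A simple way to observe the leak (the busybox image and loop count here are just for illustration) is to create and delete such containers in a loop and compare the memory controller’s num_cgroups before and after:

before=$(awk '$1 == "memory" { print $3 }' /proc/cgroups)

for i in $(seq 1 100); do
    docker run -d --name kmem-test-$i --kernel-memory 100M busybox sleep 1
    docker rm -f kmem-test-$i
done

after=$(awk '$1 == "memory" { print $3 }' /proc/cgroups)
echo "memory cgroups before=$before after=$after"   # on affected kernels, "after" stays inflated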

We also found that k8s 1.9 enables this cgroup kernel memory feature by default, while k8s 1.6 disables it by default. So if you run k8s 1.9 on CentOS 7.x, you must change the code to disable this option.

More detail can be found at: http://www.linuxfly.org/kubernetes-19-conflict-with-centos7/ (written in Chinese)

I noticed that this issue shows up faster when I create a faulty service (one that always fails on startup) and then set its restart policy to always: wait about 4 days and the issue pops up. Normally it takes about a month until a node starts throwing those errors.

So maybe it’s related to that?
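
For anyone who wants to try that, a sketch of such a setup (the service name, image, and restart delay are placeholders; this assumes a Swarm service whose task always exits non-zero and is restarted unconditionally):

# The task exits immediately with an error, so Swarm keeps restarting it,
# creating a fresh memory cgroup on every attempt.
docker service create --name always-failing \
    --restart-condition any --restart-delay 10s \
    busybox false

# Watch the memory-cgroup count grow over time:
watch -n 60 'grep ^memory /proc/cgroups'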

@kolyshkin Thanks for your valuable feedback. I have a Kubernetes cluster that has been hitting this issue for a year, and I have to reboot each node after 5~7 days of uptime. You can reach me if you need any kind of information.

Regards,

Same issue here, Ubuntu 18.04, docker 17.03, kernel 4.15.0-1025-aws

The memory cgroup count just grows and grows until a reboot is required to launch new containers. In my case, once near the limit I quickly reach 100% CPU utilization and the server becomes unresponsive.

Containers: 66
 Running: 64
 Paused: 2
 Stopped: 0
Images: 32
Server Version: 17.03.3-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6c463891b1ad274d505ae3bb738e530d1df2b3c7
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-1025-aws
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.1 GiB
Name: ip-172-31-17-38
ID: 5Y2E:3AWK:6JVH:Y3K2:CQDR:JN34:V6SR:VZXF:2WWO:R7F2:3GEY:6ECH
Docker Root Dir: /var/lib/evaldocker
Debug Mode (client): false
Debug Mode (server): false
Username: toniceval
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@mlaventure writes:

@BenHall Every directory under the mount point (including the mount point) is considered to be a cgroup. So to get the actual number of cgroups from the FS you would have to run: find /sys/fs/cgroup/memory -type d | wc -l and that should match the number found in /proc/cgroups

It turns out that this is not always the case. I corresponded with a Linux cgroups maintainer (Michal Hocko) recently, who said:

Please note that memcgs are completely removed after the last memory accounted to them disappears. And that happens lazily on the memory pressure. So it is quite possible that this happens much later than the actual rmdir on the memcg.

So, it’s not uncommon for the num_cgroups value in /proc/cgroups to differ from what you might see in lscgroup.
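
A quick way to see that discrepancy on an affected host (lscgroup ships with the libcgroup tools) is to compare the kernel’s counter with what is actually visible in the filesystem:

# num_cgroups as reported by the kernel (per the comment above, this can lag behind rmdir)
awk '$1 == "memory" { print "kernel num_cgroups:", $3 }' /proc/cgroups

# memory cgroup directories that still exist in the filesystem
find /sys/fs/cgroup/memory -type d | wc -l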

@qkboy may be worth opening a ticket with Red Hat to backport the fix to their 3.10.x kernel

There is indeed a kernel memory leak up to the 4.0 kernel release. You can follow this link for details: https://github.com/moby/moby/issues/6479#issuecomment-97503551

Nothing stands out in the mountinfo output, unfortunately. This may be an issue with that version of the kernel, but I haven’t found any reference to a similar issue so far. I’m having a look at the cgroup_rmdir code just in case.

OK, I’ve got some news, good and bad.

The good news is that the problem is supposedly fixed in the v5.3-rc1 kernel (see the patches from Roman Gushchin, on top of earlier patches by Vladimir Davydov). For an overall description, see https://lwn.net/Articles/790384/.

The bad news is that I don’t know what Docker Engine can do to work around the problem on earlier kernels. One approach would be to revert https://github.com/opencontainers/runc/pull/1350 (i.e. not enable kmem accounting by default), but I doubt that would be accepted.

Looking for other alternatives…

About the memory cgroups leaking; reading this comment: https://github.com/moby/moby/issues/24559#issuecomment-232436302

Also tracked here: https://bugzilla.kernel.org/show_bug.cgi?id=124641. The fix is also going to be backported to 4.4: https://lkml.org/lkml/2016/7/13/864

Thanks, restarting helped us too; it resets the number and allows containers to start again.

@BenHall what else do you have under /sys/fs/cgroup/memory/ excluding the docker dir? (something like find /sys/fs/cgroup/memory -type d ! -path '/sys/fs/cgroup/memory/docker*' should work).

You can also check that you don’t have the memory cgroup mounted somewhere else by checking the output of mount or cat /proc/self/mountinfo
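
For example:

mount -t cgroup | grep memory
grep -w memory /proc/self/mountinfo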