Flatcar: containerd-shim processes are leaking inotify instances with cgroups v2

This is a duplicate of https://github.com/containerd/containerd/issues/5670

But I wanted to raise an issue with Flatcar anyway:

  1. For visibility to other Flatcar users who might run into it
  2. In case other users don’t see this issue, or have already found a workaround

Since 2983.2.0 defaults to cgroups v2, we hit this issue frequently enough that we had to roll back.

The client application process might log something like this:

failed to create fsnotify watcher: too many open files

You can lessen the issue by raising the default, e.g. fs.inotify.max_user_instances=8192, but sooner or later nodes still run out…
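To see which processes are holding the instances and to raise the limit, something along these lines works (the counting one-liner and the sysctl.d file name are just examples, not from the original report):

    # Count inotify instances per PID; the leaking containerd-shim processes
    # show up at the top of the list
    sudo find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null \
      | cut -d/ -f3 | sort | uniq -c | sort -rn | head

    # Raise the per-user instance limit for the running system...
    sudo sysctl -w fs.inotify.max_user_instances=8192

    # ...and persist it across reboots (the file name is arbitrary)
    echo 'fs.inotify.max_user_instances=8192' | sudo tee /etc/sysctl.d/90-inotify.conf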

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 18 (10 by maintainers)

Most upvoted comments

Thanks for the upstream bug reference. This is very easy to reproduce: start a pod with /bin/false as the command under k8s, and every CrashLoop leaks an inotify instance plus a goroutine blocked in inotify_read. I’m testing a fix and will submit an upstream bugfix once I’ve validated it.
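For anyone who wants to try the reproduction themselves, a rough sketch (the pod name and image are arbitrary, and the counting command is just one way to observe the leak):

    # Pod whose container exits immediately -> kubelet keeps restarting it
    # (CrashLoopBackOff), and each restart leaks one inotify instance in the shim
    kubectl run crashloop --image=busybox --restart=Always -- /bin/false

    # On the node hosting the pod, watch the total number of inotify instances
    # climb with every restart
    watch -n 5 "sudo find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l"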

The inotify leak fix has been merged and is part of containerd 1.6.0, which will ship in the next Alpha release (https://github.com/flatcar-linux/coreos-overlay/pull/1650).

I guess the comment wording (also in https://github.com/containerd/containerd/blob/main/docs/ops.md#linux-runtime-plugin) was chosen to match the config name and what it does when enabled, rather than the default value of false that is actually set…

Hi, thanks for raising this here. A first question regarding a workaround is whether the shim is really required, because I read a comment that it is only needed for live restore. It would be good to try setting no_shim = true following https://www.flatcar-linux.org/docs/latest/container-runtimes/customizing-docker/#use-a-custom-containerd-configuration
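To make that concrete, a rough sketch of trying the suggestion, assuming the custom-config mechanism from the linked doc (the source path below is a placeholder, and the full stock config needs to be carried over; only the no_shim value changes):

    # Copy the stock containerd config to a writable location (placeholder path;
    # the linked Flatcar doc describes where the shipped config lives)
    sudo mkdir -p /etc/containerd
    sudo cp /path/to/stock/containerd-config.toml /etc/containerd/config.toml

    # In the copy, enable no_shim under containerd's v1 linux runtime plugin:
    #   [plugins."io.containerd.runtime.v1.linux"]
    #     no_shim = true
    # then point containerd at the copy (systemd drop-in passing --config, per
    # the linked doc) and restart it
    sudo systemctl restart containerd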