Flatcar: containerd-shim processes are leaking inotify instances with cgroups v2

This is a duplicate of https://github.com/containerd/containerd/issues/5670

But I wanted to raise an issue with Flatcar anyway:

  1. For visibility to other Flatcar users who might run into it
  2. In case other users don’t see this issue, or have already found a workaround

Since 2983.2.0 defaults to cgroups v2, we hit this issue frequently enough that we had to roll back.

The client application process might log something like this:

failed to create fsnotify watcher: too many open files

You can lessen the issue by raising the default, e.g. fs.inotify.max_user_instances=8192, but sooner or later nodes still run out…
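To see which processes are holding the instances and to raise the limit, something along these lines works (the counting one-liner and the sysctl.d file name are just examples, not from the original report):

    # Count inotify instances per PID; the leaking containerd-shim processes
    # show up at the top of the list
    sudo find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null \
      | cut -d/ -f3 | sort | uniq -c | sort -rn | head

    # Raise the per-user instance limit for the running system...
    sudo sysctl -w fs.inotify.max_user_instances=8192

    # ...and persist it across reboots (the file name is arbitrary)
    echo 'fs.inotify.max_user_instances=8192' | sudo tee /etc/sysctl.d/90-inotify.conf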

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 18 (10 by maintainers)

Most upvoted comments

Thanks for the upstream bug reference. This is very easy to reproduce: start a pod with /bin/false as the command under k8s, and every CrashLoop leaks an inotify instance plus a goroutine blocked in inotify_read. I’m testing a fix and will submit an upstream bugfix once I’ve validated it.
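For anyone who wants to try the reproduction themselves, a rough sketch (the pod name and image are arbitrary, and the counting command is just one way to observe the leak):

    # Pod whose container exits immediately -> kubelet keeps restarting it
    # (CrashLoopBackOff), and each restart leaks one inotify instance in the shim
    kubectl run crashloop --image=busybox --restart=Always -- /bin/false

    # On the node hosting the pod, watch the total number of inotify instances
    # climb with every restart
    watch -n 5 "sudo find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l"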

The inotify leak fix has been merged and is part of containerd 1.6.0, which will ship in the next Alpha release (https://github.com/flatcar-linux/coreos-overlay/pull/1650).

I guess the comment wording (also in https://github.com/containerd/containerd/blob/main/docs/ops.md#linux-runtime-plugin) was chosen to match the config name and what it does when enabled, rather than the default value of false that is actually set…

Hi, thanks for raising this here. A first question regarding a workaround is whether the shim is really required, because I read a comment that it is only needed for live restore. It would be good to try setting no_shim = true following https://www.flatcar-linux.org/docs/latest/container-runtimes/customizing-docker/#use-a-custom-containerd-configuration
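To make that concrete, a rough sketch of trying the suggestion, assuming the custom-config mechanism from the linked doc (the source path below is a placeholder, and the full stock config needs to be carried over; only the no_shim value changes):

    # Copy the stock containerd config to a writable location (placeholder path;
    # the linked Flatcar doc describes where the shipped config lives)
    sudo mkdir -p /etc/containerd
    sudo cp /path/to/stock/containerd-config.toml /etc/containerd/config.toml

    # In the copy, enable no_shim under containerd's v1 linux runtime plugin:
    #   [plugins."io.containerd.runtime.v1.linux"]
    #     no_shim = true
    # then point containerd at the copy (systemd drop-in passing --config, per
    # the linked doc) and restart it
    sudo systemctl restart containerd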