Flatcar: containerd-shim processes are leaking inotify instances with cgroups v2
This is a duplicate of https://github.com/containerd/containerd/issues/5670
But I wanted to raise an issue with Flatcar anyway:
- For visibility to other Flatcar users who might run into it
- Perhaps other users don’t see this issue or have found a workaround
Since 2983.2.0 defaults to cgroups v2, we saw this issue frequently enough that we had to roll back.
A client application process might log something like this:
failed to create fsnotify watcher: too many open files
You can mitigate the issue by raising the default, e.g. fs.inotify.max_user_instances=8192, but sooner or later nodes still run out…
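As a stopgap, the higher limit can be applied with a sysctl.d drop-in; a minimal sketch (the file name is illustrative, and on Flatcar it could equally be shipped via Ignition/Butane):

```
# /etc/sysctl.d/90-inotify.conf  (file name is illustrative)
# Raise the per-user limit on inotify instances so CrashLooping pods take
# longer to exhaust it. This only delays the problem; it does not fix the leak.
fs.inotify.max_user_instances = 8192
```

Apply it on a running node with `sysctl --system`, or use `sysctl -w fs.inotify.max_user_instances=8192` for a one-off change.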
About this issue
- State: closed
- Created 3 years ago
- Reactions: 3
- Comments: 18 (10 by maintainers)
Thanks for the upstream bug reference, this is very easily reproducible (start a pod with /bin/false as the command under k8s; every CrashLoop restart leaks an inotify instance and a goroutine blocked in inotify_read). I’m testing a fix and will submit an upstream bugfix once I’ve validated it.
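For reference, a minimal manifest along the lines of the reproduction described above should trigger the leak (pod and container names are illustrative):

```yaml
# Pod whose container exits immediately; under the affected containerd shim,
# each CrashLoop restart leaks one inotify instance on the node.
apiVersion: v1
kind: Pod
metadata:
  name: inotify-leak-repro
spec:
  restartPolicy: Always
  containers:
  - name: fail
    image: busybox
    command: ["/bin/false"]
```

The growth can be observed on the node by counting inotify file descriptors, e.g. `sudo find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l`.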
The inotify leak fix has been merged and is part of containerd 1.6.0. This will be a part of the next alpha release (https://github.com/flatcar-linux/coreos-overlay/pull/1650).
I guess the comment wording (also in https://github.com/containerd/containerd/blob/main/docs/ops.md#linux-runtime-plugin) was done that way to match the config name and what it does when enabled, not the value false that is set…

Hi, thanks for raising this here. One first question I have for a workaround is whether the shim is really required, because I read a comment that it was only needed for live restore. It would be good to try setting
no_shim = true
following https://www.flatcar-linux.org/docs/latest/container-runtimes/customizing-docker/#use-a-custom-containerd-configuration
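For anyone who wants to try that, a sketch of what the custom containerd config could look like, based on the Linux runtime plugin section of the ops.md linked above (the exact table name and whether no_shim applies depend on the containerd version and which runtime the CRI plugin is configured to use):

```toml
# Sketch of a custom containerd config.toml; only the legacy Linux runtime
# plugin section is shown. no_shim = true runs containers without the
# per-container shim process.
version = 2

[plugins."io.containerd.runtime.v1.linux"]
  no_shim = true
```

Note that this trades away what the shim is there for, i.e. keeping containers alive across containerd restarts (the live-restore behaviour mentioned above).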