containerd: Inconsistent state on pod termination
Description
We just had an issue with containerd: an application was killed several times by the oom killer because it reached its cgroup memory limit. Containers on the host are now in a really weird state:
- Containers look OK according to `crictl ps`
- `crictl exec` fails with `cannot exec in a stopped state: unknown`
- `ctr -n k8s.io t ls` hangs without any output
- `ps auxf` shows many `containerd-shim` processes without any child process (or sometimes only the pause container)
- `runc --root /run/containerd/runc/k8s.io list` shows some containers in the `stopped` state
- The associated `containerd-shim` processes are still running without any children
It seems that sometimes, when a container process is OOM-killed because it has reached its cgroup memory limit, the containerd state becomes inconsistent. Once this has happened it is no longer possible to delete containers. When trying to delete a pod, the containerd logs show the following (see the commands after this list for one way to observe it):
- containerd tries to stop it (`StopContainer`)
- `stop container xx timed out`
- then `error="an error occurs during waiting for container xxx to stop: wait container xxx is cancelled"`
- the container is stopped but not removed
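Roughly how we observe that sequence, as a sketch assuming containerd runs as a systemd unit on the node (`<pod>` stands for whichever pod is stuck; the grep pattern is just a convenience, not an exact match on the log format):

```
# Ask Kubernetes to delete the affected pod
kubectl delete pod <pod>

# On the node, follow the containerd logs while the deletion hangs
journalctl -u containerd -f | grep -iE 'StopContainer|timed out|cancelled'

# The container stays listed: stopped, but never removed
crictl ps -a
```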
Steps to reproduce the issue:
- Run kubernetes using containerd as CRI
- Create a pod with a memory limit
- Allocate more memory than the limit (a minimal example follows this list)
- After several OOM kills, it should no longer be possible to interact with containerd
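A minimal sketch of such a pod, assuming `kubectl` access to a cluster whose nodes run containerd as the CRI (the pod name, image, limit, and allocation command are illustrative, not taken from the original report):

```
# Pod with a small memory limit whose main process keeps allocating
# memory; the kernel OOM killer terminates it each time the cgroup
# limit is hit, and the kubelet restarts it, producing repeated OOM kills.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: oom-repro          # hypothetical name
spec:
  containers:
  - name: hog
    image: busybox
    # "tail /dev/zero" buffers its input forever, so memory grows
    # until the cgroup limit is reached
    command: ["tail", "/dev/zero"]
    resources:
      limits:
        memory: "64Mi"
EOF

# Watch the OOM kills and restarts, then try to interact with containerd
# on the node that runs the pod:
kubectl get pod oom-repro -w
crictl ps
ctr -n k8s.io t ls
```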
Describe the results you received: containerd seems to be stuck in an inconsistent state and is no longer able to fulfill CRI requests
Describe the results you expected: containerd should clean up OOM-killed containers and remain in a consistent state
Output of `containerd --version`:
```
containerd github.com/containerd/containerd v1.1.0 209a7fc3e4a32ef71a8c7b50c68fc8398415badf
```
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 25 (11 by maintainers)
@crosbymichael @lbernail and I spent some time debugging this issue last week, and we found a suspicious stack dump.
containerd stack dump:
containerd-shim stack dump:
Our current theory is that this can happen if an exec process forks another process and the new process keeps the IO open after the exec process dies. I haven’t reproduced this yet, but based on the use case described by @lbernail, it is quite possible that this is what happened.
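If that theory is right, something like the following should exercise the same path (an unverified sketch; `<pod>` is a placeholder and the observed hang is an assumption based on the theory, not a confirmed result):

```
# Start an exec whose child outlives it: the backgrounded sleep inherits
# the shell's stdout/stderr, which are the exec's IO pipes, and keeps
# them open after the shell (the exec process) exits.
# If the shim waits for the IO to be closed before reaping the exec,
# this kubectl call may hang until the sleep finishes.
kubectl exec <pod> -- sh -c 'sleep 3600 &'
```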
Yeah, I’ll have to look into this.
I’m using GKE, which is using 1.1.0.