podman: [v.low] [usability] OOM results in "cannot start a container that has stopped"
This is very low priority: I'm hoping there's a way to provide a more helpful diagnostic for this error.
Setup: VM (m1.medium) with no swap. Force an OOM:
# podman --cgroup-manager=cgroupfs run -d -i -m 5246976 registry.access.redhat.com/rhel7/rhel:latest /bin/bash
a5f0b470febacb36beb34be61ea35011eb5f07fd83d2ea52dbc426d06b92189b
# podman --cgroup-manager=cgroupfs run -d -i -m 5246976 registry.access.redhat.com/rhel7/rhel:latest /bin/bash
cannot start a container that has stopped
`/usr/bin/runc start 447c64007114501935cad8be3cf0be0d02f18b6b421f0f468a7dd9e5a86454b1` failed: exit status 1
(Normally the second run fails, but sometimes it takes three.) The actual error comes from runc, and the underlying cause is an OOM kill. The system log shows:
Oct 02 16:22:41 esm-f29-1 kernel: Memory cgroup out of memory: Kill process 5379 (runc:[2:INIT]) score 1791 or sacrifice child
Oct 02 16:22:41 esm-f29-1 kernel: Killed process 5379 (runc:[2:INIT]) total-vm:481600kB, anon-rss:4296kB, file-rss:4756kB, shmem-rss:0kB
Oct 02 16:22:41 esm-f29-1 kernel: oom_reaper: reaped process 5379 (runc:[2:INIT]), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
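(For anyone who lands here from a web search: a quick way to confirm that an OOM kill is the cause, rather than something in podman or runc itself, is to look for memory-cgroup OOM messages in the kernel log around the time of the failed start:)
# journalctl -k | grep -i "memory cgroup out of memory"
or, on a system without journald:
# dmesg | grep -i "out of memory"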
This was entirely my fault: a mistake while cleaning up my Ansible setup code left the system without swap. The error message, though, sent me on some wild-goose chases. Is there any low-effort way to catch this and offer a more helpful message? If not, please close; my other reason for filing is so that someone in the future can find this via a web search. TIA.
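To illustrate the kind of low-effort check I have in mind (the cgroup path below is my guess for --cgroup-manager=cgroupfs and will vary by setup): with cgroup v1, the container's memory cgroup exposes an oom_kill counter in memory.oom_control, so after a failed runc start podman could peek at it and mention the OOM in the error:
# cat /sys/fs/cgroup/memory/libpod_parent/libpod-<container-id>/memory.oom_control
oom_kill_disable 0
under_oom 0
oom_kill 1
A non-zero oom_kill there would justify something like "container init was OOM-killed (memory limit too low?)" instead of the current "cannot start a container that has stopped".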
podman-0.9.3.1-1.git1cd906d.fc29.x86_64
runc-1.0.0-55.dev.git578fe65.fc29.x86_64
@haircommander Could you take a look at this?