cri-o: creating read-write layer with ID No such file or directory - crio reading from overlay instead of overlay2?
Description
We have a Kubernetes cluster (1.16.9) in which one of the nodes uses cri-o as its container runtime (cri-o://1.16.6). From time to time, a weird error blocks pods from starting:
Warning FailedCreatePodSandBox 2s (x4 over 43s) kubelet, kube6 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = error creating pod sandbox with name "some-sandbox-name": error creating read-write layer with ID "a5021e65186da551b712f7dd743d712833e5f75fc727c6f937d421897d2eb9d6": Stat /var/lib/containers/storage/overlay/e17133b79956ad6f69ae7f775badd1c11bad2fc64f0529cab863b9d12fbaa5c4: no such file or directory
When I check that path, it indeed doesn't exist, but:
- crio is set to use overlay2, so I'm not sure why it tries to load the layer from /var/lib/containers/storage/overlay
- When I check the same layer ID under overlay2 (/var/lib/containers/storage/overlay2/e17133b79956ad6f69ae7f775badd1c11bad2fc64f0529cab863b9d12fbaa5c4), it does exist.
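For anyone wanting to repeat the check described above, here is a minimal sketch. The helper name find_layer_dirs is hypothetical (it is not part of crio or crictl); it only tests whether a directory named after the layer ID exists under the overlay and overlay2 subdirectories of a given storage root, and prints the ones it finds.

```shell
#!/bin/sh
# Hypothetical helper (not a crio/crictl command): print every driver
# directory under a storage root that contains the given layer ID.
find_layer_dirs() {
    root=$1
    layer_id=$2
    for driver in overlay overlay2; do
        # Only report paths that actually exist as directories.
        if [ -d "$root/$driver/$layer_id" ]; then
            echo "$root/$driver/$layer_id"
        fi
    done
}

# Storage root and layer ID taken from the error message in this issue.
find_layer_dirs /var/lib/containers/storage \
    e17133b79956ad6f69ae7f775badd1c11bad2fc64f0529cab863b9d12fbaa5c4
```

On the node described here, this would be expected to print only the overlay2 path; it prints nothing if the layer is in neither location.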
Is this some stale layer ref issue? If so, where should I look to clean it up? Can crictl validate the layer tree and remove stale data? What other reasons could there be for such behaviour?
Steps to reproduce the issue:
- Create some pod?
Describe the results you received: Pod never starts
Describe the results you expected: Pod should run
Additional information you deem important (e.g. issue happens only occasionally):
Output of crio --version:
crio version 1.16.6
commit: "af8faf448858335f9645b896120167d08caf7156-dirty"
Our cluster runs on bare metal. The node with crio on it is:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
...
kube6 Ready node 512d v1.16.9 10.200.0.15 <none> Ubuntu 18.04.4 LTS 5.3.0-51-generic cri-o://1.16.6
...
About this issue
- State: closed
- Created 4 years ago
- Comments: 51 (24 by maintainers)
OK: full system reset then. I would stop cri-o, reboot the node, run rm -rf /var/{run,lib}/containers, then start kubelet and cri-o.
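Spelled out step by step, the reset might look roughly like the sketch below. It is a dry run: the function only prints the commands and executes nothing, since the rm step deletes all container storage on the node. The systemd unit names crio and kubelet are assumptions about this setup, print_reset_plan is a hypothetical name, and the sketch does not handle services that start automatically at boot. The brace pattern /var/{run,lib}/containers is written out as two paths because plain sh does not expand braces.

```shell
#!/bin/sh
# Dry run only: prints the reset steps from the comment above, runs nothing.
# ASSUMPTION: the services are systemd units named "crio" and "kubelet".
print_reset_plan() {
    cat <<'EOF'
systemctl stop crio
reboot
# after the node is back up:
rm -rf /var/run/containers /var/lib/containers
systemctl start kubelet
systemctl start crio
EOF
}

print_reset_plan
```

Review the printed commands and run them by hand as root, one at a time, only once it is acceptable to lose every image and container stored on that node.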