cri-o: Error: Image not known after cri-o upgrade
Description
I am updating cri-o across the nodes in my Kubernetes cluster. Prior to upgrade, everything seems to be working fine. After upgrade, some (but not all) workloads will no longer run. They get stuck in a CreateContainerError state. When I describe the pods for these workloads, I see the following:
Successfully assigned wazuh/wazuh-7bb996795-2zjfb to ip-10-240-5-229.eu-central-1.compute.internal
Normal Pulled 5m33s (x8 over 7m4s) kubelet, ip-10-240-5-229.eu-central-1.compute.internal Successfully pulled image "datica/wazuh:3.6.1"
Warning Failed 5m33s (x8 over 7m4s) kubelet, ip-10-240-5-229.eu-central-1.compute.internal Error: image not known
Also, when I attempt to delete the pods, they seem to get stuck in a Terminating state.
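For anyone trying to see how many workloads are affected, here is a minimal sketch that lists containers waiting with the CreateContainerError reason. It assumes the pod list has been exported with `kubectl get pods --all-namespaces -o json` and loaded into a dict. The function name `stuck_containers`, the container name, and the message text in the sample are hypothetical and for illustration only; the namespace and pod name are taken from the events above.

```python
def stuck_containers(pod_list, reason="CreateContainerError"):
    """Return (namespace, pod, container) tuples for containers waiting with `reason`.

    `pod_list` is the parsed JSON of `kubectl get pods --all-namespaces -o json`.
    """
    stuck = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            # A waiting container carries state.waiting.reason in the pod status.
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == reason:
                stuck.append((meta.get("namespace"), meta.get("name"), cs.get("name")))
    return stuck


# Hand-written sample shaped like the pod from this report.
# The container name and message are assumptions, not captured output.
sample = {
    "items": [
        {
            "metadata": {"namespace": "wazuh", "name": "wazuh-7bb996795-2zjfb"},
            "status": {
                "containerStatuses": [
                    {
                        "name": "wazuh",
                        "state": {
                            "waiting": {
                                "reason": "CreateContainerError",
                                "message": "image not known",
                            }
                        },
                    }
                ]
            },
        }
    ]
}

for namespace, pod, container in stuck_containers(sample):
    print(f"{namespace}/{pod} container={container}")
```

Running the same check before and after the upgrade on a single node may help narrow down which images are affected.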
Steps to reproduce the issue:
- Upgrade CoreOS on hosts from v1967.6.0 to v2191.5.0 (to get glibc 2.29)
- Upgrade cri-o from v1.13.3 to v1.15.1-dev
- Upgrade Kubernetes control plane components from 1.13.10 to 1.15.3 (I’m testing now to see if this issue occurs without upgrading k8s components, will update after)
Describe the results you received:
Workloads are no longer running correctly. Some can’t start, and have the error above in their events. Deleting pods results in them being stuck indefinitely in a Terminating state.
Describe the results you expected: All workloads would continue running.
Additional information you deem important (e.g. issue happens only occasionally): I found one Red Hat thread that discussed a similar issue and indicated that upgrading cri-o from certain versions could cause problems. I'm not sure if it’s relevant though, and the thread did not really explain why their issue was occurring.
Output of crio --version:
Prior to upgrade:
crio version 1.13.3
commit: "5a3c24900797986fd3f1f39094aeea8c4a4354ef"
After upgrade:
crio version 1.15.1-dev
commit: "7eb0fb039a8e379bcda319826a34caec983e8519-dirty"
Additional environment details (AWS, VirtualBox, physical, etc.): CoreOS nodes running in AWS.
About this issue
- State: closed
- Created 5 years ago
- Comments: 16 (8 by maintainers)
Also, 1.14.10 would probably be best.
Can you first try upgrading to the latest release on the 1.13 branch before going to 1.15? There were known issues in container/image storage that were fixed after 1.13.9.
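The suggestion in these comments amounts to stepping through each minor branch rather than jumping straight from 1.13 to 1.15. A small sketch of that idea follows. It only lists the branches to pass through and does not know which patch release is the latest on each branch. The helper name `minor_branches` is hypothetical and not part of cri-o.

```python
def minor_branches(current, target):
    """List each major.minor branch from `current` to `target`, inclusive.

    Only the first two dot-separated components are used, so suffixes such as
    "-dev" on the patch component are ignored.
    """
    cur_major, cur_minor = (int(p) for p in current.split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles upgrades within one major version")
    if tgt_minor < cur_minor:
        raise ValueError("target is older than current")
    return [f"{cur_major}.{m}" for m in range(cur_minor, tgt_minor + 1)]


# Versions from this report: 1.13.3 before the upgrade, 1.15.1-dev after.
for branch in minor_branches("1.13.3", "1.15.1-dev"):
    print(branch)
```

For the versions in this report, that means moving to the latest 1.13 release first, then to a 1.14 release (1.14.10 was suggested), and only then to 1.15.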