kind: kind load docker-image fails, but nodes believe image exists
What happened:
I was following the quickstart and attempting to load an image I built locally into my kind cluster. When I run kind load docker-image my-image:tag, the command fails with:
kind load docker-image api:test-kind
Image: "api:test-kind" with ID "sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317" not yet present on node "kind-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/api:test-kind (sha256:dd7e3f38c29dacc07dff7d561e0f894ab8e7bbc3200b1f6c374ade47e19586b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found
However, subsequent runs of kind load docker-image my-image:tag result in the image being “present on all nodes”:
kind load docker-image api:test-kind
Image: "api:test-kind" with ID "sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317" found to be already present on all nodes.
Attempting to apply a manifest using this image results in a CreateContainerError:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api-56876d68b6-7mp6t 0/1 CreateContainerError 0 3s
$ kubectl describe pod api-56876d68b6-7mp6t
...
Warning Failed 12s (x2 over 13s) kubelet Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317: not found
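Not part of the original report, but one way to check what containerd on the node actually holds (assuming the default kind-control-plane node name) is to list images from inside the node:
$ docker exec -it kind-control-plane crictl images
$ docker exec -it kind-control-plane ctr --namespace=k8s.io images ls | grep api
Per the logs above, the image reference can show up as present even though some of its layer blobs are missing.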
What you expected to happen: Kind should load my local image successfully.
How to reproduce it (as minimally and precisely as possible):
$ cat repro.sh
#!/bin/bash
kind delete cluster
kind create cluster
kubectl config get-contexts
cat > Dockerfile <<EOF
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim
COPY ./app /app
EOF
mkdir -p app
cat > app/main.py <<EOF
from fastapi import FastAPI
app = FastAPI()
@app.get("/")
def read_root():
    return {"Hello": "World!"}
EOF
docker build -t fastapi-test:latest .
kind load docker-image fastapi-test:latest # fails
kind load docker-image fastapi-test:latest # succeeds
# cleanup
rm -r app
rm Dockerfile
docker image rm fastapi-test
$ ./repro.sh
...
# first call
Image: "fastapi-test:latest" with ID "sha256:08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120" not yet present on node "kind-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/fastapi-test:latest (sha256:a6b359e4e43b14667f91079f2bbc102c9f228f376f37de627415de287b1890b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found
# second call
Image: "fastapi-test:latest" with ID "sha256:08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120" found to be already present on all nodes.
...
Anything else we need to know?:
I was also trying to use skaffold with kind and it was producing the same error. Running skaffold dev for the first time would fail at:
$ skaffold dev
...
Starting deploy...
Loading images into kind cluster nodes...
- api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9 -> Failed
loading images into kind nodes: unable to load image "api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9" into cluster: running [kind load docker-image --name kind api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9]
and then the next run would succeed at getting past this step but would fail during container creation:
Loading images into kind cluster nodes...
- api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9 -> Found
Images loaded in 48.927667ms
- deployment.apps/api configured
- service/api configured
Waiting for deployments to stabilize...
- deployment/api: Failed: Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9: not found
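For context (not included in the original report), the skaffold config behind this is presumably a local build with push disabled, which is what makes skaffold load the image with kind load docker-image as seen in the log. A minimal skaffold.yaml sketch, with the schema version, image name, and manifest path all being assumptions:
apiVersion: skaffold/v2beta12   # schema version is an assumption; adjust to your skaffold release
kind: Config
build:
  local:
    push: false                 # keep the image local; skaffold then runs kind load docker-image
  artifacts:
    - image: api
deploy:
  kubectl:
    manifests:
      - k8s/*.yaml              # hypothetical manifest path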
Environment:
- kind version: (use kind version): kind v0.11.1 go1.16.4 darwin/arm64
- Kubernetes version: (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:06:30Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/arm64"}
- Docker version: (use docker info):
$ docker info
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
compose: Docker Compose (Docker Inc., v2.0.0-beta.6)
scan: Docker Scan (Docker Inc., v0.8.0)
Server:
Containers: 5
Running: 2
Paused: 0
Stopped: 3
Images: 22
Server Version: 20.10.7
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.25-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 7.765GiB
Name: docker-desktop
ID: B3O3:X434:CIJJ:RGWM:KTQL:DI5A:LYAN:THWY:IHDJ:MFLM:BTTA:OQGD
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
- OS (e.g. from /etc/os-release): macOS 11.4
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 5
- Comments: 25 (9 by maintainers)
As an alternative, you could also import the image manually using:
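The exact command from this comment was not preserved; a plausible sketch, assuming the default kind-control-plane node name and the api:test-kind tag from the report, is to save the image to a tar and import it inside the node with ctr:
$ docker save api:test-kind -o api.tar
$ docker cp api.tar kind-control-plane:/api.tar
$ docker exec --privileged kind-control-plane ctr --namespace=k8s.io images import /api.tar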
This worked fine on my M1 Mac machine.
Had the same issue using an M1 Mac. I’m guessing that building from an image that’s not built for arm corrupts something. After I built my image from the locally built base image, the load went through fine.
I have the same issue on my M1 Mac. I use Rosetta 2 to open my terminal; however, the kind node created by kind create cluster is still arm64. You can check the node architecture with docker exec -it ${kind-worker-node} uname -m. When I try to load an amd64 image into the arm64 node, exactly this error happens. I solved the problem by adding the --platform flag to build an arm64 image, i.e. docker build -f ./Dockerfile --platform arm64, and then I could finally load images into my kind cluster.
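A sketch of that workflow, assuming the default node name and the api:test-kind tag from the report (these are not the commenter’s exact commands):
$ docker exec -it kind-control-plane uname -m        # prints aarch64 on an Apple Silicon host
$ docker build --platform linux/arm64 -t api:test-kind .
$ kind load docker-image api:test-kind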
I can consistently reproduce this locally on x86_64 with kind 0.17.0 (also on 0.14.0) and the kindest/node:v1.22.15 base node image when loading multiple images in a single command (all of them have the same base image, but I’m not sure that’s relevant or required to reproduce), like the command sketched below. Could it be a race condition importing these images somewhere?
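The commenter’s exact command was not preserved; a reconstruction with hypothetical image names would look like:
$ kind load docker-image app-a:dev app-b:dev app-c:dev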
I found two workarounds:
Oh, I am absolutely silly… I was fetching a local version of kind to make sure it was the version I want for my pipeline, but then was just using the kind from $PATH, which is older…
This should be totally fixed in v0.17.
I can take a look, but I just fired up 0.17 yesterday and ran into the issue. I immediately knew what the problem was, but not whether there was some way to deal with it in kind, and wound up here.
As a work-around I am manually execing into the node container and running ctr myself.
This is due to containerd garbage collecting layers it doesn’t think it needs, since the platform is non-native to the system. By default ctr only retains layers (and config) for the native platform; ctr’s --all-platforms flag on import retains all layers. Either kind should just assume we want to keep all layers and handle that accordingly, or provide a --platform flag for loading non-native platforms.
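For illustration, the import that kind runs could be repeated manually with that flag (a sketch reusing the api:test-kind tag from the report; this command is not from the thread):
$ docker save api:test-kind | docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import --all-platforms -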
I had the same error when using an M1 Mac; I solved this problem by building my own image for the arm64 architecture.
kind believes the image exists because containerd reports it exists. I think that does seem like a containerd bug, but I don’t think there’s much we can reasonably do about it here; we’d need to fix that upstream.
I’d like to repro either one before filing bugs against those projects myself, but I haven’t been able to yet, and I’ll be on some other tasks for a bit (like the Kubernetes release today).
If we can identify repro steps to share with their developers, we can then upgrade containerd in kind when it is patched, though we continually keep up to date anyhow.
Sorry, I had not yet and missed that bit. I’ve tested it now and sadly, at least initially, it does not repro:
Well not kind per se, but containerd. Tentatively this smells like a bug in docker.
This may not repro for me due to the docker version or some other reason; currently I have 20.10.6.
FWIW, I don’t get the issue with this example:
I’ve tried several different versions of the tiangolo/uvicorn-gunicorn-fastapi images and they all lead to the same error. Is it possible there’s some issue with the base image’s layers?