kind: kind load docker-image fails, but nodes believe image exists

What happened: I was following the quickstart and attempting to load an image I built locally into my kind cluster. When I run kind load docker-image my-image:tag, the command fails with:

kind load docker-image api:test-kind
Image: "api:test-kind" with ID "sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317" not yet present on node "kind-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/api:test-kind (sha256:dd7e3f38c29dacc07dff7d561e0f894ab8e7bbc3200b1f6c374ade47e19586b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found

However, subsequent runs of kind load docker-image my-image:tag report the image as “present on all nodes”:

kind load docker-image api:test-kind
Image: "api:test-kind" with ID "sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317" found to be already present on all nodes.

Attempting to apply a manifest using this image results in CreateContainerError:

$ kubectl get pods
NAME                   READY   STATUS                 RESTARTS   AGE
api-56876d68b6-7mp6t   0/1     CreateContainerError   0          3s

$ kubectl describe pod api-56876d68b6-7mp6t
...
Warning  Failed     12s (x2 over 13s)  kubelet            Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:721afbc9e128b90662115740df0afd5cda80d33ee743d6cdf1abe1106b098317: not found
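
For reference, what containerd on the node thinks it has can be cross-checked from the host (a sketch; kind-control-plane is the default single-node cluster's node, and ctr is the same tool kind's loader invokes in the error above):

docker exec kind-control-plane ctr --namespace=k8s.io images ls | grep api
docker exec kind-control-plane ctr --namespace=k8s.io content ls | grep 721afbc9 || echo "blob not on node"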

What you expected to happen: Kind should load my local image successfully.

How to reproduce it (as minimally and precisely as possible):

$ cat repro.sh
#!/bin/bash

kind delete cluster
kind create cluster
kubectl config get-contexts

cat > Dockerfile <<EOF
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim

COPY ./app /app
EOF

mkdir -p app
cat > app/main.py <<EOF
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
def read_root():
    return {"Hello": "World!"}
EOF

docker build -t fastapi-test:latest .

kind load docker-image fastapi-test:latest  # fails
kind load docker-image fastapi-test:latest  # succeeds

# cleanup
rm -r app
rm Dockerfile
docker image rm fastapi-test

$ ./repro.sh
...
# first call
Image: "fastapi-test:latest" with ID "sha256:08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120" not yet present on node "kind-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import -" failed with error: exit status 1
Command Output: unpacking docker.io/library/fastapi-test:latest (sha256:a6b359e4e43b14667f91079f2bbc102c9f228f376f37de627415de287b1890b5)...ctr: content digest sha256:87c8a1d8f54f3aa4e05569e8919397b65056aa71cdf48b7f061432c98475eee9: not found

# second call
Image: "fastapi-test:latest" with ID "sha256:08454bdfe6f40f8c1cfd5e1234de319aa0e1b4e6b1b4ac13f183320fc27b7120" found to be already present on all nodes.
...

Anything else we need to know?: I was also trying to use skaffold with Kind and it was producing the same error. Running skaffold dev the first time would fail at

$ skaffold dev
...
Starting deploy...
Loading images into kind cluster nodes...
 - api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9 -> Failed
loading images into kind nodes: unable to load image "api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9" into cluster: running [kind load docker-image --name kind api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9]

and then the next run would succeed at getting past this step but would fail during container creation:

Loading images into kind cluster nodes...
 - api:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9 -> Found
Images loaded in 48.927667ms
 - deployment.apps/api configured
 - service/api configured
Waiting for deployments to stabilize...
 - deployment/api: Failed: Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:58f6832b421fbc0a592454a1a8fce96dbfde35c850c27b05e191f904af93d6b9: not found

Environment:

  • kind version: (use kind version): kind v0.11.1 go1.16.4 darwin/arm64
  • Kubernetes version: (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:06:30Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/arm64"}
  • Docker version: (use docker info):
$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
  compose: Docker Compose (Docker Inc., v2.0.0-beta.6)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 5
  Running: 2
  Paused: 0
  Stopped: 3
 Images: 22
 Server Version: 20.10.7
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d71fcd7d8303cbf684402823e425e9dd2e99285d
 runc version: b9ee9c6314599f1b4a7f497e1f1f856fe433d3b7
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.25-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: aarch64
 CPUs: 4
 Total Memory: 7.765GiB
 Name: docker-desktop
 ID: B3O3:X434:CIJJ:RGWM:KTQL:DI5A:LYAN:THWY:IHDJ:MFLM:BTTA:OQGD
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
  • OS (e.g. from /etc/os-release): macOS 11.4

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 5
  • Comments: 25 (9 by maintainers)

Most upvoted comments

As an alternative, you could also import the image manually using:

docker save <image> | docker exec --privileged -i kind-control-plane ctr --namespace=k8s.io images import --all-platforms -

This worked fine on my M1 Mac.
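
(If the cluster has more than one node or a non-default name, the same import needs to be repeated for each node; kind get nodes lists them.)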

Had the same issue on an M1 Mac. I'm guessing that building from a base image that isn't built for arm corrupts something. After I rebuilt my image from a locally built base image, the load went through fine.

I have the same issue on my M1 Mac. I use Rosetta 2 to open my terminal, but the kind node created by kind create cluster is still arm64. You can check the node architecture with docker exec -it ${kind-worker-node} uname -m. When I try to load an amd64 image into the arm64 node, exactly this error happens. I solved the problem by adding the --platform flag so the image is built for arm64 (docker build -f ./Dockerfile --platform arm64 .), and then I could finally load images into my kind cluster; a rough sketch of the check-and-rebuild steps is below.
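
Roughly, the check-and-rebuild sequence is (a sketch; the node name and image tag are placeholders, and linux/arm64 is just the normalized form of the flag above):

# Check the node's architecture (list node names with kind get nodes).
docker exec -it kind-control-plane uname -m                        # prints aarch64 on an M1 host

# Check the architecture of the locally built image.
docker image inspect my-image:tag --format '{{.Architecture}}'     # e.g. amd64 vs arm64

# If they differ, rebuild for the node's platform and load again.
docker build --platform linux/arm64 -t my-image:tag .
kind load docker-image my-image:tag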

I can consistently reproduce this locally on x86_64 with kind 0.17.0 (also on 0.14.0) and the kindest/node:v1.22.15 node image when passing multiple images to the command (all of them share the same base image, but I'm not sure that's relevant/required to reproduce), like:

> kind load docker-image --name test imagea:latest imageb:latest imagec:latest imaged:latest
Image: "" with ID "sha256:a123..." not yet present on node "test-control-plane", loading...
Image: "" with ID "sha256:b123..." not yet present on node "test-control-plane", loading...
Image: "" with ID "sha256:c123..." not yet present on node "test-control-plane", loading...
Image: "" with ID "sha256:d123..." not yet present on node "test-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i test-control-plane ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -" failed with error: exit status 1
Command Output: ctr: image "imagea:latest": already exists

-> could it be a race condition importing these images somewhere?


I found two workarounds:

  • using the underlying commands as proposed above:
    docker save imagea:latest imageb:latest imagec:latest imaged:latest | docker exec --privileged -i test-control-plane ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -
    
  • importing images one after another (sketched below), which would (needlessly) copy the base image over and over again instead of having it all in a single tar (but I'm not sure whether kind even does that optimisation - is it?)
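
The one-at-a-time variant, as a minimal sketch (image and cluster names are the placeholders from above):

for img in imagea:latest imageb:latest imagec:latest imaged:latest; do
  kind load docker-image --name test "$img"
done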

Oh, I am absolutely silly… I was fetching a local copy of kind to make sure it was the version I wanted for my pipeline, but then I was just using the kind from $PATH, which is older…

This should be totally fixed in v0.17.

I can take a look, but I just fired up 0.17 yesterday and ran into the issue. I knew immediately what the problem was, but not whether there was some way to deal with it in kind, and wound up here.

As a workaround, I am manually exec'ing into the node container and running ctr myself.

This is due to containerd garbage-collecting layers it doesn't think it needs, since the platform is non-native to the system. By default, ctr only retains layers (and config) for the native platform; ctr's --all-platforms flag on import retains all layers.

Either kind should just assume we want to keep all layers and handle that accordingly, or it should provide a --platform flag for loading non-native platforms.
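
Concretely, the manual workaround I'm running is roughly this (a sketch; kind-control-plane and my-image:tag are placeholders):

# Import while retaining every platform's layers, so non-native ones aren't dropped.
docker save my-image:tag | docker exec --privileged -i kind-control-plane \
  ctr --namespace=k8s.io images import --all-platforms -

# Verify containerd on the node now lists the image.
docker exec kind-control-plane ctr --namespace=k8s.io images ls | grep my-image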

I have the same error too when using an M1 Mac; I solved the problem by building my own arm64-architecture image.

kind believes the image exists because containerd reports that it exists. I think that does seem like a containerd bug, but I don't think there's much we can reasonably do about it here; we'd need to fix that upstream.

I’d like to repro either of these before filing bugs against those projects myself, but I haven't been able to yet, and I'll be on some other tasks for a bit (like the Kubernetes release today).

If we can identify those bugs and reproduce them for their developers, we can then upgrade containerd in kind when it is patched, though we continually keep it up to date anyhow.

Thanks for the reply. Did you try the repro.sh script I put under the “How to reproduce it” section? It should create an app/ directory and an image that reproduce the issue (at least on my machine).

Sorry, I had not yet and missed that bit. I've tested it now and, sadly, at least initially it does not repro:

+ kind load docker-image fastapi-test:latest
Image: "fastapi-test:latest" with ID "sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec" not yet present on node "kind-control-plane", loading...
+ kind load docker-image fastapi-test:latest
Image: "fastapi-test:latest" with ID "sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec" found to be already present on all nodes.
+ rm -r app
+ rm Dockerfile
+ docker image rm fastapi-test
Untagged: fastapi-test:latest
Deleted: sha256:6ecf0858e5af96b455496e303381cc2198dc58a636236a05a99953920294a3ec

So I guess the resulting image is missing some layer that Kind expects?

Well not kind per se, but containerd. Tentatively this smells like a bug in docker.

This may not repro for me due to the docker version or some other reason; I currently have 20.10.6.

FWIW I don’t get the issue with this example:

$ cat Dockerfile
FROM debian:buster-slim

CMD ["sleep", "9999"]

$ docker build -t sleepy:latest .

$ kind load docker-image sleepy:latest
Image: "sleepy:latest" with ID "sha256:5cdc87a60cc668c0d9a48f70539ae21fc96259575a58ff76dff411e34931bdf8" not yet present on node "kind-control-plane", loading...

I’ve tried several different versions of the tiangolo/uvicorn-gunicorn-fastapi images and they all lead to the same error. Is it possible there’s some issue with the base image’s layers?
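
In case it helps narrow this down, here is roughly how the docker side can be inspected (a diagnostic sketch using the tag from the repro above; the missing digest is the one ctr printed):

# Dump the archive docker save produces (the same stream kind pipes into ctr) and list its contents.
docker save fastapi-test:latest -o /tmp/fastapi-test.tar
tar -tf /tmp/fastapi-test.tar

# Show the manifest so the included layer files can be compared against the digest ctr reports as "not found".
tar -xOf /tmp/fastapi-test.tar manifest.json | python3 -m json.tool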