kind: building node images requires public internet access and doesn't error if pre-pulling images fails

What happened:

The control plane did not start normally.

...
I1206 09:26:24.270020     140 round_trippers.go:553] GET https://v123-control-plane:6443/healthz?timeout=10s  in 0 milliseconds
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:118
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:153
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
        _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
        /usr/local/go/src/runtime/proc.go:255
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1581 
error execution phase wait-control-plane
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdInit.func1
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/cmd/init.go:153
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:856
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:974
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/app/kubeadm.go:50
main.main
        _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubeadm/kubeadm.go:25
runtime.main
        /usr/local/go/src/runtime/proc.go:255
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1581 

What you expected to happen:

Either kind build node-image should fail loudly when pre-pulling the required images fails, or the resulting node image should bring up a working control plane. Instead, the build silently ignores the pull failure and produces an image whose control plane cannot start.
How to reproduce it (as minimally and precisely as possible):

I have pushed the image to ghcr.io/tao12345666333/kind/node:v1.23.0-rc.1

You can just run kind create cluster --image='ghcr.io/tao12345666333/kind/node:v1.23.0-rc.1'

Anything else we need to know?:

Environment:

  • kind version: (use kind version): kind v0.12.0-alpha+92e01d72276af8 go1.17.3 linux/amd64
  • Kubernetes version: (use kubectl version): v1.23.0-rc.1
  • Docker version: (use docker info): 0.0.0-20211015105956-46f8c8b
  • OS (e.g. from /etc/os-release): Fedora 34

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 21 (21 by maintainers)

Most upvoted comments

I believe this regression has existed in prior releases, but I’d like to see it fixed in v0.13. Fixing just the proxy part should not be a difficult patch, and I think that’s the important fix.

I forgot to add:

We don’t want it to unpack, and we know it will error if it tries to (which it will).

Previously I patched ctr images import to support skipping unpacking, so that we could stop ignoring the error when side-loading images. Since adding multi-arch support, however, we depend on letting containerd do the pulling, rather than side-loading, for the images that are not built as part of building Kubernetes (pause, etcd …). We used to pull those to the host with docker pull and then load the contents with docker save, which neatly solved a lot of issues, including proxy issues. But you cannot guarantee that a docker save tarball is for any particular architecture (there’s no flag for this; docker isn’t designed for it), so instead we pull directly from the containerd running for the target architecture, as sketched below.
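
For illustration, here is a rough Go sketch of that old side-loading flow, shelling out to the CLIs involved. The helper name, tarball path, and exact flags are assumptions for this sketch, not kind’s actual code:

```go
package build

import (
	"fmt"
	"os/exec"
)

// sideLoadImage sketches the old flow: pull to the host's docker daemon,
// save a tarball, and import it into the node's containerd.
// Illustrative only; not kind's real implementation.
func sideLoadImage(ref, tarPath string) error {
	// docker pull resolves the host's architecture, which is exactly
	// the multi-arch problem described above.
	if err := exec.Command("docker", "pull", ref).Run(); err != nil {
		return fmt.Errorf("pull failed: %w", err)
	}
	// docker save has no way to pin a particular architecture.
	if err := exec.Command("docker", "save", "-o", tarPath, ref).Run(); err != nil {
		return fmt.Errorf("save failed: %w", err)
	}
	// Import into containerd's k8s.io namespace; skipping the unpack step
	// is the behavior referenced above.
	return exec.Command("ctr", "--namespace=k8s.io",
		"images", "import", "--no-unpack", tarPath).Run()
}
```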

We only need it to store the image layers and metadata and to defer unpacking to runtime. But the pull command doesn’t have similar support for skipping unpacking, and I haven’t had time to look into getting a feature like this upstream.
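
For what it’s worth, at the containerd Go client level this separation already exists; a minimal sketch, assuming the default socket path and the k8s.io namespace (the CLI pull path has no equivalent option):

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	ctx := namespaces.WithNamespace(context.Background(), "k8s.io")
	// Pull fetches and stores layers and metadata only; unpacking into a
	// snapshotter happens only when containerd.WithPullUnpack is passed.
	img, err := client.Pull(ctx, "k8s.gcr.io/pause:3.6")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("stored %s without unpacking", img.Name())
}
```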

We could alternatively try to parse the error … 😕, or we could pull via some other mechanism and go back to side-loading.

We also need to do #2493 and ensure that during the build we plumb proxy info through to the node-image build container. That should be a much simpler fix, and probably sufficient for most users impacted by this.
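
A hedged sketch of what that plumbing could look like, forwarding the standard proxy variables from the host into the build container via docker run; the variable list and invocation are illustrative assumptions, not the actual patch:

```go
package build

import (
	"os"
	"os/exec"
)

// runBuildContainer sketches forwarding host proxy settings into the
// node-image build container. Illustrative only; not kind's real code.
func runBuildContainer(image string, command ...string) error {
	args := []string{"run", "--rm"}
	// Forward both upper- and lower-case proxy variables, if set.
	for _, k := range []string{
		"HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY",
		"http_proxy", "https_proxy", "no_proxy",
	} {
		if v, ok := os.LookupEnv(k); ok {
			args = append(args, "-e", k+"="+v)
		}
	}
	args = append(args, image)
	args = append(args, command...)
	cmd := exec.Command("docker", args...)
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	return cmd.Run()
}
```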

thank you for the explanation. 👍

#2493 covers the proxy issue and requires a simple fix.

For handling the error: we should not need a flag. If you added the flag right now it would always error, because unpacking currently always fails. And if we fixed the error detection (or, better yet, prevented the error with the no-unpack option mentioned above and in the code comments), we still wouldn’t need the flag, because we could then always error “strictly”.
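
In other words, once the unpack failure is out of the way, the pre-pull loop can simply propagate failures; a sketch under that assumption, with pullImage as a hypothetical helper:

```go
package build

import (
	"context"
	"fmt"
)

// pullImage is a hypothetical stand-in for whatever mechanism does the
// pull (containerd client, crictl, ...), assumed to no longer trip over
// unpacking.
func pullImage(ctx context.Context, ref string) error { return nil }

// prePullImages fails the node-image build loudly on any pull error
// instead of swallowing it, i.e. always-"strict" behavior with no flag.
func prePullImages(ctx context.Context, requiredImages []string) error {
	for _, ref := range requiredImages {
		if err := pullImage(ctx, ref); err != nil {
			return fmt.Errorf("failed to pre-pull %q: %w", ref, err)
		}
	}
	return nil
}
```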