kind: Cannot create cluster due to `docker exec cat /kind/version` failing

What happened:

Kind failed to create a cluster with:

ERROR: failed to create cluster: failed to generate kubeadm config content: failed to get kubernetes version from node: failed to get file: command "docker exec --privileged flux-e2e-2-control-plane cat /kind/version" failed with error: exit status 1

Full logs at https://circleci.com/gh/fluxcd/flux/9822

What you expected to happen:

I expect the cluster to be created without errors

How to reproduce it (as minimally and precisely as possible):

Unfortunately, it’s not systematic. It only happens from time to time.

It happens when running make e2e in https://github.com/fluxcd/flux/

Anything else we need to know?:

Although things have improved quite a bit since Kind 0.7.0, we are still getting occasional cluster creation errors. More info at https://github.com/fluxcd/flux/issues/2672

This may have to do with the fact that we create a few (4) Kind clusters in parallel. See https://github.com/fluxcd/flux/blob/cb319f2f7878f33369cf54139df26867aa6632ca/test/e2e/run.bash#L55

Environment:

  • kind version: (use kind version): 0.7.0
  • Kubernetes version: (use kubectl version): 1.14.10
  • Docker version: (use docker info): 18.09.3
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.5 LTS

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 45 (24 by maintainers)

Most upvoted comments

https://kind.sigs.k8s.io/docs/user/known-issues/#docker-installed-with-snap

Snap is covered in the known-issues document: the snap docker package has a number of issues, e.g. no access to temp directories. I don’t recommend snap for docker, and we don’t really support it.

I’m going to close this out for now, given that 0.8.0 / 0.8.1 is out with various fixes, including more details in the errors. However, for this particular error you need to try again with --retain (so it doesn’t clean up after itself) and:

  • check the docker container logs
  • check docker inspect
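
The steps above can be sketched as a small shell helper. This is only an illustration: `run` and `debug_node` are hypothetical names, and the node container name is assumed to be `<cluster>-control-plane`, as in the error message at the top of this issue. With `DRY_RUN=1` the commands are printed instead of executed.

```shell
#!/usr/bin/env bash
# Sketch: collect diagnostics for a kind node container that exited early.
# Assumption: the control-plane container is named "<cluster>-control-plane",
# matching the error message in this issue (flux-e2e-2-control-plane).

# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# debug_node: hypothetical helper, not part of kind.
debug_node() {
  local cluster="$1"
  local node="${cluster}-control-plane"

  # Recreate the cluster, keeping the node containers around if creation fails.
  run kind create cluster --name "$cluster" --retain || true

  # Whatever the node container printed before it died.
  run docker logs "$node"

  # Exit code and OOM flag hint at why the container stopped.
  run docker inspect --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}}' "$node"
}
```

For example, `DRY_RUN=1 debug_node flux-e2e-2` lists the three commands without touching docker; dropping `DRY_RUN` runs them for real.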

This almost always means the node container died, which is generally due to some problem with the host environment, such as:

  • not enough resources to run everything
  • wrong architecture
  • in older kind versions, you might hit other issues we’ve documented in known issues, such as your host filesystem being incompatible with docker in docker (zfs, btrfs; these are fixed) or using docker installed via snap (still not recommended)
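
A quick way to check a host for these causes might look like the sketch below. The function names are hypothetical, and the snap check assumes that snap-installed binaries are exposed under `/snap/bin`. The `docker info` template fields report docker's view of CPUs, memory, architecture and storage driver.

```shell
#!/usr/bin/env bash
# Sketch: quick host checks for the causes listed above.
# The function names are hypothetical helpers, not part of kind or docker.

# is_snap_path: succeeds if the given binary path lives under /snap/.
# Assumption: snap-installed binaries are exposed under /snap/bin.
is_snap_path() {
  case "$1" in
    /snap/*) return 0 ;;
    *) return 1 ;;
  esac
}

# host_report: warn about a snap-installed docker, then print resources,
# architecture and storage driver as reported by the docker daemon.
host_report() {
  local docker_bin
  docker_bin="$(command -v docker)" || { echo "docker not found in PATH"; return 1; }
  if is_snap_path "$docker_bin"; then
    echo "warning: docker appears to be installed via snap (${docker_bin})"
  fi
  docker info --format 'cpus={{.NCPU}} mem_bytes={{.MemTotal}} arch={{.Architecture}} storage={{.Driver}}'
}
```

Running `host_report` on the CI machine before creating the clusters gives a baseline to attach to a bug report.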

Please file individual bugs when you encounter this in your environment and include as much detail as possible. This error just means the node container exited early, so we are unable to do anything with it, and it is probably a bug in your environment.