kind: multi-node: Kubernetes cluster does not start after Docker re-assigns node's IP addresses after (Docker) restart
A kind Kubernetes cluster does not survive a Docker restart. It seems that Docker assigns new IPs to the containers on each start-up. The kind nodes, however, keep the original IP addresses in their generated configuration files, which leaves the Kubernetes services unable to talk to each other. The most affected ones are the scheduler and the controller.
What happened:
Kubernetes starts in a broken state even though kubectl get pods -A reports otherwise (everything 1/1). The cluster is unable to start previously deployed pods (if any were deployed before the restart) and cannot deploy anything new because the scheduler is not connected to the apiserver.
What you expected to happen:
Kubernetes cluster continues working as expected even after Docker restart.
How to reproduce it (as minimally and precisely as possible):
- Install KIND cluster by issuing:
cat <<EOF | kind create cluster --name kind --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF
- Restart Docker
- Deploy anything, e.g.:
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
- Check that dnsutils is in the Pending state
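To confirm that the node IPs really change across the Docker restart, the container addresses can be snapshotted before and after. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes kind labels its node containers with io.x-k8s.kind.cluster=<cluster name>.

```shell
# Print "<container-name> <ip>" for every node container of a kind cluster.
# Assumption: kind labels node containers with io.x-k8s.kind.cluster=<name>.
kind_node_ips() {
  cluster=${1:-kind}
  for id in $(docker ps -q --filter "label=io.x-k8s.kind.cluster=${cluster}"); do
    docker inspect -f '{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$id"
  done | sed 's|^/||' | sort
}

# Compare two snapshots (files of "<name> <ip>" lines) and print the nodes
# whose IP differs, as "<name> <old-ip> -> <new-ip>".
changed_node_ips() {
  awk 'NR == FNR { before[$1] = $2; next }
       ($1 in before) && before[$1] != $2 { print $1, before[$1], "->", $2 }' "$1" "$2"
}
```

A possible workflow: run kind_node_ips kind > before.txt, restart Docker, run kind_node_ips kind > after.txt, then changed_node_ips before.txt after.txt. Any line printed is a node whose address no longer matches what was written into its configuration at creation time.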
Anything else we need to know?:
- Log files: kind-cluster-logs.tar.gz
- I tried to change the IP addresses in the files under /kind and /etc/kubernetes, but then the services start complaining that the certificate was not issued for the IP address. Changing IP addresses each time the cluster starts is therefore not a solution.
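The certificate complaint can be checked directly by listing the IP addresses a certificate was issued for. This is a rough sketch under assumptions: the helper names are hypothetical, openssl is assumed to be installed on the host, and the certificate is assumed to carry its IPs as subjectAltName entries (which is how kubeadm normally issues the apiserver certificate).

```shell
# Print the IP subjectAltName entries of a PEM certificate, one per line.
cert_ip_sans() {
  openssl x509 -in "$1" -noout -text |
    grep -o 'IP Address:[0-9A-Fa-f.:]*' |
    sed 's/^IP Address://'
}

# Succeed (exit 0) only if the certificate lists exactly the given IP.
cert_covers_ip() {
  cert_ip_sans "$1" | grep -Fxq "$2"
}

# Example usage against a running cluster named "kind" (not run here):
#   docker exec kind-control-plane cat /etc/kubernetes/pki/apiserver.crt > apiserver.crt
#   cert_ip_sans apiserver.crt
#   cert_covers_ip apiserver.crt 172.18.0.5 || echo "certificate not valid for this IP"
```

If the node's current IP is missing from that list, rewriting the configuration files alone cannot help, because peers will still reject the certificate.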
Environment:
- kind version: (use kind version):
kind v0.10.0 go1.15.7 darwin/amd64
- Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-21T01:11:42Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
- Docker version: (use docker info):
Client:
Version: 20.10.0
API version: 1.41
Go version: go1.15.6
Git commit: 03fa4b8
Built: Sat Dec 12 20:00:39 2020
OS/Arch: darwin/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.2
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: 8891c58
Built: Mon Dec 28 16:15:28 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.19.0
GitCommit: de40ad0
- OS (e.g. from /etc/os-release): macOS Catalina 10.15.7 Intel
About this issue
- State: closed
- Created 3 years ago
- Reactions: 2
- Comments: 34 (20 by maintainers)
Of course, just imagine:
I start a course that lasts 2 days (when it could well be 5).
I think kind is almost perfect for teaching (and learning), but this issue continues to be a bit of a headache.
Best regards
I hope I understand it better now. Let me summarize and please correct me if I’m wrong:
/kind/ and /etc/kubernetes/ directories.
What I’ve found is that the second step does something silly. Originally I made myself a shell script which re-configures the IP addresses to match the current state, but that fails as well, because all the security certificates generated in step 1 are based on IP addresses in their CNs. This leads to a situation where the services are finally able to contact each other but reject the certificates, and we are back to square one.
I see only two ways to solve this issue permanently:
I hope I got the idea.
@boldandbusted Thanks for sharing your repo, but the idea is to avoid installing any VMs or spending time explaining other tools. This is because of the target audience: many students are developers, architects, and sometimes decision makers, so I want to focus on Kubernetes and its benefits and not get noise from other tools or setups. For now I have a config with multiple control-plane nodes to use toward the end of the course, but that loses the charm of working hands-on from the beginning and discovering it for yourself.
Hi,
thanks for disabling the bot!
The use case is simple: you work on a project and need a stable multi-node env. You need to rebuild the cluster each day or after each reboot.
The goal of kind is to simulate an env, but if one needs to rebuild it each time one reboots, it is clearly a deal breaker for kind.
I stopped using kind and built a real cluster until this issue is solved.
thanks. regards.
(Thanks @tnqn !)
This should be fixed for most multi-node clusters in the latest sources at HEAD, and in the forthcoming v0.15.0 (TBD, we’ll want to wrap up some other things and make sure this is working widely before cutting a release).
#1689 remains for tracking clusters with multiple control-plane nodes (“HA”) which we haven’t dug into yet.