kubeadm: 1.15 - kubeadm join --control-plane configures kubelet to connect to wrong apiserver

Is this a BUG REPORT or FEATURE REQUEST?

/kind bug
/area HA

Versions

kubeadm version: v1.15.6

Environment: Dev

  • Kubernetes version: v1.15.6
  • Cloud provider or hardware configuration: Virtualbox
  • OS: CentOS 7.7
  • Kernel: 3.10.0-957.1.3.el7.x86_64
  • Others:

What happened?

The kubelet.conf created by kubeadm on additional control plane nodes is configured to connect to the apiserver on the initial master instead of the one on localhost or the load balancer. As a consequence, all kubelets become NotReady if the first master is unavailable.

Nodes used in the examples:

  • demomaster1test - 192.168.33.10 - initial master
  • demomaster2test - 192.168.33.20 - additional master
  • demomaster3test - 192.168.33.30 - additional master
  • demolb1test - 192.168.33.100 - load balancer

This example joins against the load balancer:

[root@demomaster2test ~]# kubeadm join --v 5 --discovery-token ... --discovery-token-ca-cert-hash sha256:... --certificate-key ... --control-plane --apiserver-bind-port 443 192.168.33.100:443
...
I1203 08:59:39.136338    7312 join.go:433] [preflight] Discovering cluster-info
I1203 08:59:39.136397    7312 token.go:199] [discovery] Trying to connect to API Server "192.168.33.100:443"
I1203 08:59:39.136875    7312 token.go:74] [discovery] Created cluster-info discovery client, requesting info from "https://192.168.33.100:443"
I1203 08:59:39.147704    7312 token.go:140] [discovery] Requesting info from "https://192.168.33.100:443" again to validate TLS against the pinned public key
I1203 08:59:39.156275    7312 token.go:163] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.33.100:443"
I1203 08:59:39.156294    7312 token.go:205] [discovery] Successfully established connection with API Server "192.168.33.100:443"
...
This node has joined the cluster and a new control plane instance was created

Checking the results:

[root@demomaster2test kubernetes]# grep ":443" *.conf
admin.conf:    server: https://192.168.33.100:443
bootstrap-kubelet.conf:    server: https://192.168.33.10:443
controller-manager.conf:    server: https://192.168.33.100:443
kubelet.conf:    server: https://192.168.33.10:443
scheduler.conf:    server: https://192.168.33.100:443

And this example joins directly against the initial master:

[root@demomaster3test ~]# kubeadm join --v 5 --discovery-token ... --discovery-token-ca-cert-hash sha256:... --certificate-key ... --control-plane --apiserver-bind-port 443 demomaster1test:443
...
I1203 10:43:05.585046    7232 join.go:433] [preflight] Discovering cluster-info
I1203 10:43:05.585107    7232 token.go:199] [discovery] Trying to connect to API Server "demomaster1test:443"
I1203 10:43:05.585473    7232 token.go:74] [discovery] Created cluster-info discovery client, requesting info from "https://demomaster1test:443"
I1203 10:43:05.595627    7232 token.go:140] [discovery] Requesting info from "https://demomaster1test:443" again to validate TLS against the pinned public key
I1203 10:43:05.604432    7232 token.go:163] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "demomaster1test:443"
I1203 10:43:05.604453    7232 token.go:205] [discovery] Successfully established connection with API Server "demomaster1test:443"
...
This node has joined the cluster and a new control plane instance was created

Checking the results:

[root@demomaster3test kubernetes]# grep ":443" *.conf
admin.conf:    server: https://192.168.33.100:443
bootstrap-kubelet.conf:    server: https://192.168.33.10:443
controller-manager.conf:    server: https://192.168.33.100:443
kubelet.conf:    server: https://192.168.33.10:443
scheduler.conf:    server: https://192.168.33.100:443

So in both cases kubelet.conf points at the initial master, while admin.conf, controller-manager.conf and scheduler.conf all point at the load balancer.

What you expected to happen?

kubelet.conf should have been configured to use the load balancer or local apiserver:

[root@demomaster3test kubernetes]# grep ":443" *.conf
kubelet.conf:    server: https://192.168.33.100:443 <-- like this
kubelet.conf:    server: https://192.168.33.30:443 <-- or maybe like this

I’m not sure what best practice is here. Would it make sense for the kubelet on a master to stay Ready when the apiserver on localhost is unavailable (if it is configured to use the load balancer)?

How to reproduce it (as minimally and precisely as possible)?

I have not tested the guide at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ step by step, since we need to convert existing single-master clusters to multi-master. So joining might work correctly on completely new clusters set up according to those instructions. These are the steps we use to add more masters to an existing cluster:

  1. Find existing 1.15 cluster (possibly upgraded from older versions - ours are initially a lot older)
  2. Update kubeadm-config.yaml to include certSANs and controlPlaneEndpoint for the load balancer:
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  certSANs:
    - "192.168.33.100"
controlPlaneEndpoint: "192.168.33.100:443"
  3. kubeadm config upload from-file --config /etc/kubernetes/kubeadm-config.yaml
  4. rm -rf /etc/kubernetes/pki/apiserver.*
  5. kubeadm init phase certs apiserver --config=/etc/kubernetes/kubeadm-config.yaml
  6. Restart the apiserver
  7. Join 2 new --control-plane nodes as in the examples further up
  8. Shut down the initial master. Watch as the kubelets on the 2 new masters become NotReady.
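After joining, a quick way to spot the mismatch is to compare each conf file’s server line with the intended controlPlaneEndpoint. A minimal sketch (it runs against sample files created on the fly so it is self-contained; on a real node you would set conf_dir=/etc/kubernetes and your own endpoint instead):

```shell
#!/bin/sh
# Hypothetical sample conf files mimicking the grep output above;
# on a real node, set conf_dir=/etc/kubernetes instead.
conf_dir=$(mktemp -d)
endpoint="https://192.168.33.100:443"   # the load balancer (controlPlaneEndpoint)

printf '    server: %s\n' "$endpoint"                 > "$conf_dir/admin.conf"
printf '    server: %s\n' "https://192.168.33.10:443" > "$conf_dir/kubelet.conf"
printf '    server: %s\n' "$endpoint"                 > "$conf_dir/scheduler.conf"

# Flag every conf file whose apiserver address is not the controlPlaneEndpoint
mismatches=""
for f in "$conf_dir"/*.conf; do
  server=$(awk '/server:/ {print $2}' "$f")
  [ "$server" = "$endpoint" ] || mismatches="$mismatches$(basename "$f") "
done
echo "Conf files not using controlPlaneEndpoint: $mismatches"
```

On the nodes from this report, only kubelet.conf (and bootstrap-kubelet.conf, which is only used during join) should show up as a mismatch.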

Anything else we need to know?

It’s an easy manual fix: just edit the IP address in kubelet.conf (on the workers as well). But since kubeadm already configures the other .conf files correctly on the new masters, it seems reasonable to expect kubelet.conf to be configured correctly too. Or maybe there is a parameter I’m missing somewhere.
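The manual fix can be scripted with sed. A sketch against a sample kubelet.conf fragment (the addresses are the ones from this report; substitute your own, and on a real node edit /etc/kubernetes/kubelet.conf and restart the kubelet afterwards):

```shell
#!/bin/sh
# Sample kubelet.conf fragment pointing at the initial master;
# on a real node this would be /etc/kubernetes/kubelet.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
clusters:
- cluster:
    server: https://192.168.33.10:443
  name: kubernetes
EOF

# Rewrite the apiserver address to the load balancer (controlPlaneEndpoint)
sed -i 's|server: https://192.168.33.10:443|server: https://192.168.33.100:443|' "$conf"

grep 'server:' "$conf"
# On a real node, follow up with: systemctl restart kubelet
```

The same substitution works for the worker nodes, which are also pinned to the initial master.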

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 21 (10 by maintainers)

Most upvoted comments

studied the code for a bit, I think it’s because:

that being said, the current workaround is to:

  • kubectl -n kube-public edit cm cluster-info and manually change the server address to the current controlPlaneEndpoint, or simply run kubeadm init phase bootstrap-token again
  • on the worker nodes, run kubeadm join phase kubelet-start

after controlPlaneEndpoint has been changed.


Oh wow, thanks for this report @blurpy. I’m going to check if I can reproduce this issue. Thanks for the heads up @neolit123.

/assign