kubeadm: 1.15 - kubeadm join --control-plane fails on clusters created with <= 1.12

Versions

kubeadm version (use kubeadm version): v1.15.6

Environment: Dev

  • Kubernetes version (use kubectl version): v1.15.6
  • Cloud provider or hardware configuration: Virtualbox
  • OS (e.g. from /etc/os-release): CentOS 7.7
  • Kernel (e.g. uname -a): 3.10.0-957.1.3.el7.x86_64

What happened?

I have several clusters created with kubeadm v1.10 to v1.12 that have been upgraded along the way and are currently on 1.14 and 1.15. I’m experimenting with adding more masters to set up HA. Adding masters to clusters created with kubeadm 1.15 works fine, but on older clusters upgraded to 1.15 the join fails while waiting for the new etcd member to join.

This is a continuation of #1269, which doesn’t seem to have been fully resolved. The original issue was that etcd was not listening on a host port, so the new node could not connect; that part was fixed. However, the etcd member list is left untouched, so it still looks like this:

/ # export ETCDCTL_API=3
/ # etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list -w table
+-----------------+---------+-----------------+------------------------+----------------------------+
|       ID        | STATUS  |      NAME       |       PEER ADDRS       |        CLIENT ADDRS        |
+-----------------+---------+-----------------+------------------------+----------------------------+
| a874c87fd42044f | started | demomaster1test | https://127.0.0.1:2380 | https://192.168.33.10:2379 |
+-----------------+---------+-----------------+------------------------+----------------------------+
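
(For context: the etcdctl commands in this issue were run from a shell inside the etcd static pod on the first master, e.g. with something like the command below; the pod name assumes kubeadm's usual etcd-<nodeName> naming.)

kubectl -n kube-system exec -it etcd-demomaster1test -- sh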

The first master is demomaster1test (192.168.33.10); the second master to be added is demomaster2test (192.168.33.20).

The join output on the second control-plane node shows that it successfully adds the second etcd member to the cluster using the correct address, then receives back a member list containing the localhost peer address of the first member, and eventually times out:

[root@demomaster2test ~]# kubeadm join --v 5 --discovery-token ... --discovery-token-ca-cert-hash sha256:... --certificate-key ... --control-plane --apiserver-bind-port 443 demomaster1test:443
...
[check-etcd] Checking that the etcd cluster is healthy
I1202 10:53:45.999198    7391 local.go:66] [etcd] Checking etcd cluster health
I1202 10:53:45.999206    7391 local.go:69] creating etcd client that connects to etcd pods
I1202 10:53:46.009155    7391 etcd.go:106] etcd endpoints read from pods: https://192.168.33.10:2379
I1202 10:53:46.019954    7391 etcd.go:147] etcd endpoints read from etcd: https://192.168.33.10:2379
I1202 10:53:46.020014    7391 etcd.go:124] update etcd endpoints: https://192.168.33.10:2379
I1202 10:53:46.038590    7391 kubelet.go:105] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I1202 10:53:46.094663    7391 kubelet.go:131] [kubelet-start] Stopping the kubelet
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
I1202 10:53:46.157154    7391 kubelet.go:148] [kubelet-start] Starting the kubelet
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I1202 10:53:46.752724    7391 kubelet.go:166] [kubelet-start] preserving the crisocket information for the node
I1202 10:53:46.752743    7391 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "demomaster2test" as an annotation
I1202 10:54:07.768179    7391 local.go:118] creating etcd client that connects to etcd pods
I1202 10:54:07.773539    7391 etcd.go:106] etcd endpoints read from pods: https://192.168.33.10:2379
I1202 10:54:07.785011    7391 etcd.go:147] etcd endpoints read from etcd: https://192.168.33.10:2379
I1202 10:54:07.785033    7391 etcd.go:124] update etcd endpoints: https://192.168.33.10:2379
I1202 10:54:07.785094    7391 local.go:127] Adding etcd member: https://192.168.33.20:2380
[etcd] Announced new etcd member joining to the existing etcd cluster
I1202 10:54:07.822398    7391 local.go:133] Updated etcd member list: [{demomaster1test https://127.0.0.1:2380} {demomaster2test https://192.168.33.20:2380}]
I1202 10:54:07.822411    7391 local.go:135] Creating local etcd static pod manifest file
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
I1202 10:54:07.823387    7391 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.33.10:2379 https://192.168.33.20:2379]) are available 1/8
I1202 10:54:12.849166    7391 etcd.go:356] [etcd] Attempt timed out
I1202 10:54:12.849182    7391 etcd.go:348] [etcd] Waiting 5s until next retry
I1202 10:54:17.849270    7391 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.33.10:2379 https://192.168.33.20:2379]) are available 2/8
I1202 10:54:22.882184    7391 etcd.go:356] [etcd] Attempt timed out
I1202 10:54:22.882199    7391 etcd.go:348] [etcd] Waiting 5s until next retry
[kubelet-check] Initial timeout of 40s passed.
...
I1202 10:55:13.089881    7391 etcd.go:356] [etcd] Attempt timed out
I1202 10:55:13.089899    7391 etcd.go:348] [etcd] Waiting 5s until next retry
I1202 10:55:18.090404    7391 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.33.10:2379 https://192.168.33.20:2379]) are available 8/8
I1202 10:55:23.110043    7391 etcd.go:356] [etcd] Attempt timed out
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available

The logs of the first etcd show the second member being added; the first etcd then cannot reach the new peer (its health checks are refused), loses quorum and steps down, and is eventually terminated:

2019-11-29 13:37:10.254481 I | etcdserver/membership: added member da895d82fb090550 [https://192.168.33.20:2380] to cluster c9be114fc2da2776
2019-11-29 13:37:10.254516 I | rafthttp: starting peer da895d82fb090550...
2019-11-29 13:37:10.254535 I | rafthttp: started HTTP pipelining with peer da895d82fb090550
2019-11-29 13:37:10.258325 I | rafthttp: started peer da895d82fb090550
2019-11-29 13:37:10.258475 I | rafthttp: added peer da895d82fb090550
...
2019-11-29 13:37:12.164521 W | raft: a874c87fd42044f stepped down to follower since quorum is not active
...
2019-11-29 13:37:15.259627 W | rafthttp: health check for peer da895d82fb090550 could not connect: dial tcp 192.168.33.20:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
...
2019-11-29 13:38:27.923943 N | pkg/osutil: received terminated signal, shutting down...

On the second etcd we get this:

{"log":"2019-11-29 13:40:36.795497 I | etcdmain: etcd Version: 3.3.10\n","stream":"stderr","time":"2019-11-29T13:40:36.79609435Z"}
{"log":"2019-11-29 13:40:36.795961 I | etcdmain: Git SHA: 27fc7e2\n","stream":"stderr","time":"2019-11-29T13:40:36.796147993Z"}
{"log":"2019-11-29 13:40:36.795967 I | etcdmain: Go Version: go1.10.4\n","stream":"stderr","time":"2019-11-29T13:40:36.796154842Z"}
{"log":"2019-11-29 13:40:36.795970 I | etcdmain: Go OS/Arch: linux/amd64\n","stream":"stderr","time":"2019-11-29T13:40:36.796158628Z"}
{"log":"2019-11-29 13:40:36.795973 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2\n","stream":"stderr","time":"2019-11-29T13:40:36.796161734Z"}
{"log":"2019-11-29 13:40:36.796401 N | etcdmain: the server is already initialized as member before, starting as etcd member...\n","stream":"stderr","time":"2019-11-29T13:40:36.796559343Z"}
{"log":"2019-11-29 13:40:36.796454 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = \n","stream":"stderr","time":"2019-11-29T13:40:36.796579176Z"}
{"log":"2019-11-29 13:40:36.797542 I | embed: listening for peers on https://192.168.33.20:2380\n","stream":"stderr","time":"2019-11-29T13:40:36.797652533Z"}
{"log":"2019-11-29 13:40:36.797609 I | embed: listening for client requests on 127.0.0.1:2379\n","stream":"stderr","time":"2019-11-29T13:40:36.79773103Z"}
{"log":"2019-11-29 13:40:36.797710 I | embed: listening for client requests on 192.168.33.20:2379\n","stream":"stderr","time":"2019-11-29T13:40:36.797769883Z"}
{"log":"2019-11-29 13:40:36.799843 W | etcdserver: could not get cluster response from https://127.0.0.1:2380: Get https://127.0.0.1:2380/members: dial tcp 127.0.0.1:2380: connect: connection refused\n","stream":"stderr","time":"2019-11-29T13:40:36.799928537Z"}
{"log":"2019-11-29 13:40:36.800480 C | etcdmain: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given urls\n","stream":"stderr","time":"2019-11-29T13:40:36.800579653Z"}

The second etcd keeps trying to reach the first member at the localhost peer URL taken from the member list; on the second node, 127.0.0.1:2380 points at nothing, so the connection is refused and the process exits.
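
A quick way to confirm this from the second master (a sketch; it assumes nc is available, but any TCP connectivity check will do):

# on demomaster2test: the peer URL advertised in the member list points at localhost,
# where nothing is listening on this node
nc -zv 127.0.0.1 2380          # expected: connection refused
# the first member's peer port is reachable on its host IP instead
nc -zv 192.168.33.10 2380      # expected: connection succeeds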

The generated etcd.yaml manifest on the second master contains:

        - etcd
        - --advertise-client-urls=https://192.168.33.20:2379
        - --cert-file=/etc/kubernetes/pki/etcd/server.crt
        - --client-cert-auth=true
        - --data-dir=/var/lib/etcd
        - --initial-advertise-peer-urls=https://192.168.33.20:2380
        - --initial-cluster=demomaster1test=https://127.0.0.1:2380,demomaster2test=https://192.168.33.20:2380
        - --initial-cluster-state=existing
        - --key-file=/etc/kubernetes/pki/etcd/server.key
        - --listen-client-urls=https://127.0.0.1:2379,https://192.168.33.20:2379
        - --listen-peer-urls=https://192.168.33.20:2380
        - --name=demomaster2test

The --initial-cluster flag lists demomaster1test at https://127.0.0.1:2380, which is what produces the “connection refused” seen in the logs. Manually changing that value to https://192.168.33.10:2380 produces the following in the logs instead:

{"log":"2019-11-29 14:05:40.168103 I | etcdmain: etcd Version: 3.3.10\n","stream":"stderr","time":"2019-11-29T14:05:40.173075927Z"}
{"log":"2019-11-29 14:05:40.168157 I | etcdmain: Git SHA: 27fc7e2\n","stream":"stderr","time":"2019-11-29T14:05:40.173107421Z"}
{"log":"2019-11-29 14:05:40.168161 I | etcdmain: Go Version: go1.10.4\n","stream":"stderr","time":"2019-11-29T14:05:40.173111048Z"}
{"log":"2019-11-29 14:05:40.168163 I | etcdmain: Go OS/Arch: linux/amd64\n","stream":"stderr","time":"2019-11-29T14:05:40.17311336Z"}
{"log":"2019-11-29 14:05:40.168166 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2\n","stream":"stderr","time":"2019-11-29T14:05:40.173115555Z"}
{"log":"2019-11-29 14:05:40.168203 N | etcdmain: the server is already initialized as member before, starting as etcd member...\n","stream":"stderr","time":"2019-11-29T14:05:40.173117812Z"}
{"log":"2019-11-29 14:05:40.168377 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = \n","stream":"stderr","time":"2019-11-29T14:05:40.173121239Z"}
{"log":"2019-11-29 14:05:40.169662 I | embed: listening for peers on https://192.168.33.20:2380\n","stream":"stderr","time":"2019-11-29T14:05:40.173123704Z"}
{"log":"2019-11-29 14:05:40.169707 I | embed: listening for client requests on 127.0.0.1:2379\n","stream":"stderr","time":"2019-11-29T14:05:40.173125763Z"}
{"log":"2019-11-29 14:05:40.169732 I | embed: listening for client requests on 192.168.33.20:2379\n","stream":"stderr","time":"2019-11-29T14:05:40.173127846Z"}
{"log":"2019-11-29 14:05:40.195800 C | etcdmain: error validating peerURLs {ClusterID:c9be114fc2da2776 Members:[\u0026{ID:4654b06da302d871 RaftAttributes:{PeerURLs:[https://192.168.33.20:2380]} Attributes:{Name: ClientURLs:[]}} \u0026{ID:a874c87fd42044f RaftAttributes:{PeerURLs:[https://127.0.0.1:2380]} Attributes:{Name:demomaster1test ClientURLs:[https://192.168.33.10:2379]}}] RemovedMemberIDs:[]}: unmatched member while checking PeerURLs (\"https://127.0.0.1:2380\"(resolved from \"https://127.0.0.1:2380\") != \"https://192.168.33.10:2380\"(resolved from \"https://192.168.33.10:2380\"))\n","stream":"stderr","time":"2019-11-29T14:05:40.195988804Z"}

Now the peer address in the manifest no longer matches the member list stored in etcd, so the new member aborts. Either way, etcd ends up down on both control-plane nodes, the apiserver becomes unavailable as a consequence, and the entire cluster is bricked.
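
For reference, the manual edit mentioned above amounted to changing the first entry of the --initial-cluster line in /etc/kubernetes/manifests/etcd.yaml on the second master to roughly:

        - --initial-cluster=demomaster1test=https://192.168.33.10:2380,demomaster2test=https://192.168.33.20:2380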

A possible workaround is to update the first member’s peer address to its host IP before adding a second master, like this:

/ # etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member update a874c87fd42044f --peer-urls=https://192.168.33.10:2380
Member  a874c87fd42044f updated in cluster c9be114fc2da2776

/ # etcdctl --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/peer.crt --key /etc/kubernetes/pki/etcd/peer.key member list -w table
+-----------------+---------+-----------------+----------------------------+----------------------------+
|       ID        | STATUS  |      NAME       |         PEER ADDRS         |        CLIENT ADDRS        |
+-----------------+---------+-----------------+----------------------------+----------------------------+
| a874c87fd42044f | started | demomaster1test | https://192.168.33.10:2380 | https://192.168.33.10:2379 |
+-----------------+---------+-----------------+----------------------------+----------------------------+

After doing so I was able to add a second master.

What you expected to happen?

The peer address of the first etcd member should have been updated to the host IP, either as part of an etcd upgrade or when adding the second control-plane node.

How to reproduce it (as minimally and precisely as possible)?

Adapted from the instructions at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/ (a rough sketch of the commands for steps 3-7 follows the list):

  1. Find or set up a cluster created with kubeadm 1.12 (or older).
  2. Upgrade it all the way to 1.15.
  3. Add controlPlaneEndpoint with the IP and port of a load balancer to the kubeadm config file and upload it to the kubeadm-config ConfigMap in kube-system.
  4. Recreate the apiserver certificates so they include the load balancer IP.
  5. Restart the apiserver.
  6. Upload the control-plane certificates to the secret in kube-system.
  7. Join a second control-plane node.
  8. Watch your cluster go down, never to recover again (?).
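
A rough sketch of the commands behind steps 3-7, run on the first master unless noted. File names, the load balancer address LB_IP:LB_PORT, the tokens, and the certificate key are placeholders, and the exact procedure may differ per setup:

# step 3: add controlPlaneEndpoint to the ClusterConfiguration in the kubeadm-config ConfigMap
kubectl -n kube-system edit configmap kubeadm-config
#   ...and set: controlPlaneEndpoint: "LB_IP:LB_PORT"

# step 4: regenerate the apiserver certificate so its SANs include the load balancer address
mv /etc/kubernetes/pki/apiserver.{crt,key} /tmp/
kubeadm init phase certs apiserver --config kubeadm.yaml   # kubeadm.yaml contains controlPlaneEndpoint

# step 5: restart the apiserver by recreating its static pod
mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 20
mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/

# step 6: upload the control-plane certificates; note the certificate key it prints
kubeadm init phase upload-certs --upload-certs

# step 7: join the new control-plane node (run on demomaster2test)
kubeadm join --discovery-token ... --discovery-token-ca-cert-hash sha256:... \
  --certificate-key ... --control-plane LB_IP:LB_PORT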

Anything else we need to know?

Most upvoted comments

Going back to this issue and the PR https://github.com/kubernetes/kubernetes/pull/86150, I would much rather have a WARNING in the “kubeadm upgrade” docs that explains how to use etcdctl to update a member that came from 1.12.

Otherwise we have to backport the PR to all branches in the current support skew (1.15, 1.16, 1.17), and we cannot backport to versions older than that because they are out of support.

Aside from this bug, I don’t think MemberUpdate() is currently needed, and for upgrades it can be considered reconfiguration, which is something upgrades should not do.
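
For what it’s worth, such a warning could include a snippet along these lines (a sketch based on the workaround above; the pod name, member ID, and host IP are placeholders):

# on a cluster originally created with kubeadm <= 1.12, fix the first member's peer URL
# before joining additional control-plane nodes
kubectl -n kube-system exec etcd-<nodeName> -- sh -c \
  'ETCDCTL_API=3 etcdctl \
     --cacert /etc/kubernetes/pki/etcd/ca.crt \
     --cert /etc/kubernetes/pki/etcd/peer.crt \
     --key /etc/kubernetes/pki/etcd/peer.key \
     member update <MEMBER_ID> --peer-urls=https://<host-ip>:2380'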