kubernetes: Unable to upgrade cluster more than 2 times using kubeadm.

What happened?

I deployed a 10-node cluster - 3 control-plane nodes and 7 worker nodes - with an external 3-node etcd cluster running on the control-plane nodes. The initial deployment was v1.20.13, done with a Bash script I wrote after going through Kubernetes the hard way.

I then upgraded the cluster to v1.21.7 using another Bash script. That did not go so well. The upgrade itself reported success, but the static pod manifests were still pointing at the old version - i.e. kubectl version showed the client at v1.21.7 while the server was still at v1.20.13. I went into the /etc/kubernetes/manifests directory, moved the manifests out, changed the image: key to point to the new version, then moved them back to their proper location, after which client and server reported the same version. Good. I added that step to my script so I would not have to do it by hand again, and life was good until I went to upgrade to v1.22.4.
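The manual fix I scripted looks roughly like the snippet below. It is shown here against a scratch copy of a manifest rather than the live /etc/kubernetes/manifests (the image name and versions are just examples); on a real node you would also move the files out of and back into the manifests directory so the kubelet restarts the static pods from the edited files.

```shell
# Demo of the image rewrite against a scratch directory; the real files
# live in /etc/kubernetes/manifests and must be moved out and back so
# the kubelet restarts the static pods from the edited manifests.
set -eu
WORK=$(mktemp -d)
OLD=v1.20.13
NEW=v1.21.7

# Stand-in for kube-apiserver.yaml
cat > "$WORK/kube-apiserver.yaml" <<EOF
spec:
  containers:
  - name: kube-apiserver
    image: k8s.gcr.io/kube-apiserver:$OLD
EOF

# Rewrite only the image: lines pinned to the old version
sed -i "s|\(image: .*\):$OLD|\1:$NEW|" "$WORK"/kube-*.yaml
grep 'image:' "$WORK"/kube-*.yaml
```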

My first attempt to upgrade to v1.22.4 went as below:

ATTEMPT 1: kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.21.7
[upgrade/versions] kubeadm version: v1.22.4
[upgrade/versions] Target version: v1.22.4
[upgrade/versions] Latest version in the v1.21 series: v1.21.7

[upgrade/versions] FATAL: configmaps "kubelet-config-1.21" not found
To see the stack trace of this error execute with --v=5 or higher

I then tried something interesting:

# kubectl -n kube-system get cm kubelet-config-1.20 -oyaml | sed 's/1.20/1.21/' | kubectl apply -f -
configmap/kubelet-config-1.21 created

Obviously I do not know the implications of my actions - I’m trying stuff to see what sticks…
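In hindsight, a more targeted version of that sed (demonstrated on a scratch file rather than against the live cluster) would rewrite only the ConfigMap's metadata name, leaving everything else in the YAML untouched:

```shell
# Scratch-file demo: rename only the ConfigMap's metadata.name instead
# of the blanket 's/1.20/1.21/', which would also rewrite any other
# "1.20" occurrence in the YAML.
set -eu
CM=$(mktemp)
cat > "$CM" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubelet-config-1.20
  namespace: kube-system
EOF
sed -i 's/name: kubelet-config-1\.20/name: kubelet-config-1.21/' "$CM"
grep 'name: kubelet-config' "$CM"
```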

I then tried the upgrade plan again:

ATTEMPT 2: kubeadm upgrade plan
# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.21.7
[upgrade/versions] kubeadm version: v1.22.4
[upgrade/versions] Target version: v1.22.4
[upgrade/versions] Latest version in the v1.21 series: v1.21.7

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT        TARGET
kubelet     10 x v1.21.7   v1.22.4

Upgrade to the latest stable version:

COMPONENT                 CURRENT   TARGET
kube-apiserver            v1.21.7   v1.22.4
kube-controller-manager   v1.21.7   v1.22.4
kube-scheduler            v1.21.7   v1.22.4
kube-proxy                v1.21.7   v1.22.4
CoreDNS                   1.7.0     v1.8.4

You can now apply the upgrade by executing the following command:

        kubeadm upgrade apply v1.22.4

_____________________________________________________________________


The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.

API GROUP                 CURRENT VERSION   PREFERRED VERSION   MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io   v1alpha1          v1alpha1            no
kubelet.config.k8s.io     v1beta1           v1beta1             no
_____________________________________________________________________

What did you expect to happen?

I expected kubeadm upgrade apply to work reliably from an initial deployment of Kubernetes v1.20.13 then to v1.21.7 and finally to v1.22.4 without requiring manual intervention as seen above.

How can we reproduce it (as minimally and precisely as possible)?

Deploy a Kubernetes cluster at v1.20.13 using kubeadm, then attempt to upgrade it to v1.22.4 using kubeadm.
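Worth noting for the repro: kubeadm documents that upgrades are supported only from one minor release to the next, so a direct v1.20.13 to v1.22.4 jump has to pass through a v1.21.x release first (patch releases within a minor can be skipped freely). A toy check illustrating the rule; the can_upgrade helper is purely illustrative and not part of kubeadm:

```shell
# Illustrative-only helper encoding kubeadm's one-minor-version skew
# rule: the target minor may be at most one ahead of the current minor.
minor() { echo "${1#v}" | cut -d. -f2; }
can_upgrade() {
  [ $(( $(minor "$2") - $(minor "$1") )) -le 1 ]
}
can_upgrade v1.20.13 v1.21.7 && echo "v1.20 -> v1.21: supported"
can_upgrade v1.20.13 v1.22.4 || echo "v1.20 -> v1.22: go through v1.21 first"
```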

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7", GitCommit:"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4", GitTreeState:"clean", BuildDate:"2021-11-17T14:41:19Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.7", GitCommit:"1f86634ff08f37e54e8bfcd86bc90b61c98f84d4", GitTreeState:"clean", BuildDate:"2021-11-17T14:35:38Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

N/A

OS version

# On Linux:
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

$ uname -a
Linux node1 5.4.0-91-generic #102-Ubuntu SMP Fri Nov 5 16:31:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

# On Windows:
N/A

Install tools

N/A

Container runtime (CRI) and version (if applicable)

containerd://1.4.12

Related plugins (CNI, CSI, …) and versions (if applicable)

N/A

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 24 (12 by maintainers)

Most upvoted comments

@neolit123 - I’m going to close this one in the next 2 days to give you time to respond. Feel free to close it as soon as you respond.

A summary of questions:
Q1: What are the implications of ignoring all the preflight errors? Will this come back to haunt me later on?
Q2: I was able to jump from v1.19.15 to v1.20.13, skipping over v1.19.16 - does this mean there are some exceptions?

Thanks to you and @pacoxu for guiding me to find my mistake. Much appreciated.