kubeadm: kubeadm upgrade apply failure with same kubernetes version

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): v1.11.2

Environment:

  • Kubernetes version (use kubectl version): v1.11.2
  • Cloud provider or hardware configuration: AWS but no cloud provider configured
  • OS (e.g. from /etc/os-release): CoreOS 1800.6.0
  • Kernel (e.g. uname -a): Linux ip-172-31-35-161.eu-west-1.compute.internal 4.14.59-coreos-r2 #1 SMP Sat Aug 4 02:49:25 UTC 2018 x86_64 Intel® Xeon® CPU E5-2686 v4 @ 2.30GHz GenuineIntel GNU/Linux
  • Others: CRI runtime: containerd v1.1.2

What happened?

Hi, I’m trying to upgrade my control plane with custom flags, just to test. (I’m working on an Ansible solution to implement kubeadm HA.)

To upgrade, I first run kubeadm upgrade diff --config kubeadm-config.yaml to preview the changes and then apply them with kubeadm upgrade apply --config kubeadm-config.yaml.
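
Just to illustrate, the config and commands look roughly like this (a sketch: the v1alpha2 MasterConfiguration format is what kubeadm v1.11 reads, and the audit-log-path flag here is only an example, not the actual flag I added):

cat <<'EOF' > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.2
apiServerExtraArgs:
  # any extra kube-apiserver flag can go here; this one is purely illustrative
  audit-log-path: /var/log/kubernetes/kube-apiserver-audit.log
EOF

# preview the manifest changes, then apply them
kubeadm upgrade diff --config kubeadm-config.yaml
kubeadm upgrade apply --config kubeadm-config.yaml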

For now I have only added a flag to kube-apiserver. I can see it in the diff, but kubeadm still tries to restart the controller-manager and the scheduler as well, and also shows some diff between the manifests (basically volumeMount entries moving up and down the file compared to the file generated by kubeadm init). So it tries to restart all three components. That seems to work, but it gets stuck when trying to restart the scheduler, even though the scheduler is actually running and just waiting to acquire its lease.
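
To confirm that the scheduler really is up and only waiting on leader election, I check its static pod and the leader-election record (a rough sketch; in this release the scheduler records its leader in an annotation on the kube-scheduler Endpoints object, if I remember correctly):

# the static pod itself is Running
kubectl -n kube-system get pod kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal

# the current leader-election holder, recorded as an Endpoints annotation
kubectl -n kube-system get endpoints kube-scheduler -o yaml \
  | grep 'control-plane.alpha.kubernetes.io/leader'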

I think this issue might be related to https://github.com/kubernetes/kubernetes/issues/65071, and it might be because the hash is not changing, since it is the same Kubernetes version and no changes have been made to the pod.
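
If I understand it right, the hash kubeadm keeps printing is derived from the mirror pod the kubelet creates for the static pod. One quick way to peek at the kubelet’s own hash annotation for comparison (my assumption; the exact hash kubeadm computes may differ between releases):

# the mirror pod carries the kubelet's config hash as an annotation
kubectl -n kube-system get pod kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal -o yaml \
  | grep 'kubernetes.io/config.hash'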

Is this the proper way to modify the cluster configuration on an already-bootstrapped cluster?

What you expected to happen?

I expect the control plane components to be restarted with the flags/config I added to the kubeadm config file.

How to reproduce it (as minimally and precisely as possible)?

  • Bootstrap a cluster with kubeadm
  • Add a custom flag, for example to the apiserver, in the kubeadm config
  • Run kubeadm upgrade apply --config kubeadm-config.yaml
  • Wait until it times out:
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1                                                       
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1                                                       
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1                                                       
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]

Anything else we need to know?

When the version actually changes, the upgrade completes successfully: for example, I downgraded to 1.11.1 and then upgraded back to 1.11.2 without any problem.

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 5
  • Comments: 19 (1 by maintainers)


Most upvoted comments

My workaround was the same (downgrade then upgrade).
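
Roughly, the downgrade-then-upgrade workaround looks like this (a sketch, assuming the target version is taken from kubernetesVersion in the config file; adjust the file name and versions to your cluster):

# step down one patch release first (--force may be needed to allow the downgrade)
sed -i 's/^kubernetesVersion: .*/kubernetesVersion: v1.11.1/' kubeadm-config.yaml
kubeadm upgrade apply --config kubeadm-config.yaml --force

# then re-apply the real target version; the changed image tags make the static pod hashes change
sed -i 's/^kubernetesVersion: .*/kubernetesVersion: v1.11.2/' kubeadm-config.yaml
kubeadm upgrade apply --config kubeadm-config.yaml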

@bart0sh I’ve just managed to build the binary from PR #69886 and test it. I’ve followed these steps:

  1. Start with a cluster at version v1.12.1:
# kubectl version --short
Client Version: v1.12.1
Server Version: v1.12.1

# rpm -qa | egrep '^kube'
kubectl-1.12.1-2.x86_64
kubelet-1.12.1-2.x86_64
kubeadm-1.12.1-2.x86_64
  2. Put the new version of kubeadm in /tmp/kubeadm.PR69886.
  3. Used this new kubeadm to write the config to a file (kubeadm config view seems to be broken in v1.12.1, as mentioned in #1174):
/tmp/kubeadm.PR69886 config view > /etc/kubeadm.yaml
  4. Ran kubeadm v1.12.1 to make sure it still ends with an error:
# kubeadm upgrade apply --config /etc/kubeadm.yaml 
...
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s
Static pod: kube-scheduler-ksb-m1.grey hash: 2117f54c43e401f807b7c9744c2a63be
...
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]
  5. Tested with the kubeadm build from PR #69886:
# /tmp/kubeadm.PR69886 upgrade apply --config /etc/kubeadm.yaml --force
...
[upgrade/staticpods] current and new manifests of kube-apiserver are equal, skipping upgrade
...
[upgrade/staticpods] current and new manifests of kube-scheduler are equal, skipping upgrade
...
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.12.1". Enjoy!

So it seems to me that PR #69886 does fix this issue.

@ArchiFleKs

For now I have only added a flag to kube-apiserver. I can see it in the diff, but kubeadm still tries to restart the controller-manager and the scheduler as well, and also shows some diff between the manifests (basically volumeMount entries moving up and down the file compared to the file generated by kubeadm init).

This should be fixed by this PR

@ttarczynski could you check whether this PR solves the issue for you?

@ttarczynski great. Thanks for that. I think we found a different regression because of this 😃