kubeadm: kubeadm upgrade apply failure with same kubernetes version

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): v1.11.2

Environment:

  • Kubernetes version (use kubectl version): v1.11.2
  • Cloud provider or hardware configuration: AWS but no cloud provider configured
  • OS (e.g. from /etc/os-release): CoreOS 1800.6.0
  • Kernel (e.g. uname -a): Linux ip-172-31-35-161.eu-west-1.compute.internal 4.14.59-coreos-r2 #1 SMP Sat Aug 4 02:49:25 UTC 2018 x86_64 Intel® Xeon® CPU E5-2686 v4 @ 2.30GHz GenuineIntel GNU/Linux
  • Others: CRI runtime: containerd v1.1.2

What happened?

Hi, I’m trying to upgrade my control plane with custom flags, just to test. (I’m working on an Ansible solution to implement kubeadm HA.)

To upgrade, I first run kubeadm upgrade diff --config kubeadm-config.yaml to preview the changes and then apply them with kubeadm upgrade apply --config kubeadm-config.yaml.
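
Just to illustrate, the config and commands look roughly like this (a sketch: the v1alpha2 MasterConfiguration format is what kubeadm v1.11 reads, and the audit-log-path flag here is only an example, not the actual flag I added):

cat <<'EOF' > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
kubernetesVersion: v1.11.2
apiServerExtraArgs:
  # any extra kube-apiserver flag can go here; this one is purely illustrative
  audit-log-path: /var/log/kubernetes/kube-apiserver-audit.log
EOF

# preview the manifest changes, then apply them
kubeadm upgrade diff --config kubeadm-config.yaml
kubeadm upgrade apply --config kubeadm-config.yaml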

For now I have only added a flag to kube-apiserver. I can see it in the diff, but kubeadm still tries to restart the controller-manager and the scheduler as well, and also shows some diff between the manifests (basically volumeMount entries moving up and down the file compared to the file generated by kubeadm init). So it tries to restart all three components. That seems to work, but it gets stuck when trying to restart the scheduler, even though the scheduler is actually running and just waiting to acquire its lease.
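
To confirm that the scheduler really is up and only waiting on leader election, I check its static pod and the leader-election record (a rough sketch; in this release the scheduler records its leader in an annotation on the kube-scheduler Endpoints object, if I remember correctly):

# the static pod itself is Running
kubectl -n kube-system get pod kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal

# the current leader-election holder, recorded as an Endpoints annotation
kubectl -n kube-system get endpoints kube-scheduler -o yaml \
  | grep 'control-plane.alpha.kubernetes.io/leader'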

I think this issue might be related to https://github.com/kubernetes/kubernetes/issues/65071, and it might be because the hash is not changing, since it is the same Kubernetes version and no changes have been made to the pod.
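
If I understand it right, the hash kubeadm keeps printing is derived from the mirror pod the kubelet creates for the static pod. One quick way to peek at the kubelet’s own hash annotation for comparison (my assumption; the exact hash kubeadm computes may differ between releases):

# the mirror pod carries the kubelet's config hash as an annotation
kubectl -n kube-system get pod kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal -o yaml \
  | grep 'kubernetes.io/config.hash'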

Is this the proper way to modify the cluster configuration on an already-bootstrapped cluster?

What you expected to happen?

I expect the control plane components to be restarted with the flags/config I added to the kubeadm config file.

How to reproduce it (as minimally and precisely as possible)?

  • Bootstrap a cluster with kubeadm
  • Add a custom flag, for example to the apiserver, in the kubeadm config
  • Run kubeadm upgrade apply --config kubeadm-config.yaml
  • Wait until it times out:
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1                                                       
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1                                                       
Static pod: kube-scheduler-ip-172-31-35-161.eu-west-1.compute.internal hash: a00c35e56ebd0bdfcd77d53674a5d2a1                                                       
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]

Anything else we need to know?

When the version actually changes, the upgrade completes successfully: for example, I downgraded to 1.11.1 and then upgraded back to 1.11.2 without any problem.

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 5
  • Comments: 19 (1 by maintainers)


Most upvoted comments

My workaround was the same (downgrade then upgrade).
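
Roughly, the downgrade-then-upgrade workaround looks like this (a sketch, assuming the target version is taken from kubernetesVersion in the config file; adjust the file name and versions to your cluster):

# step down one patch release first (--force may be needed to allow the downgrade)
sed -i 's/^kubernetesVersion: .*/kubernetesVersion: v1.11.1/' kubeadm-config.yaml
kubeadm upgrade apply --config kubeadm-config.yaml --force

# then re-apply the real target version; the changed image tags make the static pod hashes change
sed -i 's/^kubernetesVersion: .*/kubernetesVersion: v1.11.2/' kubeadm-config.yaml
kubeadm upgrade apply --config kubeadm-config.yaml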

@bart0sh I’ve just managed to build the binary from PR #69886 and test it. I’ve followed these steps:

  1. Start with a cluster at version v1.12.1:
# kubectl version --short
Client Version: v1.12.1
Server Version: v1.12.1

# rpm -qa | egrep '^kube'
kubectl-1.12.1-2.x86_64
kubelet-1.12.1-2.x86_64
kubeadm-1.12.1-2.x86_64
  2. Put the new version of kubeadm in /tmp/kubeadm.PR69886.
  3. Used this new kubeadm to write the config to a file (kubeadm config view seems to be broken in v1.12.1, as mentioned in #1174):
/tmp/kubeadm.PR69886 config view > /etc/kubeadm.yaml
  4. Ran kubeadm v1.12.1 to make sure it still ends with an error:
# kubeadm upgrade apply --config /etc/kubeadm.yaml 
...
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s
Static pod: kube-scheduler-ksb-m1.grey hash: 2117f54c43e401f807b7c9744c2a63be
...
[upgrade/apply] FATAL: couldn't upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [timed out waiting for the condition]
  5. Tested with the kubeadm build from PR #69886:
# /tmp/kubeadm.PR69886 upgrade apply --config /etc/kubeadm.yaml --force
...
[upgrade/staticpods] current and new manifests of kube-apiserver are equal, skipping upgrade
...
[upgrade/staticpods] current and new manifests of kube-scheduler are equal, skipping upgrade
...
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.12.1". Enjoy!

So it seems to me that PR #69886 does fix this issue.

@ArchiFleKs

For now I have only added a flag to kube-apiserver. I can see it in the diff, but kubeadm still tries to restart the controller-manager and the scheduler as well, and also shows some diff between the manifests (basically volumeMount entries moving up and down the file compared to the file generated by kubeadm init).

This should be fixed by this PR

@ttarczynski could you check whether this PR solves the issue for you?

@ttarczynski great. Thanks for that. I think we found a different regression because of this 😃