kubeadm: When running a single control plane node cluster, `kubeadm upgrade` hangs after printing `[addons] Applied essential addon: CoreDNS`

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): 1.15.3-00 and 1.16.7-00

Environment:

Kubernetes version (use kubectl version): Already upgraded past what it was, sorry
Cloud provider or hardware configuration:
OS (e.g. from /etc/os-release): Ubuntu 16.04.6 LTS
Kernel (e.g. uname -a): Linux debug-6 4.4.0-169-generic #198-Ubuntu SMP Tue Nov 12 10:38:00 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Others:

What happened?

Upgraded from 1.15 to 1.16 and 1.16 to 1.17 as per instructions in https://v1-16.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/ and https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/.

The upgrade hung during step (4) of https://v1-16.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/#upgrading-control-plane-nodes. The last thing kubeadm upgrade printing [addons] Applied essential addon: CoreDNS`. I let it sit there for 30 minutes and then skipped ahead to step (6) and uncordoned the control plane node. After uncordoning the node, it completed within a few minutes.

What you expected to happen?

I expected it to complete within half an hour.

How to reproduce it (as minimally and precisely as possible)?

I suspect if you run a single control plane cluster and follow the control plane upgrade steps, it’ll repro. It looks like it will wait until coredns is ready. This will never happen because the single node in the cluster is drained, as per step (2) in https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/#upgrade-the-first-control-plane-node. So it waits until the timeout, which seems to initially be 5 seconds with exponential backoff up to 10 times or so. I think in aggregate that’s over an hour, which would explain why it seemed to me to hang for half an hour.

Anything else we need to know?

We’re running a single control plane cluster as per https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/, which seems to be the primary culprit here.

About this issue

Original URL
State: closed
Created 4 years ago
Reactions: 7
Comments: 43 (20 by maintainers)

Most upvoted comments

I am also getting the error, When running a single control plane cluster, kubeadm upgrade hangs after printing [addons] Applied essential addon: CoreDNS.I am trying to upgrade from v1.17.2 to v1.17.4.

anjibabueluri on Apr 12, 2020

Drained all nodes and upgrade each individually was successful

blacksailer on Apr 1, 2020

I reproduced this using a single node k8s (one node serves as both control plane and worker) when upgrading from v1.15.0 to v1.16.12. I applied the following command to apply upgrade:

apt-get update && \ apt-get install -y --allow-change-held-packages kubeadm=1.16.12-00 kubeadm upgrade plan kubectl drain test-node --ignore-daemonsets --delete-local-data kubeadm upgrade apply v1.16.12

After applyting the upgrade apply command, CLI stucked on “[addons] Applied essential addon: CoreDNS”.

Using the upgrade apply command with argument -v=10 indicates kubeadm is waiting a CoreDNS pod to start. However this is a single node k8s, the only node in it is already drained, therefore no CoreDNS can be started. I wonder if this is the root cause of the issue.

I did an experiment to prove my assumption: when the upgrade process stucks on “[addons] Applied essential addon: CoreDNS”, I uncordoned the only node in k8s, then the upgrade process finished successfully.

ethernoy on Jul 6, 2020

Not sure why this is marked as fixed since it actually requires ALL nodes to be drained not control plane only. That’s what hangs process on single node control plane. I have worker node still schedulable and hence I guess original fix was never run. This fixed it https://github.com/kubernetes/kubeadm/issues/2035#issuecomment-607330985

artisticcheese on Apr 9, 2020