kubernetes: Error assigning spec.podCIDR to nodes

What happened: kube-controller-manager got into a state where it was not assigning a value to spec.podCIDR on newly added nodes. This occurred during an upgrade from 1.15 to 1.16.

What you expected to happen: kube-controller-manager should assign a podCIDR to each newly added node.

How to reproduce it (as minimally and precisely as possible): We have been unable to systematically reproduce this but we saw the same behavior on at least 3 clusters.

Anything else we need to know?:

We replace instances with new ones during the upgrade process. Here’s the sequence of events that we observed on the 3 master nodes that we run (A, B, C):

  • At the start of the upgrade A was the leader for kube-controller-manager
  • A was torn down and B became the leader
  • A was upgraded from 1.15 to 1.16
  • B was torn down
  • A became the leader, running 1.16 while the remaining control plane components continued running 1.15

At this point, over the next 15-20 minutes, the controller manager running 1.16 logged the following error for each node in the cluster:

E0625 15:59:37.100676 1 range_allocator.go:364] Failed to update node node-8c-192-168-40-34.novalocal PodCIDR to [172.16.189.0/24] after multiple attempts: failed to patch node CIDR: Node "node-8c-192-168-40-34.novalocal" is invalid: [spec.podCIDR: Forbidden: node updates may not change podCIDR except from "" to valid, []: Forbidden: node updates may only change labels, taints, or capacity (or configSource, if the DynamicKubeletConfig feature gate is enabled)]

As worker nodes were replaced, they received names that were already in the list of failed nodes, and the controller manager made no attempt to assign a podCIDR to them, causing Calico to fail at startup.
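For illustration, here is a minimal Go sketch of that failure mode. It is not the actual kube-controller-manager code: the allocator type, its fields, and the idea that failed names are remembered indefinitely are assumptions, made only to show how a name-keyed skip can strand a replacement node that recycles an old name.

```go
package main

import "fmt"

// allocator is a hypothetical simplification: it remembers node names it
// has already tried to handle, keyed by name only.
type allocator struct {
	seen map[string]bool
}

func (a *allocator) allocateCIDR(nodeName string) {
	if a.seen[nodeName] {
		// A replacement node that reuses an old name lands here,
		// so no podCIDR assignment is ever attempted for it.
		fmt.Printf("skipping %s: already processed\n", nodeName)
		return
	}
	a.seen[nodeName] = true
	fmt.Printf("assigning podCIDR to %s\n", nodeName)
	// If the patch then fails, the name stays in `seen` anyway.
}

func main() {
	a := &allocator{seen: map[string]bool{}}
	a.allocateCIDR("node-8c-192-168-40-34.novalocal") // original node; patch fails
	a.allocateCIDR("node-8c-192-168-40-34.novalocal") // replacement node; skipped
}
```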

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration: OpenStack
  • OS (e.g: cat /etc/os-release):
    NAME="Oracle Linux Server"
    VERSION="7.8"
    ID="ol"
    ID_LIKE="fedora"
    VARIANT="Server"
    VARIANT_ID="server"
    VERSION_ID="7.8"
    PRETTY_NAME="Oracle Linux Server 7.8"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:oracle:linux:7:8:server"
    HOME_URL="https://linux.oracle.com/"
    BUG_REPORT_URL="https://bugzilla.oracle.com/"
    ORACLE_BUGZILLA_PRODUCT="Oracle Linux 7"
    ORACLE_BUGZILLA_PRODUCT_VERSION=7.8
    ORACLE_SUPPORT_PRODUCT="Oracle Linux"
    ORACLE_SUPPORT_PRODUCT_VERSION=7.8

  • Kernel (e.g. uname -a): 4.14.35-1902.303.5.3.el7uek.x86_64
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 23 (18 by maintainers)

Most upvoted comments

Interesting. @khenidak FYI

That error only comes from cases where someone is trying to change podCIDR. So somehow the controller fell into the path of needing to update the CIDR (which includes checking that len(node.Spec.PodCIDRs) == 0) and then found later that this was not true.

1.15 -> 1.16 is when podCIDR got pluralized - suspect.

Clue: "spec.podCIDR: Forbidden: node updates may not change podCIDR" - this means that the updated (pluralized) controller-manager was talking to the older, not-yet-updated apiserver (or else it would have said "spec.podCIDRs: Forbidden"). Note the "s" in "CIDRs".

The controller logic checks whether len(node.Spec.PodCIDRs) == 0, and of course it finds 0: the apiserver doesn't know that field, so it never stored it. The controller assumes the apiserver is updated, but it's not. So the controller sends a patch that includes both the singular and the plural field (good job), which the apiserver rejects because the singular is already set.
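To make the skew concrete, here is a hedged Go sketch of that flow. The struct stands in for the relevant part of v1.NodeSpec, the CIDR values are illustrative, and the apiserver's validation is collapsed to a single check; this is not the real code path.

```go
package main

import "fmt"

// nodeSpec stands in for the relevant part of v1.NodeSpec.
type nodeSpec struct {
	PodCIDR  string   // singular field; a 1.15 apiserver knows this one
	PodCIDRs []string // plural field added in 1.16; a 1.15 apiserver drops it
}

func main() {
	// What the 1.16 controller reads back from a 1.15 apiserver: the
	// singular field was persisted earlier, the plural one was never
	// stored, so it comes back empty.
	stored := nodeSpec{PodCIDR: "172.16.189.0/24"}

	// The 1.16 allocator keys its "needs a CIDR" check on the plural field.
	if len(stored.PodCIDRs) == 0 {
		// It allocates a fresh CIDR and patches both fields.
		patch := nodeSpec{
			PodCIDR:  "172.16.190.0/24",
			PodCIDRs: []string{"172.16.190.0/24"},
		}
		// The 1.15 apiserver validates against the singular field, which
		// is already set and may not change, so the patch is rejected.
		if stored.PodCIDR != "" && stored.PodCIDR != patch.PodCIDR {
			fmt.Println("spec.podCIDR: Forbidden: node updates may not change podCIDR")
		}
	}
}
```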

See updateCIDRsAllocation() in pkg/controller/nodeipam/ipam/range_allocator.go

@khenidak does this theory hold water? The fix isn’t obvious - if the apiserver isn’t updated, the controller can’t allocate the extra CIDR.