cluster-api: Mutating a control plane KubeadmConfig triggers indefinite rolling upgrade
What steps did you take and what happened:
- Configure a mutating admission webhook that modifies the KubeadmConfig for control plane machines.
- Observe that control plane machines will indefinitely try to upgrade.
What did you expect to happen:
Since matchInitOrJoinConfiguration compares the entire KubeadmConfig spec, any mutation applied to a KubeadmConfig (such as adding custom files or preKubeadmCommands) triggers a rolling upgrade of all control plane machines. This does not happen when mutating KubeadmConfig resources created from a KubeadmConfigTemplate, and likewise I would not expect it when mutating a KubeadmConfig derived from a KubeadmControlPlane. As another example, Kubernetes does not perform a rolling upgrade of a Deployment when the Pods derived from it are mutated in some way (e.g. sidecars injected for service meshes). A rolling upgrade is only triggered when the pod template changes.
A workaround is to mutate KubeadmControlPlane in the same way KubeadmConfig is mutated. However, if any mutations are dynamic or unique per machine, KubeadmControlPlane will still continuously try to perform a rolling upgrade of the control plane machines, since it expects the KubeadmControlPlane and KubeadmConfig specs to match exactly.
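To make the failure mode concrete, here is a minimal sketch of a whole-spec comparison. It is not the actual cluster-api code: `ConfigSpec`, `matchesWholeSpec`, and `mutateForMachine` are hypothetical, simplified stand-ins, and the use of `reflect.DeepEqual` is an assumption made only for illustration.

```go
package main

import "reflect"

// ConfigSpec is a hypothetical, heavily simplified stand-in for KubeadmConfigSpec.
type ConfigSpec struct {
	ClusterName        string
	PreKubeadmCommands []string
	Files              []string
}

// matchesWholeSpec is an assumed stand-in for a whole-spec comparison:
// any difference at all makes the machine look outdated.
func matchesWholeSpec(desired, actual ConfigSpec) bool {
	return reflect.DeepEqual(desired, actual)
}

// mutateForMachine mimics a mutating webhook that appends a
// machine-specific command. It copies the slice so the input is untouched.
func mutateForMachine(spec ConfigSpec, machineName string) ConfigSpec {
	out := spec
	cmds := append([]string{}, spec.PreKubeadmCommands...)
	out.PreKubeadmCommands = append(cmds, "echo "+machineName)
	return out
}
```

With this sketch, a spec mutated for one machine never matches the unmutated desired spec, and it also never matches the same spec mutated for a different machine. That second case is why applying the same mutation to KubeadmControlPlane does not help when the mutation is unique per machine.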
matchInitOrJoinConfiguration should only compare the fields that should trigger a rolling update, instead of all fields in KubeadmConfigSpec. More ideally, a hash of the last defined KubeadmConfig spec should be stored in KubeadmControlPlane to know when a rolling update should occur. This is the same mechanism used to determine whether a rolling update of a Deployment should occur.
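A rough sketch of the hash-based idea, under stated assumptions: `TemplateSpec`, `templateHash`, and `needsRollout` are hypothetical names, and hashing the JSON encoding with SHA-256 is just one possible choice, not what cluster-api or the Deployment controller actually does. The point is that the rollout decision depends only on the template, so later mutations of the generated KubeadmConfig are ignored.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// TemplateSpec is a hypothetical, simplified stand-in for the config
// template defined on the KubeadmControlPlane.
type TemplateSpec struct {
	KubernetesVersion  string   `json:"kubernetesVersion"`
	PreKubeadmCommands []string `json:"preKubeadmCommands,omitempty"`
}

// templateHash returns a hex-encoded SHA-256 digest of the JSON-encoded template.
func templateHash(spec TemplateSpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// needsRollout compares the current template hash with the hash that was
// recorded when the machine was created. The (possibly mutated) generated
// config is never consulted.
func needsRollout(current TemplateSpec, recordedHash string) (bool, error) {
	h, err := templateHash(current)
	if err != nil {
		return false, err
	}
	return h != recordedHash, nil
}
```

In this sketch the recorded hash would be stored at machine creation time (for example in an annotation, which is also an assumption), and a rollout is reported only when the template itself changes.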
Anything else you would like to add:
Environment:
- Cluster-api version:
- Minikube/KIND version:
- Kubernetes version: (use kubectl version):
- OS (e.g. from /etc/os-release):
/kind bug
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 17 (17 by maintainers)
Agreed, I’ll file one
Fine with me, just giving my 2 cents that this was pretty unexpected. As @yastij mentioned above, this disallows any mutations that end up being specific to a machine.
As a reference, this issue https://github.com/kubernetes-sigs/cluster-api/issues/3170 prompted us to move away from a hash mechanism to the current approach.