kubeone: Cluster autoscaling - new nodes cannot join control plane

What happened?

I followed the guide to spin up a kube one cluster with 3 control plane nodes, 2 worker nodes and then I also enabled autoscaling and it worked ok - I tested it

Then a few days later when I scaled it up (by increasing a deployment replicas) new machines are created, but they can’t join control plane.

Seems that is the case in 2 clusters - it’s fine on the first day, then later it won’t work Note: this is on hetzner Log says

Sep 16 10:47:06 dev-pool1-77c8b59b86-9xrgs kubelet[6923]: I0916 10:47:06.652377    6923 kubelet_node_status.go:70] "Attempting to register node" node="dev-pool1-77c8b59b86-9xrgs"


Sep 16 10:47:06 dev-pool1-77c8b59b86-9xrgs kubelet[6923]: E0916 10:47:06.655857    6923 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes is forbidden: User \"system:anonymous\" cannot create resource \"nodes\" in API group \"\" at the cluster scope" node="dev-pool1-77c8b59b86-9xrgs"

And that log just repeats with various flavours / components

Deleting the MachineDeployment and MachineSet and re-creating the MachineDeployment usually fixes it - but not really a practical solution I’m happy to investigate further, if you could point me in the right direction

Expected behavior

Auto scaling to continue working

How to reproduce the issue?

Spin up cluster Setup auto scaling Go away for 1-2 days Come back and try to scale

What KubeOne version are you using?

$ kubeone version
{
  "kubeone": {
    "major": "1",
    "minor": "5",
    "gitVersion": "1.5.0-beta.0",
    "gitCommit": "3800c3d9d244f76d915bb05aa26e45acafc32158",
    "gitTreeState": "",
    "buildDate": "2022-08-04T14:03:44Z",
    "goVersion": "go1.19",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "machine_controller": {
    "major": "1",
    "minor": "53",
    "gitVersion": "v1.53.0",
    "gitCommit": "",
    "gitTreeState": "",
    "buildDate": "",
    "goVersion": "",
    "compiler": "",
    "platform": "linux/amd64"
  }
}

Provide your KubeOneCluster manifest here (if applicable)

apiVersion: kubeone.k8c.io/v1beta2
kind: KubeOneCluster
versions:
  kubernetes: '1.24.4'
cloudProvider:
  hetzner: {}
  external: true
addons:
  enable: true
  addons:
  - name: cluster-autoscaler
  - name: unattended-upgrades

What cloud provider are you running on?

What operating system are you running in your cluster?

Ubuntu 20.04

Additional information

Happy to help investigate this further, but unsure where to look next

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 32 (12 by maintainers)

Most upvoted comments

I am experiencing the same on Hetzner, 1.5.0. Got it to scale by doing a rolling restart of the machinedeployment. Not sure if it works after that (not autoscaling usually very often).

oh good, so I’m not a total idiot 😄