cluster-api: CAPI: workload cluster promoted to management cluster doesn't work

What steps did you take and what happened: Following the steps described on https://cluster-api.sigs.k8s.io/clusterctl/commands/move.html:

Part 1:

  • Create an initial (/bootstrap) cluster with Kind
  • Install the provider components (Docker) into this bootstrap cluster
  • Provision a new ‘master’ management cluster using CAPI
  • Get this ‘master’ management cluster up and running correctly (also Docker provider components)
  • Output of clusterctl describe:
<font color="#4E9A06">robin@robin-GL552VW</font>:<font color="#3465A4">/mnt/files/Documents/School/Thesis/Repo/thesis_ugent/system_setup</font>$ clusterctl describe cluster master-cluster
NAME                                                               READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/master-cluster                                             <font color="#4E9A06">True</font>                     2m12s         
<font color="#555753">├─</font>ClusterInfrastructure - <font color="#555753">DockerCluster/</font><font color="#555753">master-cluster</font>             <font color="#4E9A06">True</font>                     2m48s         
<font color="#555753">├─</font>ControlPlane - <font color="#555753">KubeadmControlPlane/</font><font color="#555753">master-cluster-control-plane</font>  <font color="#4E9A06">True</font>                     2m12s         
<font color="#555753">│ └─</font>Machine/master-cluster-control-plane-xngls                     <font color="#4E9A06">True</font>                     2m13s         
<font color="#555753">└─</font>Workers                                                                                                 
<font color="#555753">  └─</font>MachineDeployment/master-cluster-md-0                          <font color="#4E9A06">True</font>                     15s           
<font color="#555753">    └─</font>Machine/master-cluster-md-0-6968889875-n7bmm                 <font color="#4E9A06">True</font>                     105s          

Part 2:

  • Use clusterctl init on this ‘master’ management cluster
  • Output of clusterctl describe:
<font color="#4E9A06">robin@robin-GL552VW</font>:<font color="#3465A4">/mnt/files/Documents/School/Thesis/Repo/thesis_ugent/system_setup</font>$ clusterctl describe cluster master-cluster
NAME                                                               READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/master-cluster                                             <font color="#4E9A06">True</font>                     8m            
<font color="#555753">├─</font>ClusterInfrastructure - <font color="#555753">DockerCluster/</font><font color="#555753">master-cluster</font>             <font color="#4E9A06">True</font>                     8m36s         
<font color="#555753">├─</font>ControlPlane - <font color="#555753">KubeadmControlPlane/</font><font color="#555753">master-cluster-control-plane</font>  <font color="#4E9A06">True</font>                     8m            
<font color="#555753">│ └─</font>Machine/master-cluster-control-plane-xngls                     <font color="#4E9A06">True</font>                     8m1s          
<font color="#555753">└─</font>Workers                                                                                                 
<font color="#555753">  └─</font>MachineDeployment/master-cluster-md-0                          <font color="#4E9A06">True</font>                     6m3s          
<font color="#555753">    └─</font>Machine/master-cluster-md-0-6968889875-n7bmm                 <font color="#4E9A06">True</font>                     7m33s         
  • Use clusterctl move to make this newly created ‘master’ cluster my main management cluster:
<font color="#4E9A06">robin@robin-GL552VW</font>:<font color="#3465A4">/mnt/files/Documents/School/Thesis/Repo/thesis_ugent/system_setup</font>$ clusterctl move --to-kubeconfig master-cluster.kubeconfig
Performing move...
Discovering Cluster API objects
Moving Cluster API objects Clusters=1
Moving Cluster API objects ClusterClasses=0
Creating objects in the target cluster
Deleting objects from the source cluster
  • The ‘master’ cluster is still operating, but can’t create any new resources. The output of clusterctl describe has also changed (clusterctl describe now uses the kubeconfig of the new ‘master’ cluster to communicate with it).
<font color="#4E9A06">robin@robin-GL552VW</font>:<font color="#3465A4">/mnt/files/Documents/School/Thesis/Repo/thesis_ugent/system_setup</font>$ KUBECONFIG=$(pwd)/master-cluster.kubeconfig clusterctl describe cluster master-cluster
NAME                                                               READY  SEVERITY  REASON                           SINCE  MESSAGE
Cluster/master-cluster                                             <font color="#D3D7CF">False</font>  <font color="#D3D7CF">Info</font>      <font color="#D3D7CF">WaitingForControlPlane</font>           74s           
<font color="#555753">├─</font>ClusterInfrastructure - <font color="#555753">DockerCluster/</font><font color="#555753">master-cluster</font>                                                                             
<font color="#555753">├─</font>ControlPlane - <font color="#555753">KubeadmControlPlane/</font><font color="#555753">master-cluster-control-plane</font>                                                                  
<font color="#555753">│ └─</font>Machine/master-cluster-control-plane-xngls                     <font color="#D3D7CF">False</font>  <font color="#D3D7CF">Info</font>      <font color="#D3D7CF">WaitingForClusterInfrastructure</font>  74s           
<font color="#555753">└─</font>Workers                                                                                                                          
<font color="#555753">  └─</font>MachineDeployment/master-cluster-md-0                          <font color="#4E9A06">True</font>                                              74s           
<font color="#555753">    └─</font>Machine/master-cluster-md-0-6968889875-n7bmm                 <font color="#D3D7CF">False</font>  <font color="#D3D7CF">Info</font>      <font color="#D3D7CF">WaitingForClusterInfrastructure</font>  74s           
  • No new resources get provisioned for new machine deployments, nor when scaling up existing ones. It’s as if the cluster keeps working but simply can’t expand (see the diagnostic sketch below). Edit: provisioning a workload cluster from this ‘master’ management cluster also fails; no new clusters can be provisioned.
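One thing worth checking at this point (a diagnostic sketch; capd-system, capd-controller-manager and the manager container name are the defaults the Docker provider installs with, not something taken from the outputs above) is whether the CAPD controller inside the new ‘master’ management cluster can actually reach the Docker daemon:

# Check that the provider pods are running in the moved-to cluster
KUBECONFIG=$(pwd)/master-cluster.kubeconfig kubectl -n capd-system get pods

# Look for errors about the Docker socket/daemon in the CAPD controller logs
KUBECONFIG=$(pwd)/master-cluster.kubeconfig kubectl -n capd-system logs deployment/capd-controller-manager -c manager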

What did you expect to happen: The output of clusterctl describe to be the same as before the clusterctl move command, and new resources to be created when adding machine deployments or scaling up existing ones.

Anything else you would like to add:

Edit: a lot of information is provided in the comments below. Look for the messages where I provide an extensive description of an approach, the results, and the command outputs.

kind-management-cluster.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: management-cluster
nodes:
- role: control-plane
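  # Mount the host Docker socket so the CAPD (Docker infrastructure) provider running in this Kind cluster can manage workload cluster containers on the host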
  extraMounts:
    - hostPath: /var/run/docker.sock
      containerPath: /var/run/docker.sock
master-cluster.yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: master-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    serviceDomain: cluster.local
    services:
      cidrBlocks:
      - 10.128.0.0/12
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: master-cluster-control-plane
    namespace: default
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerCluster
    name: master-cluster
    namespace: default
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerCluster
metadata:
  name: master-cluster
  namespace: default
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: master-cluster-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        certSANs:
        - localhost
        - 127.0.0.1
        - 0.0.0.0
      controllerManager:
        extraArgs:
          enable-hostpath-provisioner: "true"
    initConfiguration:
      nodeRegistration:
        criSocket: /var/run/containerd/containerd.sock
        kubeletExtraArgs:
          cgroup-driver: cgroupfs
          eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
    joinConfiguration:
      nodeRegistration:
        criSocket: /var/run/containerd/containerd.sock
        kubeletExtraArgs:
          cgroup-driver: cgroupfs
          eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerMachineTemplate
      name: master-cluster-control-plane
      namespace: default
  replicas: 1
  version: v1.21.1
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: master-cluster-control-plane
  namespace: default
spec:
  template:
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: master-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: master-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            cgroup-driver: cgroupfs
            eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: master-cluster-md-0
  namespace: default
spec:
  clusterName: master-cluster
  replicas: 1
  selector:
    matchLabels: null
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: master-cluster-md-0
          namespace: default
      clusterName: master-cluster
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: master-cluster-md-0
        namespace: default
      version: v1.21.1

Environment:

  • Cluster-api version: GitVersion:"v1.1.3"
  • KIND version: kind v0.11.1 go1.17.4 linux/amd64
  • Kubernetes version: (use kubectl version):
    • Client: GitVersion:"v1.22.3"
    • Server: both the Kind cluster and the ‘master’ cluster run v1.21.1
  • Docker version: 20.10.13
  • OS (e.g. from /etc/os-release): Ubuntu 20.04.4 LTS

/kind bug /area provider/docker

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 34 (34 by maintainers)

Most upvoted comments

I’m here with probably the last update: it finally works!

I consulted with @VerwaerdeWim, and he managed to find a solution. In the meantime the pull request (https://github.com/kubernetes-sigs/cluster-api/pull/6473) has already been approved, and the changes will be available in the next Cluster API release. I’ll list the cause of the problem and the solution below, followed by some additional comments.

Explanation

There are two parts to the cause:

1) Access to Docker daemon

Firstly, there is no access to the Docker daemon from inside the capd-controller-manager. This was already mentioned in https://github.com/kubernetes-sigs/cluster-api/issues/6321#issuecomment-1110540975, where it was suggested to add the mounts to the DockerMachineTemplate of the worker nodes instead of only to the control-plane nodes.

    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock

As mentioned in https://github.com/kubernetes-sigs/cluster-api/issues/6321#issuecomment-1112352383 this got added to the templates through pull request https://github.com/kubernetes-sigs/cluster-api/pull/6460/commits/9fc3960df1e725242f15d6552203b76df474220a.

However, when adding these two lines to the YAML files generated by clusterctl generate, the setup fails even earlier. I already mentioned this in the second half of https://github.com/kubernetes-sigs/cluster-api/issues/6321#issuecomment-1113270985 .

2) Setting the default container runtime

The reason for this is mostly explained in the pull request https://github.com/kubernetes-sigs/cluster-api/pull/6473:

We have to set the criSocket to containerd as kubeadm defaults to docker runtime if both containerd and docker sockets are found

So by mounting the Docker socket into the container, kubeadm defaults to the Docker runtime, which is not what we want because the nodes actually use containerd. To fix this, the KubeadmConfigTemplate should be extended with the following line:

          criSocket: unix:///var/run/containerd/containerd.sock

This line could already be found in the templates for the control plane, but not for the worker nodes.

Example fix for YAML files generated with clusterctl generate prior to Cluster API v1.1.4

Add the Docker socket mount to the worker nodes:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  name: master-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock

Set the criSocket:

apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: master-cluster-md-0
  namespace: default
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          # We have to set the criSocket to containerd as kubeadm defaults to docker runtime if both containerd and docker sockets are found
          criSocket: /var/run/containerd/containerd.sock
          kubeletExtraArgs:
            cgroup-driver: cgroupfs
            eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
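
After applying both changes and rolling out new worker machines, one quick way to double-check which CRI socket kubeadm actually recorded on each node is to look at the node annotations (a verification sketch; kubeadm stores the socket in the kubeadm.alpha.kubernetes.io/cri-socket node annotation, nothing here is specific to this issue):

# Print the CRI socket annotation kubeadm wrote on every node of the workload cluster
kubectl --kubeconfig master-cluster.kubeconfig get nodes -o yaml | grep cri-socket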

Other comments

This problem was already discovered for Kubernetes 1.24

In the changelog of Kubernetes 1.24 an entry is made regarding this issue:

Kubeadm: default the kubeadm configuration to the containerd socket (Unix: unix:///var/run/containerd/containerd.sock, Windows: npipe:////./pipe/containerd-containerd) instead of the one for Docker. If the Init|JoinConfiguration.nodeRegistration.criSocket field is empty during cluster creation and multiple sockets are found on the host always throw an error and ask the user to specify which one to use by setting the value in the field. Make sure you update any kubeadm configuration files on disk, to not include the dockershim socket unless you are still using kubelet version < 1.24 with kubeadm >= 1.24. Remove the DockerValidor and ServiceCheck for the docker service from kubeadm preflight. Docker is no longer special cased during host validation and ideally this task should be done in the now external cri-dockerd project where the importance of the compatibility matters. Use crictl for all communication with CRI sockets for actions like pulling images and obtaining a list of running containers instead of using the docker CLI in the case of Docker. (https://github.com/kubernetes/kubernetes/pull/107317, @neolit123)

It wouldn’t be a fix, but would probably result in a clearer error message about the issue.

Using the template

During testing and trying to find a fix for the problem, it was suggested in https://github.com/kubernetes-sigs/cluster-api/issues/6321#issuecomment-1108471073 to use the template from the e2e tests. As there is no documentation on how to use these templates apart from the developer guide, it is hard to make them work. Only after the issue was fixed did I learn from @VerwaerdeWim that knowledge of kubeadm is necessary to know which values to choose for the many variables in the YAML file.

There are no obvious error messages when the variables are not filled in; it simply does not work, as I discovered in https://github.com/kubernetes-sigs/cluster-api/issues/6321#issuecomment-1112284761. Furthermore, there is no listing of these variables, their purpose, or possible examples (e.g. a README listing them all). A brief explanation for some variables can be found by diving into the file, but it is often limited and lacks a concrete example.

Users without a deep understanding of kubeadm (like me) will use Cluster API because it simplifies cluster lifecycle management. If more people use these templates, error resolution will be faster because differences in setup will no longer be a factor.

Thanks

Lastly, I’d like to express my thanks to @fabriziopandini for instantly replying on Slack and GitHub, @chrischdi for offering help on Slack and @sbueringer for helping here on GitHub. Special thanks to @VerwaerdeWim for finding a solution, I already told you I owe you one 😉

@RobinDeBock thanks for the detailed report about your discovery, really appreciated it!

@RobinDeBock WDYT about taking a look together via a Zoom session? I’m not sure how soon I’ll be able to make some time, but you can ping me in the Kubernetes Slack (sbueringer)

@sbueringer let’s add the mount also for workers, it won’t hurt… WRT the problem at stake, have we considered running/debugging our self-hosted E2E test locally?