cluster-api: Creation of workload cluster machines stuck at provisioned status. Cloud-init preflight failing to fetch the ConfigMap

/kind bug

Refiling https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/624 with the Kubeadm Bootstrap provider.

What steps did you take and what happened:

  • Successfully created management cluster.
  • Successfully created workload cluster control plane.
  • Failed to create the workload cluster workers.
  • With 3 replicas, the workload cluster machines do not progress from the provisioned state to the running state.
  • Sometimes one worker machine progresses to the running state, but the remaining two stay in the provisioned state.
  • No issues when starting with 1 replica and then scaling to 3. I have not tried 2 replicas or 4+.

What did you expect to happen: All three worker machines reach the running state.

Anything else you would like to add:

# kubectl get machines -o wide
NAME                                                 PROVIDERID                                       PHASE         NODENAME
keithlee-capi-mgmt-cluster-controlplane-0            vsphere://42055c30-7f18-6d98-d8a5-cd00de95fd77   running       keithlee-capi-mgmt-cluster-controlplane-0
keithlee-workload-cluster-01-controlplane-0          vsphere://42052918-be7a-9e96-0f4a-555150de0f13   running       keithlee-workload-cluster-01-controlplane-0
keithlee-workload-cluster-01-md-0-759c657695-2899z   vsphere://4205ebe7-2636-d0a4-694e-08b19de37f98   provisioned
keithlee-workload-cluster-01-md-0-759c657695-97c7c   vsphere://420577c1-10a5-ea5f-8e94-5b21a318b0bf   provisioned
keithlee-workload-cluster-01-md-0-759c657695-p2vsg   vsphere://4205288a-5565-5a1f-82a9-d67e2d89235c   provisioned
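
For anyone trying to reproduce this, a listing like the one above can be checked for stuck machines with a small script. This is only a sketch: the helper name `stuck_machines` is hypothetical, and it assumes the column order shown above (NAME, PROVIDERID, PHASE, NODENAME) with a non-empty PROVIDERID on every row.

```python
from typing import List

# Sample copied from the `kubectl get machines -o wide` output in this report.
SAMPLE = """\
NAME                                                 PROVIDERID                                       PHASE         NODENAME
keithlee-capi-mgmt-cluster-controlplane-0            vsphere://42055c30-7f18-6d98-d8a5-cd00de95fd77   running       keithlee-capi-mgmt-cluster-controlplane-0
keithlee-workload-cluster-01-controlplane-0          vsphere://42052918-be7a-9e96-0f4a-555150de0f13   running       keithlee-workload-cluster-01-controlplane-0
keithlee-workload-cluster-01-md-0-759c657695-2899z   vsphere://4205ebe7-2636-d0a4-694e-08b19de37f98   provisioned
keithlee-workload-cluster-01-md-0-759c657695-97c7c   vsphere://420577c1-10a5-ea5f-8e94-5b21a318b0bf   provisioned
keithlee-workload-cluster-01-md-0-759c657695-p2vsg   vsphere://4205288a-5565-5a1f-82a9-d67e2d89235c   provisioned
"""


def stuck_machines(table: str, phase: str = "provisioned") -> List[str]:
    """Return the names of machines whose PHASE column equals `phase`.

    Hypothetical helper: relies on whitespace-separated columns, so it
    assumes NAME and PROVIDERID are both present on every row.
    """
    names = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # fields[2] is PHASE; NODENAME (fields[3]) is blank until the node joins.
        if len(fields) >= 3 and fields[2].lower() == phase.lower():
            names.append(fields[0])
    return names


if __name__ == "__main__":
    for name in stuck_machines(SAMPLE):
        print(name)
```

Run against the sample, this prints the three md-0 worker machines and skips both control plane machines.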

Environment:

  • capi manifest: v0.5.2-beta.0
  • clusterctl: v0.2.5
  • kind: v0.5.1
  • docker: 19.0.3
  • vSphere: 6.7U3
  • ova: ubuntu-1804-kube-v1.15.4.ova

capi.log capv.log

cc @KeithRichardLee

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 22 (18 by maintainers)

Most upvoted comments

3 successful runs now complete!

First run using v0.5.2-beta.0-32-gb0cacda1 was a success. All three machines progressed to a running state. Will do another two runs.

I’m wondering if this could be related to the bootstrap token timing out. Which version of CABPK is this against?
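
One rough way to test the token theory: kubeadm bootstrap tokens are stored as Secrets named bootstrap-token-<id> in the kube-system namespace of the workload cluster, each with a base64-encoded `expiration` field holding an RFC 3339 timestamp. A minimal sketch, assuming you have already decoded that value and know roughly when the worker attempted to join (the helper name `token_expired` and the timestamps in the usage lines are made up for illustration):

```python
from datetime import datetime, timezone


def token_expired(expiration: str, join_time: datetime) -> bool:
    """Return True if the token had already expired at `join_time`.

    `expiration` is the decoded RFC 3339 value from the Secret,
    e.g. "2019-10-25T12:00:00Z". `join_time` must be timezone-aware.
    """
    # datetime.fromisoformat() in older Python versions rejects a trailing "Z",
    # so rewrite it as an explicit UTC offset first.
    expires_at = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
    return join_time >= expires_at


if __name__ == "__main__":
    # Illustrative timestamps only; not taken from the attached logs.
    attempt = datetime(2019, 10, 25, 12, 20, tzinfo=timezone.utc)
    print(token_expired("2019-10-25T12:00:00Z", attempt))
```

If the join attempts of the stuck workers fall after the expiration while the one successful worker joined before it, that would point toward the token lifetime rather than the ConfigMap itself.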