cluster-api-provider-azure: AgentPool cannot be updated while in a non-terminal state, causing the MachinePool READY status to flip in clusterctl

What steps did you take and what happened: I created a CAPI cluster with clusterctl, using a kind cluster for the bootstrap, and created an Azure managed (AKS) cluster with two MachinePools. When I run kubectl get cluster <cluster_name>, the cluster shows as Provisioned. I am able to dump the kubeconfig and run commands against the newly created cluster, so the workload cluster itself works. However, when I run watch clusterctl describe cluster <cluster_name>, the READY state for the MachinePool flips between True and False. While the Ready state is False, there are no related error logs in the capz-controller-manager and capi-controller-manager pods. Describing the MachinePool object shows the following:

Status:
  Available Replicas:  2
  Bootstrap Ready:     true
  Conditions:
    Last Transition Time:  2023-01-04T01:19:31Z
    Message:               agentpools failed to create or update. err: failed to get desired parameters for resource cluster-sample/pool1 (service: agentpools): Unable to update existing agent pool in non terminal state. Agent pool must be in one of the following provisioning states: Canceled, Failed, or Succeeded. Actual state: Updating. Object will be requeued after 20s
    Reason:                Failed
    Severity:              Error
    Status:                False
    Type:                  Ready
    Last Transition Time:  2023-01-03T22:25:00Z
    Status:                True
    Type:                  BootstrapReady
    Last Transition Time:  2023-01-04T01:19:31Z
    Message:               agentpools failed to create or update. err: failed to get desired parameters for resource cluster-sample/pool1 (service: agentpools): Unable to update existing agent pool in non terminal state. Agent pool must be in one of the following provisioning states: Canceled, Failed, or Succeeded. Actual state: Updating. Object will be requeued after 20s
    Reason:                Failed
    Severity:              Error
    Status:                False
    Type:                  InfrastructureReady
    Last Transition Time:  2023-01-03T22:25:00Z
    Status:                True
    Type:                  ReplicasReady
  Infrastructure Ready:    false
  Node Refs:
    Name:               aks-pool1-86783568-vmss000000
    UID:                a65af963-e4f7-47d5-b96d-c33d5572a8e9
    Name:               aks-pool1-86783568-vmss000001
    UID:                0e2077cb-1507-4ebe-aef4-b2113769b5ed
  Observed Generation:  2
  Phase:                Provisioned
  Ready Replicas:       2
  Replicas:             2
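
For context, the "agentpools failed to create or update" error above comes from the CAPZ agent pool service: AKS only accepts an update while the agent pool's provisioningState is terminal, and any in-flight operation (such as "Updating") is rejected until it finishes. A minimal sketch of that gate, under stated assumptions (terminalStates, AgentPoolParameters, and buildAgentPoolParameters are hypothetical names for illustration, not CAPZ's actual identifiers):

```go
package agentpools

import "fmt"

// terminalStates mirrors the provisioning states named in the error
// message above; only in these states will ARM accept an update.
var terminalStates = map[string]bool{
	"Canceled":  true,
	"Failed":    true,
	"Succeeded": true,
}

// AgentPoolParameters is a hypothetical stand-in for the desired-state
// spec the service would compute for the agent pool.
type AgentPoolParameters struct {
	Name string
	// ...desired fields elided...
}

// buildAgentPoolParameters refuses to compute update parameters while
// the pool is in a non-terminal state, producing the error seen in the
// MachinePool conditions; the reconciler then requeues (the "Object
// will be requeued after 20s" part of the message).
func buildAgentPoolParameters(name, provisioningState string) (*AgentPoolParameters, error) {
	if !terminalStates[provisioningState] {
		return nil, fmt.Errorf("Unable to update existing agent pool in non terminal state. "+
			"Agent pool must be in one of the following provisioning states: Canceled, Failed, or Succeeded. "+
			"Actual state: %s", provisioningState)
	}
	return &AgentPoolParameters{Name: name}, nil
}
```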

What did you expect to happen: The clusterctl command should report the correct status. There is nothing wrong with the newly created cluster, so the MachinePool status should report True and not flip between True and False at random intervals.
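
One way to get that behavior would be for the controller to treat a non-terminal provisioning state as an expected, transient condition rather than a failure: requeue and retry without flipping the Ready condition to False. A minimal controller-runtime sketch of that idea, where transientStateError, the pared-down AzureManagedMachinePoolReconciler, and the 20-second delay are illustrative assumptions, not the actual CAPZ code:

```go
package controllers

import (
	"context"
	"errors"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// transientStateError is a hypothetical marker for "the agent pool is
// busy, try again later", as opposed to a genuine reconcile failure.
type transientStateError struct{ state string }

func (e *transientStateError) Error() string {
	return "agent pool in non-terminal state: " + e.state
}

// AzureManagedMachinePoolReconciler is a pared-down stand-in for the
// real reconciler; its fields (client, scheme, ...) are elided.
type AzureManagedMachinePoolReconciler struct{}

// reconcileNormal stands in for the actual reconcile logic, which would
// return a *transientStateError while the pool reports e.g. "Updating".
func (r *AzureManagedMachinePoolReconciler) reconcileNormal(ctx context.Context, req ctrl.Request) error {
	return nil
}

// Reconcile requeues quietly on a transient state instead of marking
// the Ready condition False, so the status does not flap while AKS
// finishes an in-flight agent pool update.
func (r *AzureManagedMachinePoolReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	err := r.reconcileNormal(ctx, req)
	var transient *transientStateError
	if errors.As(err, &transient) {
		// Expected while AKS is mid-update: retry later, leave Ready alone.
		return ctrl.Result{RequeueAfter: 20 * time.Second}, nil
	}
	if err != nil {
		// A genuine failure: surface it so conditions are set to False.
		return ctrl.Result{}, err
	}
	return ctrl.Result{}, nil
}
```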


Environment:

  • Cluster API version: 1.6.0
  • minikube/kind version: v0.17.0
  • Kubernetes version (kubectl version): v1.23.12
  • OS (/etc/os-release): CentOS Linux 7 (Core)

/kind bug
area/clusterctl, area/bootstrap

About this issue

  • State: closed
  • Created a year ago
  • Comments: 28 (21 by maintainers)

Most upvoted comments

cc @CecileRobertMichon (might make sense to move the issue to the CAPZ repo)