terraform-provider-azurerm: Updates to azurerm_kubernetes_cluster fail when cluster uses managed AAD integration

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave “+1” or “me too” comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.12.26

  • provider.azurerm v2.14.0

Affected Resource(s)

  • azurerm_kubernetes_cluster

Terraform Configuration Files

resource "azurerm_resource_group" "aks" {
  name     = "aks-service-rg"
  location = "northeurope"
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                            = "aks-service"
  location                        = azurerm_resource_group.aks.location
  resource_group_name             = azurerm_resource_group.aks.name
  node_resource_group             = "aks-infra-rg"
  dns_prefix                      = "aks-dev"
  enable_pod_security_policy      = false
  private_cluster_enabled         = false
  api_server_authorized_ip_ranges = null
 
  default_node_pool {
    name            = "default"
    node_count      = 4
    vm_size         = "Standard_B2ms"
    os_disk_size_gb = 30
    vnet_subnet_id  = var.virtual_network.subnets.aks.id
    max_pods        = 60
    type            = "VirtualMachineScaleSets"
  }

  linux_profile {
    admin_username = var.admin_username

    ssh_key {
      key_data = tls_private_key.aks.public_key_openssh
    }
  }
  
  role_based_access_control {
    enabled = true

    azure_active_directory {
      managed                 = true
      admin_group_object_ids  = [for key, value in local.cluster_admins : value.object_id] 
    }
  }

  identity {
    type    = "SystemAssigned"
  }

  addon_profile {

    azure_policy {
      enabled = true
    }
    
    oms_agent {
      enabled                    = true
      log_analytics_workspace_id = var.log_analytics_workspace.id
    }

    kube_dashboard {
      enabled = true
    }

    http_application_routing {
      enabled = false
    }

  }

  network_profile {
    network_plugin     = "azure"
    network_policy     = "azure"
    load_balancer_sku  = "Basic"
    service_cidr       = var.kubernetes_service_cidr
    docker_bridge_cidr = var.docker_bridge_cidr
    dns_service_ip     = cidrhost(var.kubernetes_service_cidr, 2)
  }

  tags = local.tags

}

Debug Output

Panic Output

Expected Behavior

  • Enable feature ‘Microsoft.ContainerService/AAD-V2’ on subscription
  • Apply plan to create cluster with managed Azure Active Directory integration
  • Change value of tags - or any other argument that doesn’t necessitate a replacement of the resource
  • Run terraform plan
  • Apply plan
  • Tags are updated to reflect changes

Actual Behavior

  • Enable feature ‘Microsoft.ContainerService/AAD-V2’ on subscription
  • Apply plan to create cluster with managed Azure Active Directory integration
  • Change value of tags - or any other argument that doesn’t necessitate a replacement of the resource
  • Run terraform plan
  • Apply plan
  • Apply fails with the following error:

Error: updating Managed Kubernetes Cluster AAD Profile in cluster "aks-service" (Resource Group "aks-service-rg"): containerservice.ManagedClustersClient#ResetAADProfile: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="Operation 'resetAADProfile' is not allowed for managed AAD enabled cluster."

Steps to Reproduce

  1. Register feature ‘Microsoft.ContainerService/AAD-V2’ on subscription as per https://docs.microsoft.com/en-us/azure/aks/managed-aad
  2. terraform plan
  3. terraform apply
  4. Make changes to the resource (e.g. a tag value; see the sketch after this list)
  5. terraform plan
  6. terraform apply
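For illustration, a tag-only change like this hypothetical one (the tag name and value are placeholders) is enough to hit the failing code path, since the configuration above sets tags = local.tags:

locals {
  tags = {
    environment = "dev" # hypothetical tag; changing only this value still triggers the ResetAADProfile call on apply
  }
}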

Important Factoids

References

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 35
  • Comments: 35 (5 by maintainers)

Most upvoted comments

I’ve implemented a fix and added Acceptance tests to cover the scenarios in this issue.

If nothing goes wrong it will make the next release! 🎉

@tombuildsstuff or anyone, can we maybe get this into the next release as a fix? Currently it blocks use of the feature entirely, as any update to the cluster triggers this error.

This has been released in version 2.21.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.21.0"
}
# ... other configuration ...

The error message also appears when trying to update the Kubernetes version. There are too many issues to even consider a reliable workaround, so the feature is unusable. Just had to revert to using a service principal, unfortunately.

This error also occurs when modifying other properties of the cluster, such as the max node count on a node pool:

      ~ default_node_pool {
            availability_zones    = []
            enable_auto_scaling   = true
            enable_node_public_ip = false
          ~ max_count             = 3 -> 4
            max_pods              = 30
            min_count             = 3
            name                  = "default"
            node_count            = 3
            node_labels           = {}
            node_taints           = []
            orchestrator_version  = "1.17.7"
            os_disk_size_gb       = 30
            tags                  = {}
            type                  = "VirtualMachineScaleSets"
            vm_size               = "Standard_DS3_v2"
            vnet_subnet_id        = "/subscriptions/xxxxxxxxxxxxxxxxx/resourceGroups/rg-pegaplatform-network-sbox-canadacentral-persistent/providers/Microsoft.Network/virtualNetworks/vnet-pegaplatform-network-sbox-canadacentral/subnets/Private"
        }

....


Plan: 0 to add, 1 to change, 0 to destroy.

error:

Error: updating Managed Kubernetes Cluster AAD Profile in cluster "aks-pegaplatform-sbox-canadacentral" (Resource Group "rg-pegaplatform-sbox-canadacentral-persistent"): containerservice.ManagedClustersClient#ResetAADProfile: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="Operation 'resetAADProfile' is not allowed for managed AAD enabled cluster."

  on main.tf line 141, in resource "azurerm_kubernetes_cluster" "aks_cluster":
 141: resource "azurerm_kubernetes_cluster" "aks_cluster" {

@aristosvo I did just as you wrote; I upgraded to AAD-v2 by registering the feature.

# I registered the AAD-V2 feature
az feature register --name AAD-V2 --namespace Microsoft.ContainerService

# created an AD group for AKS
az ad group create --display-name myAKSAdmin --mail-nickname myAKSAdmin

# added myself to the group
az ad group member add --group myAKSAdmin --member-id $id

# updated the cluster
groupid=$(az ad group show --group myAKSAdmin --query objectId --output tsv)
tenantid=$(az account show --query tenantId --output tsv)
az aks update -g myaks-rg -n myaks-aks --aad-tenant-id $tenantid --aad-admin-group-object-ids $groupid

I somehow thought that Terraform could query whether AAD is used 😃 My mistake.

My configuration now (I had to comment out the SP settings):

role_based_access_control {
  enabled = true

  azure_active_directory {
    managed                = true
    // optional:
    admin_group_object_ids = ["myAKSAdmin_groupID_not_text"]

    #client_app_id     = var.aad_client_app_id
    #server_app_id     = var.aad_server_app_id
    #server_app_secret = var.aad_server_app_secret
    tenant_id          = var.aad_tenant_id
  }
}

@sutinse1 Can you briefly explain what you did before you ended up with the mentioned error?

What I think you did was as follows:

  • Create the azurerm_kubernetes_cluster with the setup from the course:
resource "azurerm_kubernetes_cluster" "demo" {
...
  role_based_access_control {
    enabled = true

    azure_active_directory {
      client_app_id     = var.aad_client_app_id
      server_app_id     = var.aad_server_app_id
      server_app_secret = var.aad_server_app_secret
      tenant_id         = var.aad_tenant_id
    }
  }
...
}
  • You probably upgraded it to AAD-v2 via the command line with az aks update -g myResourceGroup -n myManagedCluster --enable-aad or similar.
  • You reapplied the old configuration with Terraform.

If not, I’m very curious how your configuration ended up in the state with the error 😄

EDIT: This is working fine now, it was my faulty configuration. Thanks aristosvo!

So I added to my main.tf as instructed:

  managed                 = true
  // optional:
  admin_group_object_ids  = ["myAksAdminId_NOT_group_name"]
  # these have to be commented out
  #client_app_id     = var.aad_client_app_id
  #server_app_id     = var.aad_server_app_id
  #server_app_secret = var.aad_server_app_secret
  tenant_id         = var.aad_tenant_id

WORKED!

I still get the error about ResetAADProfile, although I use the v2.21.0 azurerm provider.

Error: updating Managed Kubernetes Cluster AAD Profile in cluster "sutinenseaks-aks" (Resource Group "sutinenseaks-rg"): containerservice.ManagedClustersClient#ResetAADProfile: Failure sending request: StatusCode=400 -- Original Error: Code="BadRequest" Message="Operation 'resetAADProfile' is not allowed for managed AAD enabled cluster."

  on main.tf line 45, in resource "azurerm_kubernetes_cluster" "demo":
  45: resource "azurerm_kubernetes_cluster" "demo" {

I upgraded the azurerm provider to 2.21.0 with terraform init -upgrade.

Also upgraded the kubernetes provider 1.11.1 -> 1.12.0, still not working.

terraform version Terraform v0.13.0

  • provider registry.terraform.io/hashicorp/azurerm v2.21.0
  • provider registry.terraform.io/hashicorp/github v2.4.1
  • provider registry.terraform.io/hashicorp/kubernetes v1.12.0
  • provider registry.terraform.io/hashicorp/tls v2.1.0

My attempt followed this tutorial: https://github.com/Azure/sg-aks-workshop

@tkinz27 you're talking about two different things here. The managed AAD integration this issue refers to is about being able to log in to the cluster for admin work as an AAD user; it has nothing to do with the cluster's access to other resources.

Using a managed identity for the cluster identity creates a user-assigned managed identity, whose name you can retrieve via the "user_assigned_identity_id" attribute of the "kubelet_identity" block. You would then grant this managed identity access to ACR, for example as in the sketch below.
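As a hedged sketch of that last step (the resource names azurerm_container_registry.acr and azurerm_kubernetes_cluster.aks are assumptions, not taken from this issue), the kubelet identity can be granted pull access with a role assignment:

resource "azurerm_role_assignment" "kubelet_acr_pull" {
  # grants the cluster's kubelet managed identity pull access to an existing ACR
  scope                = azurerm_container_registry.acr.id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_kubernetes_cluster.aks.kubelet_identity[0].object_id
}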

Upgrading the provider to version 2.21.0 works 😃

resetAADProfile with API version 2020-06-01 seems to support enableAzureRBAC: https://docs.microsoft.com/en-us/rest/api/aks/managedclusters/resetaadprofile#request-body

So I guess this could be fixed by using the new API version.

A week late on this buuutt… a colleague and I had the same error yesterday. We noticed you could update the RBAC details via the CLI, so for anyone who wants a workaround while this is being looked at: we deleted the AKS cluster and set the role_based_access_control block to

role_based_access_control {
  enabled = true

  azure_active_directory {
    managed = true
  }
}

then created a null_resource that updates the managed admin group IDs

resource "null_resource" "update_admin_group_ids" {
  depends_on = [
    azurerm_kubernetes_cluster.aks
  ]
  provisioner "local-exec" {
    command = <<EOT
      # --update ids
      az aks update -g <resource_group> -n <name> --aad-tenant-id <tenant_id> --aad-admin-group-object-ids <admin_group_ids>
   EOT
  }
}

However, you'll also need an ignore_changes for the RBAC block inside the azurerm_kubernetes_cluster resource:

lifecycle {
  ignore_changes = [
    role_based_access_control
  ]
}

az version: 2.8
azurerm provider version: 2.15

EDIT: if tags change, it still raises the resetAADProfile error. You can add tags to the ignore_changes list if that works for you, but then obviously you can't update tags at all (a big disadvantage). Unfortunately, there is no tags option for az aks update either. Investigating using az resource tag instead.
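A rough, untested sketch of that idea (assumptions: it reuses the azurerm_kubernetes_cluster.aks name from the workaround above, uses hypothetical tag values, and accepts that az resource tag replaces all existing tags on the resource by default):

resource "null_resource" "update_tags" {
  depends_on = [azurerm_kubernetes_cluster.aks]

  triggers = {
    tags = jsonencode(local.tags) # re-run whenever the desired tags change
  }

  provisioner "local-exec" {
    # hypothetical tag values; by default this replaces all tags on the resource
    command = "az resource tag --ids ${azurerm_kubernetes_cluster.aks.id} --tags environment=dev"
  }
}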