terraform-provider-azurerm: AKS node pool k8s version not being updated

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave “+1” or “me too” comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

  • terraform version: 0.12.8
  • azurerm provider version: 1.41

Affected Resource(s)

  • azurerm_kubernetes_cluster

Terraform Configuration Files

resource "azurerm_kubernetes_cluster" "k8s" {
  name                = var.cluster_info.name
  location            = azurerm_resource_group.k8s.location
  dns_prefix          = var.cluster_info.dns_prefix
  resource_group_name = azurerm_resource_group.k8s.name
  kubernetes_version  = var.kubernetes_version

  role_based_access_control {
    enabled = ! local.aks_aad_skip_rbac
    dynamic "azure_active_directory" {
      for_each = "${! local.aks_aad_skip_rbac ? list(local.aks_rbac_setting) : []}"
      content {
        client_app_id     = local.aks_rbac_setting.client_app_id
        server_app_id     = local.aks_rbac_setting.server_app_id
        server_app_secret = local.aks_rbac_setting.server_app_secret
        tenant_id         = local.aks_rbac_setting.tenant_id
      }
    }
  }

  default_node_pool {
    name               = var.agent_pool.name
    node_count         = var.agent_pool.count
    vm_size            = "Standard_DS2_v2"
    type               = "VirtualMachineScaleSets"
    os_disk_size_gb    = 30
    max_pods           = 30
    availability_zones = local.sanitized_availability_zones

    enable_auto_scaling = true
    min_count           = 3
    max_count           = 12
  }

  service_principal {
    client_id     = var.aks_login.client_id
    client_secret = var.aks_login.client_secret
  }

  addon_profile {
    oms_agent {
      enabled                    = true
      log_analytics_workspace_id = azurerm_log_analytics_workspace.k8s_logs.id
    }
  }

  tags = {
    CreatedBy   = var.tags.created_by != "" ? var.tags.created_by : null
    ChangedBy   = var.tags.changed_by != "" ? var.tags.changed_by : null
    Environment = var.tags.environment != "" ? var.tags.environment : null
  }

  lifecycle {
    ignore_changes = [
      role_based_access_control,
      role_based_access_control["azure_active_directory"],
      agent_pool_profile.0.count,
    ]
  }

  network_profile {
    network_plugin    = "azure"
    load_balancer_sku = var.aks_load_balancer_sku
  }
}

Expected Behavior

With the azurerm provider at 1.39, changing var.kubernetes_version updated the kubelet version on our cluster’s node pool (in addition to the AKS Kubernetes version).

Actual Behavior

With the azurerm provider at 1.41, changing var.kubernetes_version updates only the AKS Kubernetes version; the kubelet version on our cluster’s node pool is not updated.

Important Factoids

  • Switching back to the azurerm provider at 1.39 and running terraform apply seems to restore the expected behavior, even when the exact same config was previously applied with 1.41.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 61
  • Comments: 19 (8 by maintainers)

Most upvoted comments

Using the azurerm provider at version “=2.1.0”, I’ve upgraded an azurerm_kubernetes_cluster resource from 1.14 to 1.15. The control plane appears to have upgraded to 1.15, but the VMSS node pool has stayed behind at 1.14.

What is the currently expected behavior for Kubernetes upgrades and node pools? I understand that https://docs.microsoft.com/en-us/azure/aks/use-multiple-node-pools#validation-rules-for-upgrades describes certain validation conditions.

Will the node pool upgrade when it is required to, or can we control this with an input, as @jstevans suggested?

I’d like to know this!

Coupling by default could work, as long as it can be disabled. Upgrading the control plane and the default node pool in one terraform apply could have a large impact on a cluster. I’m commenting to vote for exposing OrchestratorVersion on the default node pool as well as on the node pool resource.
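As a rough illustration of what that could look like (a sketch only, not current provider syntax when this was written: the orchestrator_version argument is the requested addition, and the variable and resource names are hypothetical), the node pool version would be pinned separately from the control plane version:

# Hypothetical sketch: pin node pool versions independently of the control plane.
resource "azurerm_kubernetes_cluster" "example" {
  name                = "example-aks"
  location            = "westeurope"
  resource_group_name = "example-rg"
  dns_prefix          = "example"
  kubernetes_version  = var.control_plane_version # control plane only

  default_node_pool {
    name                 = "default"
    vm_size              = "Standard_DS2_v2"
    node_count           = 3
    orchestrator_version = var.node_pool_version # proposed: upgrades only this pool
  }

  service_principal {
    client_id     = var.aks_login.client_id
    client_secret = var.aks_login.client_secret
  }
}

resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  name                  = "extra"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.example.id
  vm_size               = "Standard_DS2_v2"
  node_count            = 3
  orchestrator_version  = var.node_pool_version # proposed
}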

I would rather have OrchestratorVersion exposed ASAP, even without the proper locking, if it means I have to control it myself. My current plan is to use the REST API directly to upgrade node pools: https://docs.microsoft.com/en-us/rest/api/aks/agentpools/createorupdate

I’m available (with Azure resources as well) to test potential patches, and I can write Go, but I lack experience with Terraform internals. Let me know how I can help.
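In the meantime, one way to drive that upgrade from the same Terraform run is sketched below (not from this thread; it assumes the null provider is available and an authenticated Azure CLI is installed where Terraform runs, and it reuses the resource names from the config in the report). az aks nodepool upgrade is the CLI front end for the agent pool API linked above.

# Sketch of an interim workaround: upgrade the default node pool out of band
# whenever var.kubernetes_version changes. Requires the null provider and an
# authenticated az CLI; names are taken from the configuration above.
resource "null_resource" "upgrade_default_node_pool" {
  triggers = {
    kubernetes_version = var.kubernetes_version
  }

  provisioner "local-exec" {
    command = "az aks nodepool upgrade --resource-group ${azurerm_resource_group.k8s.name} --cluster-name ${azurerm_kubernetes_cluster.k8s.name} --name ${var.agent_pool.name} --kubernetes-version ${var.kubernetes_version}"
  }

  depends_on = [azurerm_kubernetes_cluster.k8s]
}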

The rules governing version skew between the control plane and agent pools are defined in this public document; there is a window of drift you are allowed to have between the control plane and each agent pool: https://docs.microsoft.com/en-us/azure/aks/use-multiple-node-pools#upgrade-a-cluster-control-plane-with-multiple-node-pools

Rules for valid node pool upgrade versions:

  • The node pool version must have the same major version as the control plane.
  • The node pool minor version must be within two minor versions of the control plane version.
  • The node pool version cannot be greater than the control plane major.minor.patch version.
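For illustration (version numbers are hypothetical): with the control plane at 1.17.3, node pools at 1.17.0 through 1.17.3, 1.16.x, or 1.15.x satisfy these rules, while 1.14.x is rejected (more than two minor versions behind) and 1.17.4 or 1.18.x are rejected (newer than the control plane).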

Hope this helps.

@tombuildsstuff 🆙 any info on the above?