terraform-provider-azurerm: Getting an intermittent failed to refresh Bearer token error when trying to delete my AKS cluster

Terraform (and AzureRM Provider) Version

Terraform v0.11.10
AzureRM Provider v1.20.0

Affected Resource(s)

  • azurerm_kubernetes_cluster

Terraform Configuration Files

resource "azurerm_kubernetes_cluster" "aks_cluster" {
  name       = "${var.name}"
  location   = "${var.region}"
  dns_prefix = "${var.name}"

  kubernetes_version  = "${var.kubernetes_version}"
  resource_group_name = "${azurerm_resource_group.aks_resource_group.name}"

  linux_profile {
    admin_username = "xxx"

    ssh_key {
      key_data = "${var.ssh_public_key}"
    }
  }

  agent_pool_profile {
    count = "${var.node_count}"

    name            = "agentpool"
    vm_size         = "${var.vm_size}"
    os_disk_size_gb = "${var.os_disk_size}"
    os_type         = "Linux"
    vnet_subnet_id  = "${azurerm_subnet.private.id}"
    max_pods        = 110
  }

  service_principal {
    client_id     = "${azurerm_azuread_service_principal.service_principal.application_id}"
    client_secret = "${random_string.service_principal_password.result}"
  }

  role_based_access_control {
    enabled = true

    azure_active_directory {
      client_app_id     = "${var.rbac_client_app_id}"
      server_app_id     = "${var.rbac_server_app_id}"
      server_app_secret = "${var.rbac_server_app_secret}"
    }
  }

  network_profile {
    network_plugin = "azure"
  }

  depends_on = [
    "azurerm_azuread_service_principal.service_principal",
    "azurerm_azuread_service_principal_password.password",
  ]

  tags {
    environment = "${var.environment}"
    name        = "${var.name}"
  }
}

Debug Output

Unfortunately this happens intermittently, so I haven’t been able to capture debug output. It started happening after I upgraded to AzureRM Provider v1.20, but I’m not sure whether the two are connected.

Expected Behavior

Running terraform destroy should successfully delete the Terraform-provisioned AKS cluster on the first attempt.

Actual Behavior

Running terraform destroy does not always delete the Terraform-provisioned AKS cluster on the first attempt. A second attempt always succeeds.

The error produced:

Error: Error applying plan:

1 error(s) occurred:

* module.aks_cluster.azurerm_kubernetes_cluster.aks_cluster (destroy): 1 error(s) occurred:

* azurerm_kubernetes_cluster.aks_cluster: Error waiting for the deletion of Managed Kubernetes Cluster "test-westus2" (Resource Group "aks-rg-test-westus2"): azure.BearerAuthorizer#WithAuthorization: 
Failed to refresh the Token for request to https://management.azure.com/subscriptions/<subscription_id>providers/Microsoft.ContainerService/locations/westus2/operations/<id>?api-version=2016-03-30: StatusCode=0 -- 
Original Error: Manually created ServicePrincipalToken does not contain secret material to retrieve a new access token

Steps to Reproduce

Unfortunately, this happens intermittently: running terraform destroy on an AKS cluster sometimes results in the error above.

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 66
  • Comments: 20 (5 by maintainers)

Most upvoted comments

@amasover yeah, we’ve a PR ready to go into the base library to fix this, it’s just waiting on a release of go-autorest which looks like it’s happening soon-ish 😃

I opened a PR to fix this a while back: https://github.com/hashicorp/go-azure-helpers/pull/39 Hopefully, it will get some attention soon.

Update: the PR was closed by the maintainers without being merged.

@ToruMakabe thanks for confirming that. Since this appears to be an issue in the upstream library I’ve created an upstream issue for this: https://github.com/hashicorp/go-azure-helpers/issues/22

As a workaround, if you’re using az login and your individual account, this doesn’t happen with “az login --use-device-code”.
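Since the error text says the token was created manually and carries no secret material, another option may be to authenticate the provider with a service principal and client secret rather than Azure CLI credentials. This is a minimal sketch under that assumption, not a fix confirmed in this thread. The `var.*` names are hypothetical and the syntax matches Terraform 0.11.

```hcl
# Sketch only: configure the AzureRM provider with a service principal
# instead of relying on a token obtained through `az login`.
# All var.* names below are hypothetical placeholders.
provider "azurerm" {
  subscription_id = "${var.subscription_id}"
  tenant_id       = "${var.tenant_id}"
  client_id       = "${var.provider_client_id}"
  client_secret   = "${var.provider_client_secret}"
}
```

The same four values can also be supplied through the ARM_SUBSCRIPTION_ID, ARM_TENANT_ID, ARM_CLIENT_ID and ARM_CLIENT_SECRET environment variables, which keeps the secret out of the configuration files.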

It looks like https://github.com/Azure/go-autorest/pull/476 was just recently merged in, so once it gets incorporated downstream this issue should be fixed.

@tombuildsstuff late reply, but I also use the Azure CLI

(revisiting this issue because I’m still running into this)

The same error also occurs outside of AKS cluster creation and deletion. It seems to occur during long-running plan/apply operations. The following example is from a Resource Group deletion at the end of a long-running apply.

Error: Error applying plan:

1 error(s) occurred:

* azurerm_resource_group.shared (destroy): 1 error(s) occurred:

* azurerm_resource_group.shared: Error deleting Resource Group "myrg": azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to
https://management.azure.com/subscriptions/myid/operationresults/myresult?api-version=2018-05-01: StatusCode=0 -- Original Error: Manually created ServicePrincipalToken does not contain secret material to retrieve a new access token

@katbyte Do you have any advice?