source-controller: Helm OCI repository - Failing to get credential from azure

Hello, I have created a HelmRepository of type OCI like this:

apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: myhelmrepo
spec:
  type: oci
  provider: azure
  interval: 10m
  url: oci://myhelmrepo.azurecr.io/helm
  timeout: 60s

I get the following error:

$ kubectl -n flux get helmrepository myhelmrepo
myhelmrepo            oci://myhelmrepo.azurecr.io/helm             17h    False   failed to get credential from azure: DefaultAzureCredential: failed to acquire a token....

The underlying Kubernetes cluster is an AKS cluster, and the managed identity assigned to the kubelet has access to the whole resource group where my registry is stored. There are other container registries in this resource group with standard Docker images, and the cluster is able to pull those images just fine.

Am I missing something?

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (7 by maintainers)

Most upvoted comments

Hi all, @makkes @souleb @masterphenix

I know this issue is already closed, but we are currently running into the same problem and I thought it might be helpful to share my findings.

The root cause appears to be the "Multiple user assigned identities exist, please specify the clientId" error returned in:

{"error":"invalid_request","error_description":"Multiple user assigned identities exist, please specify the clientId / resourceId of the identity in the token request"}

With a system-assigned managed identity you can only have one identity. With user-assigned managed identities (UAI), up to 20 are possible. In our case, 6 UAIs (node pool identity, Azure Key Vault integration, Azure Policy integration, Azure Monitor integration, AKS GitOps extension (based on Flux) …) are attached to the AKS nodes.

In this case, you will have to tell Azure which one to use.

The .NET SDK documentation provides a bit more detail on this (it is not mentioned in the Go SDK): https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/identity/Azure.Identity/README.md#specify-a-user-assigned-managed-identity-with-defaultazurecredential

In my opinion, this could be done with the following options:

Based on my current understanding this should fix the issue.

Short follow-up on AAD Pod Identity: it is deprecated. The successor is Azure AD Workload Identity (https://azure.github.io/azure-workload-identity/docs/).

Happy to discuss this further.

Update: I did a quick POC to verify this and I was able to get it working by adding the above environment variable to the source controller deployment.
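The list of options referred to above did not survive in this copy of the thread, so the variable is not named here. As a sketch only, assuming it is AZURE_CLIENT_ID (which the Azure SDK's DefaultAzureCredential reads to select a user-assigned managed identity), a patch for the source-controller Deployment could be generated like this. The helper name client_id_patch, the container name manager, and the placeholder client ID are assumptions for illustration:

```shell
# Hypothetical helper: prints a strategic-merge patch that sets AZURE_CLIENT_ID
# on the source-controller container.
# Assumption: the container is named "manager"; check your Deployment first.
client_id_patch() {
  cat <<EOF
spec:
  template:
    spec:
      containers:
        - name: manager
          env:
            - name: AZURE_CLIENT_ID
              value: "$1"
EOF
}

# Possible usage (namespace "flux" as in the original report; the client ID is a placeholder):
# kubectl -n flux patch deployment source-controller \
#   --patch "$(client_id_patch "00000000-0000-0000-0000-000000000000")"
```

If the controllers are managed by Flux itself, an in-cluster patch like this may be reverted on the next reconciliation; in that case the same env entry would probably belong in the Kustomize patches of the bootstrap repository instead.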

Thank you kindly for your investigation; it does help narrow down the issue. Following the mitigation provided, I ran this on the node:

$ curl 'http://169.254.169.254/metadata/identity/oauth2/token?resource=https://management.core.windows.net&api-version=2018-02-01' -H "Metadata: true"

The response suggests that the issue is linked to the node having several UAIs assigned to it:

{"error":"invalid_request","error_description":"Multiple user assigned identities exist, please specify the clientId / resourceId of the identity in the token request"}
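To check this diagnosis from the node, the same request can be repeated with one identity pinned. This is a minimal sketch, assuming the IMDS token endpoint's optional client_id query parameter; the helper name imds_token_url and the client ID value are placeholders:

```shell
# Hypothetical helper: builds the IMDS token URL used above and, when a
# client ID is given, appends client_id so the request names one identity.
imds_token_url() {
  client_id="$1"
  url='http://169.254.169.254/metadata/identity/oauth2/token?resource=https://management.core.windows.net&api-version=2018-02-01'
  if [ -n "$client_id" ]; then
    url="${url}&client_id=${client_id}"
  fi
  printf '%s\n' "$url"
}

# Possible usage on the node (replace the placeholder with the kubelet identity's client ID):
# curl -s -H "Metadata: true" "$(imds_token_url "00000000-0000-0000-0000-000000000000")"
```

If the request succeeds once a client ID is supplied, that would support the reading that the failure comes from the ambiguity between identities rather than from missing permissions.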

I thought it was due to the aci_connector that was active on this cluster, which has its own UAI, but I have disabled it and I still get the same error. For reference, below is the aci_connector Terraform config that we use:

  addon_profile {
    aci_connector_linux {
      enabled     = true
      subnet_name = var.subnet
    }
  }

Now, we also have AAD Pod Identity deployed for other workloads. It seems that when AAD Pod Identity is present, the kubelet identity cannot be used by default, because the call to “DefaultAzureCredential” does not default to the kubelet identity first. I will try using AAD Pod Identity, but I was hoping to avoid that and just use the kubelet identity instead.