source-controller: Helm OCI repository - Failing to get credential from azure
Hello, I have created a HelmRepository of type OCI like this:
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: myhelmrepo
spec:
  type: oci
  provider: azure
  interval: 10m
  url: oci://myhelmrepo.azurecr.io/helm
  timeout: 60s
I get the following error:
$ kubectl -n flux get helmrepository myhelmrepo
myhelmrepo oci://myhelmrepo.azurecr.io/helm 17h False failed to get credential from azure: DefaultAzureCredential: failed to acquire a token....
The underlying Kubernetes cluster is an AKS cluster, and the managed identity assigned to the kubelet has access to the whole resource group where my registry is stored. There are other container registries in this resource group with standard Docker images, and the cluster is able to pull those images just fine.
Am I missing something?
About this issue
- State: closed
- Created 2 years ago
- Comments: 15 (7 by maintainers)
Hi all, @makkes @souleb @masterphenix
I know this issue is already closed but we are currently running into the same issues mentioned and thought it might be helpful to share my findings.
The root cause of this should be the "Multiple user assigned identities exist, please specify the clientId" error reported in this issue.
With a System-assigned Managed Identity you can only have one identity. With User-assigned Managed Identities (UAI), up to 20 are possible. In our case, 6 UAIs (node pool identity, Azure Key Vault integration, Azure Policy integration, Azure Monitor integration, AKS GitOps extension (based on Flux), …) are attached to the AKS nodes.
In this case, you will have to tell Azure which one to use.
The .NET SDK documentation provides a bit more detail on this (it's not mentioned in the Go SDK): https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/identity/Azure.Identity/README.md#specify-a-user-assigned-managed-identity-with-defaultazurecredential
In my opinion, this could be done with the following options:
- AZURE_CLIENT_ID. More details: https://github.com/Azure/azure-sdk-for-go/tree/main/sdk/azidentity#specify-a-user-assigned-managed-identity-for-defaultazurecredential
- /etc/kubernetes/azure.json (security-wise this might not be the best idea)
Based on my current understanding, this should fix the issue.
Short follow-up on Azure AAD Pod Identity: This is deprecated. The successor is Azure AD Workload Identity (https://azure.github.io/azure-workload-identity/docs/)
Happy to discuss this further.
Update: I did a quick POC to verify this and I was able to get it working by adding the above environment variable to the source controller deployment.
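For reference, here is a minimal sketch of what adding that environment variable could look like as a strategic-merge patch on the source-controller Deployment. It is not the exact change used in the POC. The namespace `flux` is taken from the `kubectl` command in the original question, the container name `manager` is assumed to match the default Flux manifests, and the client ID value is a placeholder for the user-assigned identity that has access to the registry.

```yaml
# Sketch only: strategic-merge patch for the source-controller Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: source-controller
  namespace: flux            # assumption: namespace from the kubectl command in the question
spec:
  template:
    spec:
      containers:
        - name: manager      # assumption: container name used by the default Flux manifests
          env:
            # Tells DefaultAzureCredential which user-assigned identity to request
            # when several identities are attached to the node.
            - name: AZURE_CLIENT_ID
              value: "<client-id-of-the-identity-to-use>"   # placeholder, not a real ID
```

If the cluster was bootstrapped with Flux, a patch like this would typically be applied through the Kustomization that manages the Flux components, so that it is not overwritten on the next reconciliation.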
Thank you kindly for your investigation; it does help narrow down the issue. Following the mitigation provided, I executed this on the node:
The response suggests that the issue is linked to the node having several UAIs assigned to it:
I thought it was due to the aci_connector that was active on this cluster, and which has its own UAI, but I have disabled it, and I still have the same error. Below is the aci_connector terraform config that we use, by the way:
Now, we also have AAD Pod Identity deployed for other workloads. So it seems that when AAD Pod Identity is present, the kubelet identity cannot be used by default, because the call to "DefaultAzureCredential" does not default to the kubelet identity first. I will try using AAD Pod Identity, but I was hoping to avoid that and just use the kubelet identity instead.