azure-sdk-for-net: [BUG] Azure.Identity 1.5.0 freezes up ChainedTokenCredential with ManagedIdentityCredential listed first in local dev
Describe the bug
Immediately after upgrading Azure.Identity from 1.4.1 to 1.5.0, I noticed all my web projects freeze up at startup in local dev (VS Kestrel).
In my host builder inside Program.cs, I have
var tokenCred = new ChainedTokenCredential(new ManagedIdentityCredential(), new AzureCliCredential());
var secretClient = new SecretClient(
new Uri($"https://my-keyvault.vault.azure.net/"),
tokenCred);
var certificatesClient = new CertificateClient(
new Uri($"https://my-keyvault.vault.azure.net/"),
tokenCred);
config.AddAzureKeyVault(secretClient, new KeyVaultSecretManager());
...//load some necessary secret/certs etc
Note that I use ChainedTokenCredential with ManagedIdentityCredential listed first, followed by AzureCliCredential. This would ensure that when the project runs in Azure, managed identity is immediately used. In local dev, managed identity is attempted first which would fail quickly, then AzureCliCredential is successfully used next.
Expected behavior
Normally, the ManagedIdentityCredential should fail quickly (within a second or so) when running in local dev environment, which allows the chained credential to fall through to the next available credential.
Actual behavior
Something changed in Azure.Identity 1.5.0, which makes the program freeze up at ManagedIdentityCredential in local dev for a minute+. No exception/error messages (except Kestrel would time out, saying host is unable to start). But eventually, AzureCliCredential hits and code flows through. Maybe the timeout on ManagedIdentityCredential was misconfigured in the newer package.
Environment:
- Azure.Identity 1.5.0
- Visual Studio 2022 RC1
- ASP.NET Core Web API and Razor projects set to start up via Kestrel
About this issue
- State: closed
- Created 3 years ago
- Comments: 22 (10 by maintainers)
Hi @christothes
I can provide my details.
My service is using DefaultAzureCredential without any other options, and I’m running the code on my laptop (local development) using VS 2022.
I’ve done az login in the command line to cache my personal credentials to access Azure resources. If I run it from the command line using dotnet run (not using VS 2022 at all), I get the same behaviour.
I observed the delay with connections to Azure Key Vault and Azure Service Bus.
Following is sample code connecting to Key Vault. You can see from the logs that it takes 10 seconds when ManagedIdentityCredential is not excluded vs 3 seconds when ManagedIdentityCredential is excluded.
Sample code 1
Logs (10 seconds to get the first key):
Sample code 2 (ExcludeManagedIdentityCredential = true)
Logs (3 seconds to get the first key):
Packages in my project (same behaviour with Azure.Identity 1.8.0-beta.1 and 1.9.0-beta.1):
System Info
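The two configurations compared in the comment above could look roughly like the following. This is only a sketch of the setup being described, not the commenter's own sample code; the variable names are made up for illustration, and it assumes the Azure.Identity package is referenced.

```csharp
using Azure.Identity;

// Configuration 1: the default chain, with ManagedIdentityCredential included.
var defaultChain = new DefaultAzureCredential();

// Configuration 2: the same chain with ManagedIdentityCredential excluded,
// which is the variant the comment reports as faster in local development.
var withoutManagedIdentity = new DefaultAzureCredential(
    new DefaultAzureCredentialOptions { ExcludeManagedIdentityCredential = true });
```

Excluding the managed identity credential unconditionally would also disable it when deployed, so this variant mainly makes sense for local runs.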
Hi @heldersousa-planetpayment - I think the reason that the distinct DefaultAzureCredential works relatively the same as the reused example is that, under the covers, when you don’t pass any options to DefaultAzureCredential you actually get a static singleton each time.
@schaabs I just want to report my findings after implementing the changes suggested.
Even with the NetworkTimeout set to 1s, or simply using DefaultAzureCredential which has that by default, it still takes considerably longer than 1.4.1 to time out ManagedIdentityCredential. I count approximately 10 seconds, long enough to trigger the “unable to connect to web server” warning in Visual Studio. So I assume that the 1s timeout only comes into play after some initial struggle which itself takes many seconds.
As is right now with the package, I’ll have to revert back to 1.4.1. If this issue doesn’t eventually get resolved, I’m thinking I can probably check the environment (via something like IWebHostEnvironment) and separate the credential per environment instead of chaining them like the old model. It can be a bit of a pain though because we use such credentials extensively in our app (we aim for 100% MI whenever possible).
Hopefully y’all can come up with a resolution to restore the performance of the older package without sacrificing reliability.
By my naïve imagination, wouldn’t there be at least some kind of reliable “traits”/environment variables that can help decide “are we in Azure? (the only place MI is relevant)” before even attempting to call the MI endpoint (169.254.169.254?), and if not, skip the MI credential altogether?
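The "are we in Azure?" idea above can be sketched as a plain environment-variable check. HasManagedIdentityHints is a hypothetical helper, not part of Azure.Identity. It assumes that hosts such as App Service expose IDENTITY_ENDPOINT (or the older MSI_ENDPOINT) to the process. Azure VMs, which serve managed identity from 169.254.169.254, set no such variable, so a check like this cannot detect them and is only a partial answer.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not an Azure.Identity API): reports whether the process
// environment carries any of the managed identity variables listed below.
// ASSUMPTION: only hosts that inject these variables are detected; an Azure VM
// that relies on the 169.254.169.254 endpoint will look like "not Azure" here.
static bool HasManagedIdentityHints(Func<string, string?> getEnv)
{
    string[] names = { "IDENTITY_ENDPOINT", "MSI_ENDPOINT" };
    return names.Any(name => !string.IsNullOrEmpty(getEnv(name)));
}

bool hintsFound = HasManagedIdentityHints(Environment.GetEnvironmentVariable);
Console.WriteLine(hintsFound
    ? "Managed identity variables found; keep ManagedIdentityCredential in the chain."
    : "No managed identity variables found; the chain could skip ManagedIdentityCredential.");
```

Passing the lookup in as a delegate keeps the helper easy to exercise with fake values, without touching the real process environment.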
@mikequ-taggysoft
No, a 1 second timeout should be safe since the managed identity endpoint is generally a non-addressable local endpoint. We did some extensive performance benchmarking on various managed identity hosts and found this to be an acceptable limit. What I was trying to suggest is that setting NetworkTimeout to a considerably shorter amount of time, say 300ms, will further speed up your ChainedTokenCredential in your development environment, but depending on the host and your application you might see some artificial timeouts when deployed. Sorry for the confusion.
You can also skip the VisualStudioCredential with DefaultAzureCredential by setting this to true: https://github.com/Azure/azure-sdk-for-net/blob/f993f9d5a6062a04feedd7a72ac71dcb0c7ce77f/sdk/identity/Azure.Identity/src/DefaultAzureCredentialOptions.cs#L98
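Skipping VisualStudioCredential might look like the sketch below. It assumes the linked line refers to the ExcludeVisualStudioCredential property of DefaultAzureCredentialOptions, which the comment does not name explicitly, and that the Azure.Identity package is referenced.

```csharp
using Azure.Identity;

// ASSUMPTION: the property the comment links to is ExcludeVisualStudioCredential.
var options = new DefaultAzureCredentialOptions
{
    // Leave VisualStudioCredential out of the DefaultAzureCredential chain.
    ExcludeVisualStudioCredential = true,
};

var credential = new DefaultAzureCredential(options);
```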
There have been no changes in the SDK but perhaps there were some improvements shipped with VS’s token utility.
@mikequ-taggysoft sorry you’re running into this trouble. This delay was introduced by this refactoring to make the ManagedIdentityCredential endpoint discovery more reliable. Unfortunately, depending on your system’s network configuration, failures may not be immediate, and fail due to timeout after the full duration of RetryOptions.NetworkTimeout, which I believe defaults to 90 seconds.
After discovering this timeout behavior in some dev configurations we added this fix which limits the initial network connection timeout when the ManagedIdentityCredential is used in the DefaultAzureCredential. Unfortunately this only fixes the issue in ManagedIdentityCredential when used in the DefaultAzureCredential chain, not when used with a custom chain with the ChainedTokenCredential as in your case.
You should be able to work around this issue by configuring the NetworkTimeout on the ManagedIdentityCredential to limit the time it will wait on a network response. I would start by configuring it to 1 second as we have done in the DefaultAzureCredential, but you can experiment to see what values work best for your development environment and deployed environment. Keep in mind setting too short NetworkTimeout durations will lead to artificial network timeouts when deployed to a managed identity enabled host, causing the ManagedIdentityCredential to throw a CredentialUnavailableException. Below is an example of how you can update your code to configure the NetworkTimeout.
We’re currently exploring options of how to fix this long delay in some development environments when using ManagedIdentityCredential inside a ChainedTokenCredential, but hopefully this work-around will unblock you in the meantime. Please let me know if you have any trouble with the work-around. I’ll update this issue once we have a better idea of what the fix might be, and when it will be available.
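The NetworkTimeout work-around described in this comment can be sketched as follows, applied to the chain from the original report. The maintainer's own snippet is not reproduced in this copy of the thread, so treat this as a reconstruction. The name miOptions is illustrative, and the 1 second value is only the suggested starting point.

```csharp
using System;
using Azure.Identity;

// Cap how long ManagedIdentityCredential waits on a single network response.
// One second is the starting value suggested above; too short a value can
// cause artificial timeouts on a real managed identity host.
var miOptions = new TokenCredentialOptions();
miOptions.Retry.NetworkTimeout = TimeSpan.FromSeconds(1);

// Same chain as in the report, with the options applied to the first credential.
var tokenCred = new ChainedTokenCredential(
    new ManagedIdentityCredential(options: miOptions),
    new AzureCliCredential());
```

The resulting tokenCred can be passed to SecretClient and CertificateClient exactly as in the original Program.cs.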