aws-sdk-go: The behavior of validateCredentialType is too strict and doesn't align with the AWS CLI or other SDKs

Describe the bug validateCredentialType() does not allow more than one credential provider configuration in a profile. This does not align with the AWS CLI and other SDKs, which have a hierarchy of credential providers and do not continue to check against providers later in the chain if one is satisfied.

This is an issue because not all AWS SDKs support AWS SSO configuration, so being able to provide a credential process backup (using a tool like aws-sso-util) allows one profile to work both with SDKs that support AWS SSO and those that don’t, transparently (because the SSO provider is higher up in the chain).

Version of AWS SDK for Go? v1.37

To Reproduce (observed behavior) Create a profile in ~/.aws/config that looks like

[profile my-profile]
region = us-east-2
sso_start_url = https://foo.awsapps.com/start
sso_region = us-east-2
sso_account_id = 123456789012
sso_role_name = MyRole

credential_process = aws-sso-util credential-process --profile my-profile

Expected behavior When using the Go SDK with my-profile, the AWS SSO configuration should be used and no error should occur.

Supposing aws-sso-util is installed: When using the appropriate version of the AWS CLI v2, boto3, or Go SDK, the AWS SSO configuration should be used and no error should occur. When using the JavaScript SDK for Node, or an older version of AWS CLI v1, boto3 or the Go SDK, the AWS SSO configuration will not be recognized, and the credential process will be used, which supports getting AWS SSO credentials using the profile’s configuration.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 50
  • Comments: 18 (3 by maintainers)

Commits related to this issue

Most upvoted comments

@skmcgrail If you think it’s problematic to be permissive with multiple credential configs in a single profile, the proper way to approach the problem is to go argue with the other SDK teams and come to a consensus, rather than waging a language-specific campaign that forces inconsistent behavior on customers (who often use more than one language, especially with Go as the CLI is in Python).

You also don’t raise an error if I specify the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY and also AWS_PROFILE (or any other multiple configuration in envConfig), which is the same sort of conflict that you describe above. Why not?

The expectation I have for credential loading is that there is a defined hierarchy, which is searched and as soon as valid credentials are found, no further searching occurs.

This has propagated a bug to Terraform latest AWS provider, and hence our setup that used SSO AND crediental_process for backward compatibility is now failing. I agree with @benkehoe that the credential loading hierarchy should be followed without complaining that there are crediental providers.

@JoshiiSinfield Terraform is one of the things that broke. But still, this change breaks anything that used aws-go-sdk and its user is using SSO + credential_process.

Having credential_process is very important for me because so many tools are not yet supporting AWS SSO and it adds a compatibility layer for me. That said, AWS guidelines do not prohibit you from using SSO and credential_process together and so should not the SDK.

I’m surprised a backwards incompatible change like this made it into a release, as AWS is generally well regarded for not breaking past promises. This has completely broken our terraform workflow where we use AWS SSO and aws-sso-util via credential_process.

While the AWS SDK Go team mulls this over, is there a workaround that doesn’t involve (1) pinning the aws provider version to be less than 3.26.0, (2) adding and maintaining double ~/.aws/config in all our developer and ci machines?

With all of that said, we’re investigating what SDK behavior change should be made. If the exclusive credential source check is removed, applications with invalid shared config profile credential sources that failed to load would no longer fail, and would have unexpected behavior using the competing credential sources.

This sounds like an attempt to protect people without having a --this-is-dangerous flag. It also makes the invalid assumption that having multiple credential providers is invalid.

Why is the Go SDK making decisions around how the SDK works independent of the way that the canonical (boto) SDK works?

hashicorp/terraform-provider-aws#17353 (comment) Also appears to point to a possible bug where if you use SSO configuration variables in any profile, it’ll now fail if you are also using credential_process or assume_role in any other profile.

I am not able to replicate this behavior that you are describing and we would need further details on the profile contents to reproduce this. If you have a configuration file you can provide us for this that would be helpful.

With regards the broader ask of this issue, the AWS SDK for Go has always been restrictive regarding the specification of credentials within the shared configuration files. The SDK has always validated that one type of credential be allowed to be provided per profile, and the introduction of SSO maintained that existing behavior expectation. While this behavior is more permissive in a some other SDK products, the prioritization and sourcing of credential types has been known to differ. Thus defining profiles to have multiple credential types creates a situation where multiple outcomes can occur:

  1. In the best possible scenario you source the “correct” credentials for your application and they work as expected
  2. You source the “wrong” credentials for your application, but they just work, either because the credentials contain a policy that is permissive and authorizes your application for the right API calls.
  3. You source credentials that don’t work and you get an error.

By modeling a profile to contain multiple credential types in its definition there is a level of entropy introduced in what the expected outcome is, and relies on a consistent level of behavior that has known to differ between the products. From a pure administrative perspective on credentials, it would be better to be explicit about where / what credentials you expect your application to use, or how it will get those credentials. In example number two, this situation could prove catastrophic if a credential is later deactivated or a policy change is introduced because the credential was modeled in a way where it is not explicitly clear the expected outcome.

I am also hitting this issue when using my favorite Go-based tools. My existing config works with the aws cli and other non-Go SDK programs; with a profile that contains both a source_profile and credential_process and I would like to use the same profile for all. As a workaround, I have to create two profiles which are compatible with Go-based and non Go-based utilities respectively.

The offending line is:

https://github.com/aws/aws-sdk-go/blob/master/aws/session/shared_config.go#L402

diff --git a/aws/session/shared_config.go b/aws/session/shared_config.go
index c3f38b6ec..b1e9f6b87 100644
--- a/aws/session/shared_config.go
+++ b/aws/session/shared_config.go
@@ -399,7 +399,7 @@ func (cfg *sharedConfig) validateCredentialType() error {
        if !oneOrNone(
                len(cfg.SourceProfileName) != 0,
                len(cfg.CredentialSource) != 0,
-               len(cfg.CredentialProcess) != 0,
+               // len(cfg.CredentialProcess) != 0,
                len(cfg.WebIdentityTokenFile) != 0,
                cfg.hasSSOConfiguration(),
        ) {

Our teams are also being impacted by this issue, primarily for terraform, but also with other Go-based utilities that we either develop or operate. I’m interested in having the issue resolved so that our teams can avoid jumping through hoops specifically when using go clients. Our teams either end up temporarily commenting/uncommenting the credential_process settings when switching between other go clients and other legacy clients, or utilize secondary profiles that only have a credential_process, which defers to the original profile for the sso settings. This is similar to the technique commented above (https://github.com/aws/aws-sdk-go/issues/3763#issuecomment-770707368,) but we are using benkehoe/aws-sso-util instead of 99designs/aws-vault.

If you’re using aws-vault, a possible workaround (solution?) seems to be to split the profile into two: one with SSO credentials and one with credential_process referring to the first one. Example:

[profile example-sso]
sso_start_url = https://d-123456abcd.awsapps.com/start
sso_region = eu-central-1
sso_account_id = 123456789012
sso_role_name = ExampleAdmin

[profile example]
credential_process = aws-vault exec example-sso --json

Still, we are operating a bunch of specialised AWS accounts and maintaining an AWS config with multiple profiles per account seems like a nightmare. While the point of ambiguous credentials is a fair one, the Golang SDK should be consistent with other AWS SDKs in this respect.

In this specific case if you have both SSO configured & a credential_process configured for a profile, it hard fails instead of using one or the other based on provider chain precedence.

This is fairly impactful, if other SDKs do not support natively pulling SSO credentials out of ~/.aws/config/sso/cache, one of the common workarounds was to use a credential_process to do that work for you. If you can’t have both configurations in a profile if you are using Go, it appears users now would need to:

  1. Know this detail
  2. Configure SDK specific profiles and use them in ~/.aws/config

https://github.com/hashicorp/terraform-provider-aws/issues/17353#issuecomment-770086126 Also appears to point to a possible bug where if you use SSO configuration variables in any profile, it’ll now fail if you are also using credential_process or assume_role in any other profile.

Further to my previous comment, I was short sighted (correctly reminded of this by @sherifabdlnaby ty…), and although terraform on it’s own works, we have alot of terragrunt, which doesn’t work, and now requires a duplicate separate profile for the remote state config that includes credential_process when running locally.

So we’re still impacted. It’s a PITA.

@KaibaLopez As this is assigned to yourself have you any updates? Can the tightening of validation be rolled back so that this SDK falls inline with the others?!

Cheers, Josh

Hey @benkehoe , I feel like I’m missing something with the explanation on how you expect this to behave or what the bug is… So to clarify, you’re saying that if you have a profile with SSO login in the config file and you try to use the SDK, it tries the sso profile, succeeds and still it tries the other providers, like a credential file or env variables. Or you are saying that it fails and doesn’t try the rest of the providerchain? Also kinda sounds like you expect to be able to use the same profile name on SSO login and the credentials file… is that the case? I’m not sure if that is the expected behavior across SDKs but it might be…I’d just have to check,but not sure if that’s part of what you’re saying.