pulumi-aws: New AWS regions like ap-east-1 cause aws.provider errors

What happened?

We have run into what seems like a showstopper of a problem, at least for us. New regions are being provisioned with session tokens v2 and this breaks our aws.Provider() with the following errors:

    Error: failed to refresh cached credentials, operation error STS: AssumeRole, failed to sign request: failed to retrieve credentials:
        raise invoke_error
    Exception: invoke of aws:index/getCallerIdentity:getCallerIdentity failed: invocation of aws:index/getCallerIdentity:getCallerIdentity returned an error: 1 error occurred:
    	* error configuring Terraform AWS Provider: no valid credential sources for Terraform AWS Provider found.

I know there is a newer aws_native library that may fix this, but we’re not using it (and resources don’t seem to be cross-compatible with the native provider).

Steps to reproduce

Make any call against a regional endpoint that uses v2 tokens, such as ap-east-1, through a provider whose profile uses sts:AssumeRole:

import pulumi_aws as aws
_provider = aws.Provider("Provider", region="ap-east-1", profile="profile_name")  # any region that uses v2 tokens

Expected Behavior

It works

Actual Behavior

    Error: failed to refresh cached credentials, operation error STS: AssumeRole, failed to sign request: failed to retrieve credentials:
        raise invoke_error
    Exception: invoke of aws:index/getCallerIdentity:getCallerIdentity failed: invocation of aws:index/getCallerIdentity:getCallerIdentity returned an error: 1 error occurred:
    	* error configuring Terraform AWS Provider: no valid credential sources for Terraform AWS Provider found.

Output of pulumi about

No response

Additional context

I went through the botocore code, and this is where the SDK decides whether to use the regional or the global STS endpoint: https://github.com/boto/botocore/blob/dbc23f090d1257095da8bade8cb3fd5eeaec31db/botocore/args.py#L385-L388
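
To make that decision easier to picture, here is a rough sketch of this kind of endpoint selection. It is not botocore's actual code: the function name `sts_endpoint` is hypothetical, and the region set is an assumption written from memory rather than copied from the linked lines. The `mode` argument is modelled on the `sts_regional_endpoints` setting ("legacy" or "regional").

```python
# Regions assumed to fall back to the global STS endpoint in "legacy" mode.
# This set is an illustrative assumption; check the linked source for the
# authoritative list. Opt-in regions such as ap-east-1 are not in it.
LEGACY_GLOBAL_STS_REGIONS = frozenset({
    "ap-northeast-1", "ap-south-1", "ap-southeast-1", "ap-southeast-2",
    "aws-global", "ca-central-1", "eu-central-1", "eu-north-1",
    "eu-west-1", "eu-west-2", "eu-west-3", "sa-east-1",
    "us-east-1", "us-east-2", "us-west-1", "us-west-2",
})


def sts_endpoint(region: str, mode: str = "legacy") -> str:
    """Return the STS endpoint URL a client would pick for `region`.

    `mode` is modelled on the `sts_regional_endpoints` setting.
    """
    if mode not in ("legacy", "regional"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Only legacy mode, and only for the older regions, uses the global endpoint.
    if mode == "legacy" and region in LEGACY_GLOBAL_STS_REGIONS:
        return "https://sts.amazonaws.com"
    return f"https://sts.{region}.amazonaws.com"


if __name__ == "__main__":
    for name in ("us-east-1", "ap-east-1"):
        print(name, "->", sts_endpoint(name))
```

Under these assumptions, a call for ap-east-1 always goes to the regional endpoint, which is why the region has to be usable from every account involved in the AssumeRole chain.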

Contributing

Vote on this issue by adding a 👍 reaction. To contribute a fix for this issue, leave a comment (and link to your pull request, if you’ve opened one already).

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 5
  • Comments: 18 (9 by maintainers)

Most upvoted comments

@thomas11: I’m wondering if this problem is only manifesting when using an assume-role profile, instead of direct pulumi configs?

In our case, with ap-east-1 enabled, we get the error reported by @rdanno when using a profile in ~/.aws/config similar to the one below and referencing the profile name in the pulumi code:

[profile prod]
role_arn = arn:aws:iam::123456789:role/some-role
source_profile = default
external_id = blablabla
role_session_name = my-session

As mentioned, the error happens in ap-east-1 which uses a V2 STS endpoint, but it doesn’t happen in regions where V1 is used (global STS endpoint).
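
Since the failure seems tied to assume-role profiles, it can help to list which profiles in a config file chain through a source profile. The following is a minimal sketch using only the standard library. The helper name `assume_role_profiles` is hypothetical, and the `[default]` section in the sample text is an assumption added so the example is self-contained.

```python
import configparser

# Sample config text. The [default] section is an assumed placeholder;
# the [profile prod] section mirrors the profile shown above.
SAMPLE_CONFIG = """\
[default]
region = us-east-1

[profile prod]
role_arn = arn:aws:iam::123456789:role/some-role
source_profile = default
external_id = blablabla
role_session_name = my-session
"""


def assume_role_profiles(config_text: str) -> dict:
    """Map each assume-role profile name to its source profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    result = {}
    for section in parser.sections():
        # A profile is treated as assume-role if it defines role_arn.
        if "role_arn" not in parser[section]:
            continue
        # Sections are named "profile <name>", except for "default".
        prefix = "profile "
        name = section[len(prefix):] if section.startswith(prefix) else section
        result[name] = parser[section].get("source_profile")
    return result


if __name__ == "__main__":
    print(assume_role_profiles(SAMPLE_CONFIG))
```

For the sample text, only prod is reported, with default as its source profile. The account behind that source profile is the one whose region settings matter for the AssumeRole call.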

The same error happens in much older versions of awscli/botocore. Newer versions of the awscli handle V2 regional STS endpoints transparently. My guess is that the Terraform provider used by Pulumi relies on an older version of the AWS SDK that doesn’t select regional endpoints transparently, as the newer versions of botocore do. But this is just a guess 🤷‍♂️

Hi @rdanno, thanks for the issue. I’ve raised this with the team to appropriately prioritize. You may be correct that it relates to #2188, so I’ll make sure to discuss that too. Thanks!

Update: We have found that enabling the same region in the source account (the account containing the user that assumes the role) fixes this issue in aws-cli and Pulumi, and presumably everywhere else.

We noticed that AssumeRole calls are made to the regional STS endpoint for both accounts. It makes sense that the process would fail if one of these is unavailable. This contradicts what the documentation says, so we are following up with AWS for clarification.

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_enable-regions.html The documentation there seems to indicate the user in account A does not need the regional endpoint enabled in their account.

That’s a great find, @rdanno! Thanks for updating this issue.

CLI
Version      3.44.1
Go Version   go1.19.2
Go Compiler  gc

Plugins
NAME     VERSION
aws      5.19.0
command  0.5.2
python   unknown

Host
OS       darwin
Version  11.7
Arch     x86_64

This project is written in python: executable='/usr/local/bin/python3' version='3.9.13'

Dependencies:
NAME            VERSION
ansible         6.5.0
pip             22.1.1
pulumi-aws      5.19.0
pulumi-command  0.5.2
pygount         1.4.0
setuptools      62.3.2