aws-cdk: (core): cdk diff fails with Need to perform AWS calls for account XXX, but the current credentials are for YYY

Describe the bug

When running cdk diff on a project with stacks that belong to multiple AWS accounts (bootstrapped so that an IAM role is assumed by CDK), the following error is reported:

[100%] fail: Need to perform AWS calls for account XXX, but the current credentials are for YYY
Failed to create change set with error: 'Failed to publish one or more assets. See the error messages above for more information.', falling back to no change-set diff

This has only happened since version 2.119 (2.120 is affected too).

Running cdk deploy for the same stack works correctly.

Expected Behavior

cdk diff should work correctly as before

Current Behavior

cdk diff fails

Reproduction Steps

I will try to provide an isolated example later, but this only happens for two stacks (out of ~10 identical ones across different accounts).

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.120.0 (build 58b90c4)

Framework Version

No response

Node.js Version

v20.10.0

OS

Linux

Language

TypeScript

Language Version

No response

Other information

No response

About this issue

  • State: open
  • Created 6 months ago
  • Reactions: 12
  • Comments: 22 (11 by maintainers)

Most upvoted comments

Correct. If this is indeed expected behaviour, feel free to close the issue.

Thank you. We are going to address the messaging and look into it further. I wanted to make sure this receives the right priority. For example if the command would have failed completely, we would have reverted or released a hot fix.

Definitely keeping this open for further investigation.

I believe I have some insight here.

We are experiencing the same issue, but we only get the error when the stack in question is larger than 50KiB, so I believe the issue is that the code which uploads the template to S3 is not respecting the need to assume a role in the target account.

As the diff could not create a change set, it then bases the diff on template differences, which is not desirable. I would therefore consider this a bug, rather than just an issue with messaging.
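To make the size threshold concrete, here is a minimal sketch of the check that decides whether a template has to go through S3. It relies on the CloudFormation limit of 51,200 bytes for an inline template body; the function names are hypothetical and are not taken from the CDK CLI source.

```typescript
// CloudFormation accepts an inline TemplateBody of at most 51,200 bytes (50 KiB).
// Anything larger has to be uploaded to S3 and referenced by URL.
const MAX_INLINE_TEMPLATE_BYTES = 51200;

// Hypothetical helper: size of the template as UTF-8 bytes, not characters.
function templateSizeInBytes(body: string): number {
  return new TextEncoder().encode(body).length;
}

// Hypothetical helper: true when the template is too large to pass inline,
// which is the case where an S3 upload (and credentials for the target
// account) would be needed.
function requiresS3Upload(body: string): boolean {
  return templateSizeInBytes(body) > MAX_INLINE_TEMPLATE_BYTES;
}
```

Running this against the synthesized template of an affected stack and of an unaffected one would be a quick way to confirm whether the 50KiB boundary really separates the two groups.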

Yep. If you specifically want the change-set diff and it’s failing, even with the fallback to classic diff succeeding, that is definitely a bug. And I think you’re likely correct in your assessment of what’s going on here.

It looks like the diff proceeds; the messaging is just unnecessarily scary. We will tone down the error messaging.

@pahud @mrgrain - as @scarytom mentioned, this behavior appears to be a bug: the diff is not able to assume the trust role required to download/upload the diff to the metadata directory in the target account, so the CDK falls back to doing a diff with the local disk cache or the like. This is breaking behavior in an automated environment where CDKPipelines has a single deployment pipeline account that uses a trust role to deploy resources to each target account in the pipeline.

I’ve just upgraded our pipelines to 2.136.0 from 2.88.0 and prior to the upgrade I was not experiencing the error locally.

If I locally assume the creds of my pipelines account to run a diff AS the pipelines account against a target account (something I do all the time) using the verbose command, the diff fails to assume the trust role in the target account and gives the same error message as above, but the diff succeeds using disk cache.

If I then deploy some change to the target account, again from my local machine AS the pipelines account using those same creds, it succeeds (like it did before). But if I then let the change roll through my CDKPipeline in CodeBuild, the pipeline build does not detect the changes that I deployed locally (AS the pipelines account), because the metadata in the target account from my previous deploy is not getting updated due to this failure.

Previously, in 2.88.0, the local CDK diff & deploy against the target account AS the pipelines account would result in a no-op once the deploy reached that stage in the pipeline because it would see the templates are already the same and have been updated.

To summarize:

  • Pipeline Account -> trust with many target accounts, still working
  • Pipeline Account runs CDKPipeline and deploys to all target accounts via this trust, still working
  • Assume Pipeline Account creds locally to run a diff AS the Pipeline Account against target CF stack, fails
  • Assume Pipeline Account creds locally to run a deploy AS the Pipeline Account against target CF stack, still working
  • Allow the already deployed change to flow through the pipeline, fails -> it tries to redeploy the same change that I deployed locally because the metadata wasn’t correctly updated during the diff & deploy I did locally

Because the last one fails, even if I can run a successful deploy locally, the next time a deploy rolls through the pipeline, the metadata doesn’t reflect my local deployment and the pipeline tries to duplicate the changes.
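For anyone trying to reason about why deploy works while diff fails, here is a simplified model of the credential decision that would produce this error. It is only a sketch under the assumption that the failing code path does not pass the target-account role. Every name in it is hypothetical and none of it is copied from the CDK CLI.

```typescript
// Hypothetical model, NOT the CDK CLI's actual code.
interface CredentialRequest {
  targetAccount: string;   // account the stack belongs to
  currentAccount: string;  // account of the locally active credentials
  assumeRoleArn?: string;  // role in the target account, if the caller passes one
}

// Returns a description of which credentials would be used, or throws the
// account-mismatch error when no role is supplied for a cross-account call.
function chooseCredentials(req: CredentialRequest): string {
  if (req.assumeRoleArn !== undefined) {
    return `assume ${req.assumeRoleArn}`;
  }
  if (req.currentAccount === req.targetAccount) {
    return "use current credentials";
  }
  throw new Error(
    `Need to perform AWS calls for account ${req.targetAccount}, ` +
      `but the current credentials are for ${req.currentAccount}`,
  );
}
```

In this model a deploy corresponds to a call that includes assumeRoleArn, while the failing diff step corresponds to a cross-account call without it, which matches the symptom of the same credentials working for one command and not the other.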

This is blocking for us, and I’ve rolled our infra package back to the previous revision. The above is the behavior I’m seeing.

Here is the error we get, with trace debug on, if that helps.

[11:02:53] Storing template in S3 at: https://cdk-demo-infra-assets-XXX-us-east-1.s3.us-east-1.amazonaws.com/cdk/DemoStack/12111f5bf4b71f77e545882f66beabc16874487602d46cf99a272a01fbc58657.yml
[11:02:53] [0%] start: Publishing 12111f5bf4b71f77e545882f66beabc16874487602d46cf99a272a01fbc58657:current
[11:02:53] [trace] SdkProvider#forEnvironment()
[11:02:53] [trace]   SdkProvider#resolveEnvironment()
[11:02:53] [trace]   SdkProvider#obtainBaseCredentials()
[11:02:53] [trace]     SdkProvider#defaultAccount()
[11:02:53] [trace]     SdkProvider#defaultCredentials()
[100%] fail: Need to perform AWS calls for account XXX, but the current credentials are for YYY
[11:02:53] Failed to publish one or more assets. See the error messages above for more information.
Could not create a change set, will base the diff on template differences (run again with -v to see the reason)