aws-cdk: (aws-s3): bucket policy fails to create when bucket:arn is not yet available
Describe the bug
A dependency issue between S3 Buckets and Bucket Policies in the L2 Bucket class allows the Policy to access the ARN of the bucket before it is available, causing creation of the Bucket Policy to fail. Because this is a dependency (ordering) issue, it is intermittent: deployment works correctly the vast majority of the time, and when it fails, simply relaunching the stack usually works.
Expected Behavior
The L2 Bucket construct should launch successfully every time.
Current Behavior
testPolicy9D625504
CREATE_FAILED
Unable to retrieve Arn attribute for AWS::S3::Bucket, with error message Bucket not found
Reproduction Steps
I created a simple CDK app with this code:
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

export class BucketPolicyDependencyStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new s3.Bucket(this, 'test', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      autoDeleteObjects: true,
    });
  }
}
I then set up a bash script that launched it 30 times, essentially simultaneously:
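# Put any 30 values here; I just used 30 integers
constructs="$(seq 1 30)"
for iteration in $constructs; do
  export STACK_NAME=stresstest$iteration
  # Launch each deploy in the background so they run essentially simultaneously
  cdk deploy -o stress$iteration --require-approval never &
done
# Wait for all background deploys to finish
wait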
In 1 of the 30 deployments, I saw the error referenced above.
Possible Solution
If I am interpreting the behavior correctly, it seems that adding a Dependency on the Bucket to the BucketPolicy in the L2 Construct would prevent the Policy from trying to access the bucket before it is ready. Perhaps here? https://github.com/aws/aws-cdk/blob/3318a38a6092275d461ef3549f3b92cd0d040c18/packages/aws-cdk-lib/aws-s3/lib/bucket.ts#L651
Additional Information/Context
We’ve seen it in several of our constructs (and in newer versions of the CDK than the one I cite below for the test above). Someone also mentioned they have seen it in aws-codepipeline.
CDK CLI Version
2.108.0
Framework Version
2.108.0
Node.js Version
20.9.0
OS
macOS Ventura 13.6.3
Language
TypeScript
Language Version
TypeScript 5.2.2
Other information
The versions listed are from the test described above, but the issue has been seen in other versions as well.
About this issue
- State: open
- Created 6 months ago
- Reactions: 8
- Comments: 18 (6 by maintainers)
I am seeing this issue myself quite frequently. As with everyone else who has commented, this is a new behavior that was not occurring before.
I am using the CDK BucketDeployment, which automatically generates a parallel construct containing a Lambda function, IAM role, and policy. It is the policy that is trying to reference the ARN of the bucket with Fn::GetAtt in the synthesized output. This seems to fail about 50% of the time. I can cope with this by retrying the stack creation; CloudFormation simply starts where it left off and completes the rest of the way.
biffgaut, can you reference where you found the AWS issue being reported? This is something I would want to monitor (and possibly bug them about - it’s a pain).
Thanks.
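To see which resources in a synthesized template reference a bucket ARN through Fn::GetAtt, as described in the comment above, the template JSON can be scanned directly. The following is only a sketch: `findBucketArnReferences` and the `BucketArnReference` type are hypothetical names, not part of the CDK, and the scan only recognizes the array form of Fn::GetAtt (the form CDK normally emits), not the "LogicalId.Arn" string form.

```typescript
// Hypothetical helper (not a CDK API): lists resources whose definition
// contains { "Fn::GetAtt": [<bucket logical ID>, "Arn"] }.
interface BucketArnReference {
  resourceId: string; // logical ID of the resource holding the reference
  bucketId: string;   // logical ID of the referenced AWS::S3::Bucket
}

function findBucketArnReferences(
  resources: Record<string, { Type: string; [key: string]: unknown }>,
): BucketArnReference[] {
  const hits: BucketArnReference[] = [];

  // Recursively walk arbitrary JSON, recording every matching Fn::GetAtt.
  const visit = (node: unknown, resourceId: string): void => {
    if (Array.isArray(node)) {
      for (const item of node) visit(item, resourceId);
      return;
    }
    if (typeof node !== 'object' || node === null) return;

    const record = node as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      const value = record[key];
      if (
        key === 'Fn::GetAtt' &&
        Array.isArray(value) &&
        value.length === 2 &&
        typeof value[0] === 'string' &&
        value[1] === 'Arn' &&
        resources[value[0]]?.Type === 'AWS::S3::Bucket'
      ) {
        hits.push({ resourceId, bucketId: value[0] });
      } else {
        visit(value, resourceId);
      }
    }
  };

  for (const id of Object.keys(resources)) visit(resources[id], id);
  return hits;
}
```

Passing the `Resources` section of a template from `cdk.out` returns one entry per reference, which makes it easier to tell whether a failing resource is a bucket policy, an IAM policy, or something else.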
I am also facing the same problem. It is really annoying as it is hampering deployments. Has anyone figured out a workaround?
As an FYI this has happened ~60 times in the last 60 days so @biffgaut you’re not alone here.
We are also running into this issue with Lambda function roles, so I suspect it’s not isolated to bucket policies.
That message was from an internal ticket here at AWS - there isn’t any further info available at the moment. I have not seen this issue referenced online anywhere but here, which is shocking to me as it has occurred on several workloads managed by our team so I would assume the impact is bigger than the few people monitoring this issue.
This is confirmed to be a CloudFormation issue. The word from AWS is:
Due to a recent change in internal workflow of CloudFormation, our development teams have identified an issue that can cause this error intermittently. They are currently working on deploying a fix for the same.
So it seems that no change to the CDK is needed; for the moment we just retry after a failure, and hopefully the issue clears up entirely soon.
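The retry workaround described above can be scripted. The following is a minimal sketch, not something from the CDK: `retrySync` is a hypothetical helper name, and the deploy call shown in the trailing comment is one assumed way to invoke the CLI from Node.js. The helper is synchronous on the assumption that a deploy script blocks on the CLI anyway.

```typescript
// Hypothetical helper: runs `attempt` up to `maxAttempts` times and returns
// the first successful result. If every attempt throws, the last error is
// rethrown so the caller still sees the real failure.
function retrySync<T>(attempt: (attemptNumber: number) => T, maxAttempts: number): T {
  if (maxAttempts < 1) {
    throw new Error('maxAttempts must be at least 1');
  }
  let lastError: unknown;
  for (let n = 1; n <= maxAttempts; n++) {
    try {
      return attempt(n);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Assumed usage from a Node.js deploy script (requires the cdk CLI on PATH):
//
//   import { execFileSync } from 'child_process';
//   retrySync(
//     () => execFileSync('cdk', ['deploy', '--require-approval', 'never'], { stdio: 'inherit' }),
//     3,
//   );
```

Retrying blindly also retries genuine template errors, so a small attempt limit is probably wiser than an open-ended loop.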
Hi, if you’re running into this issue while serving a static site out of an S3 bucket via CloudFront, you can split the code into 2 stacks for a more reliable CI/CD process.
Bucket Stack:
Distro Stack (with domain stuff):
Also, pay me.
I opened a support ticket with the AWS CloudFormation team. They repeated to me the same thing they told biffgaut. They did say this was a high-priority issue, so I’d like to think the resolution is imminent. Support tickets are not allowed to be left open for more than 10 days for known bugs, but the AWS support rep did tell me that I could contact my organization’s AWS account rep to ping me when the bug is fixed, or possibly the ticket might remain open until the fix is in because I asked for it to be. In any event, it looks like I will get notified somehow. When I do, I’ll update this issue.
I am having the same issue when just creating a bucket with an access policy as well.
Unable to retrieve Arn attribute for AWS::S3::Bucket, with error message Bucket not found
Talking to some coworkers, our theory is that the issue is not CDK per se - that a change in CloudFormation led to CloudFormation ceasing to recognize the dependency of the policy on the bucket from the context of the template (I’m running my tests using the generated template rather than the CDK program to confirm this).
If this is the case, then the issue is not necessarily within the CDK - but an update to the S3 Bucket construct to explicitly set the dependency would smooth over the CFN issue.
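For anyone testing against the generated template, as in the comment above, the explicit dependency can be expressed with CloudFormation's DependsOn attribute. The following is only a sketch under assumptions: `addBucketDependencies` and the two interfaces are hypothetical names, the function only handles policies whose Bucket property is a plain Ref, and nothing in this thread confirms that an explicit DependsOn avoids the failure.

```typescript
// Minimal shapes for the parts of a CloudFormation template used here.
interface CfnResourceJson {
  Type: string;
  Properties?: Record<string, unknown>;
  DependsOn?: string | string[];
}

interface CfnTemplateJson {
  Resources: Record<string, CfnResourceJson>;
}

// Hypothetical helper: returns a copy of the template in which every
// AWS::S3::BucketPolicy explicitly depends on the bucket it references.
function addBucketDependencies(template: CfnTemplateJson): CfnTemplateJson {
  // Deep copy so the caller's template object is left untouched.
  const copy: CfnTemplateJson = JSON.parse(JSON.stringify(template));

  for (const id of Object.keys(copy.Resources)) {
    const resource = copy.Resources[id];
    if (resource === undefined || resource.Type !== 'AWS::S3::BucketPolicy') continue;

    // Only handle the { "Ref": "<logical ID>" } form of the Bucket property.
    const bucket = resource.Properties?.Bucket;
    if (typeof bucket !== 'object' || bucket === null) continue;
    const ref = (bucket as Record<string, unknown>)['Ref'];
    if (typeof ref !== 'string') continue;
    if (copy.Resources[ref]?.Type !== 'AWS::S3::Bucket') continue;

    // Normalize DependsOn to an array and append the bucket if missing.
    const current = resource.DependsOn;
    const existing = current === undefined ? [] : Array.isArray(current) ? current : [current];
    if (existing.indexOf(ref) === -1) {
      resource.DependsOn = existing.concat(ref);
    }
  }
  return copy;
}
```

The same idea applied inside a CDK app would attach the dependency to the policy resource rather than editing the synthesized output.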
I am also facing a similar issue. It seems to happen intermittently and started becoming an issue just before Christmas. Note that the buckets (and the stacks they are in) haven’t been changed for a few months, so this seems like a fairly new problem.
I am facing the exact same issue as well. It seems that CloudFormation tries to create the bucket policy before the bucket creation is complete. It’s inconsistent, but I saw it a few times in the last 2-3 weeks.