aws-cdk: (aws-s3): bucket policy fails to create when bucket:arn is not yet available

Describe the bug

A dependency issue between S3 Buckets and Bucket Policies in the L2 Bucket class allows the Policy to request the ARN of the bucket before it is available, causing the creation of the Bucket Policy to fail. Because this is a timing-related dependency issue, it is intermittent and works correctly the vast majority of the time. When it fails, simply relaunching the stack usually works.

Expected Behavior

The L2 Bucket construct should launch successfully every time.

Current Behavior

testPolicy9D625504

CREATE_FAILED

Unable to retrieve Arn attribute for AWS::S3::Bucket, with error message Bucket not found

Reproduction Steps

I created a simple CDK app with this code:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

export class BucketPolicyDependencyStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new s3.Bucket(this, 'test', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      autoDeleteObjects: true
    })
  }
}

I then set up a bash script that launched it 30 times, essentially simultaneously:

# Put any 30 values here; I just used the integers 1 through 30
export constructs="$(seq 1 30)"
for iteration in $constructs; do
  export STACK_NAME=stresstest$iteration
  cdk deploy -o stress$iteration --require-approval never &
done

On 1 of the 30 deployments I saw the error referenced above.

Possible Solution

If I am interpreting the behavior correctly, it seems that adding a Dependency on the Bucket to the BucketPolicy in the L2 Construct would prevent the Policy from trying to access the bucket before it is ready. Perhaps here? https://github.com/aws/aws-cdk/blob/3318a38a6092275d461ef3549f3b92cd0d040c18/packages/aws-cdk-lib/aws-s3/lib/bucket.ts#L651

Additional Information/Context

We’ve seen it in several of our constructs (and in newer versions of the CDK than the one I cite below for the test above). Someone also mentioned they have seen it in aws-codepipeline.

CDK CLI Version

2.108.0

Framework Version

2.108.0

Node.js Version

20.9.0

OS

macOS Ventura 13.6.3

Language

TypeScript

Language Version

TypeScript 5.2.2

Other information

The versions cited are for the test described above, but the issue has been seen in other versions as well.

About this issue

  • State: open
  • Created 6 months ago
  • Reactions: 8
  • Comments: 18 (6 by maintainers)

Most upvoted comments

I am seeing this issue myself quite frequently. As with everyone else who has commented, this is a new behavior that was not occurring before.

I am using the CDK BucketDeployment, which automatically generates a parallel construct containing a Lambda function, IAM role and policy. It is the policy that is trying to reference the ARN of the bucket with Fn::GetAtt in the synthesized output. This seems to be failing about 50% of the time. I can cope with this by retrying the stack creation; CloudFormation will simply start where it left off and complete the rest of the way.
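To see which resources in a synthesized template read the bucket ARN this way, something like the following could be run against the template JSON. This is only a sketch: the function name `findArnConsumers`, the `Template` type, and the sample logical IDs are made up for illustration, and it only detects the array form `{"Fn::GetAtt": [logicalId, "Arn"]}`.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface Template {
  Resources: { [logicalId: string]: { Type: string; Properties?: Json } };
}

// True if `node` contains {"Fn::GetAtt": [targetId, "Arn"]} anywhere inside it.
function referencesArn(node: Json, targetId: string): boolean {
  if (node === null || typeof node !== 'object') return false;
  if (Array.isArray(node)) return node.some((child) => referencesArn(child, targetId));
  const getAtt = node['Fn::GetAtt'];
  if (Array.isArray(getAtt) && getAtt[0] === targetId && getAtt[1] === 'Arn') return true;
  return Object.values(node).some((child) => referencesArn(child, targetId));
}

// Logical IDs of every resource whose properties read the ARN of `bucketId`.
function findArnConsumers(template: Template, bucketId: string): string[] {
  return Object.entries(template.Resources)
    .filter(([id, res]) => id !== bucketId && referencesArn(res.Properties ?? null, bucketId))
    .map(([id]) => id);
}

// Hypothetical, heavily trimmed template used only to exercise the function.
const sample: Template = {
  Resources: {
    SiteBucket: { Type: 'AWS::S3::Bucket' },
    DeployPolicy: {
      Type: 'AWS::IAM::Policy',
      Properties: {
        PolicyDocument: {
          Statement: [
            { Effect: 'Allow', Action: 's3:GetObject*', Resource: { 'Fn::GetAtt': ['SiteBucket', 'Arn'] } },
          ],
        },
      },
    },
  },
};

console.log(findArnConsumers(sample, 'SiteBucket'));
```

Every ID it returns is a resource that CloudFormation should already be ordering after the bucket, so these are the ones to watch when the error shows up.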

biffgaut, can you reference where you found the AWS issue being reported? This is something I would want to monitor (and possibly bug them about - it’s a pain).

Thanks.

I am also facing the same problem. It is really annoying as it is hampering deployments. Has anyone figured out a workaround?

As an FYI this has happened ~60 times in the last 60 days so @biffgaut you’re not alone here.

We are also running into this issue with Lambda function roles; I suspect it’s not isolated to bucket policies.

That message was from an internal ticket here at AWS - there isn’t any further info available at the moment. I have not seen this issue referenced online anywhere but here, which is shocking to me as it has occurred on several workloads managed by our team so I would assume the impact is bigger than the few people monitoring this issue.

This is confirmed to be a CloudFormation issue. The word from AWS is:

Due to a recent change in internal workflow of CloudFormation, our development teams have identified an issue that can cause this error intermittently. They are currently working on deploying a fix for the same.

So it seems that no change to CDK is needed. For the moment we just retry after a failure, and the problem should clear up entirely, hopefully soon.
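If you want to automate the retry in a deployment script, a generic wrapper along these lines might do. It is a sketch under assumptions: the `retry` helper is hypothetical and knows nothing about CDK, and how you invoke the deployment inside `action` (and whether a failed stack needs cleanup before the next attempt) depends on your setup.

```typescript
// Runs `action` until it succeeds or `maxAttempts` is reached,
// sleeping `delayMs` between attempts. Rethrows the last error on failure.
async function retry<T>(
  action: () => Promise<T>,
  maxAttempts: number,
  delayMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Usage idea (assumption): wrap whatever starts your deployment, for example a
// function that spawns `cdk deploy` and rejects on a non-zero exit code:
//   await retry(() => runDeploy(), 3, 30_000);
// `runDeploy` is not defined here; it stands in for your own deploy step.
```

A small fixed attempt count keeps a genuinely broken template from looping forever.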

Hi, if you’re running into this issue while serving a static site out of an S3 bucket via CloudFront, you can split the code into 2 stacks for a more reliable CI/CD process.

Bucket Stack:

 /**
     * Content bucket
     */
    new s3.Bucket(this, 'SiteBucket', {
      bucketName: `${buildDomain(props.domainSegments)}`,
      websiteIndexDocument: 'index.html',
      websiteErrorDocument: 'index.html',
      // publicReadAccess: true,
      // autoDeleteObjects: true,
      // accessControl: BucketAccessControl.PUBLIC_READ,
      /**
       * The default removal policy is RETAIN, which means that cdk destroy will not attempt to delete
       * the new bucket, and it will remain in your account until manually deleted. By setting the policy to
       * DESTROY, cdk destroy will attempt to delete the bucket, but will error if the bucket is not empty.
       */
      // removalPolicy: cdk.RemovalPolicy.DESTROY, // NOT recommended for production code
    });

Distro Stack (with domain stuff):

/**
     * Hosted zone
     */
    const zone = route53.HostedZone.fromLookup(this, 'Zone', {
      domainName: props.domainSegments.domain,
    });
    new cdk.CfnOutput(this, 'URL', {
      value: `https://${util.buildDomain(props.domainSegments)}`,
    });

    /**
     * TLS certificate
     */
    const certificate = new acm.Certificate(this, 'Certificate', {
      domainName: `${util.buildDomain(props.domainSegments)}`,
      validation: acm.CertificateValidation.fromDns(zone),
    });

    new cdk.CfnOutput(this, 'CertificateOutput', {
      value: certificate.certificateArn,
    });

    const oai = new cloudfront.OriginAccessIdentity(this, 'OAI');
    const bucket = s3.Bucket.fromBucketName(
      this,
      'StaticSiteBucket',
      `${util.buildDomain(props.domainSegments)}`
    );

    bucket.grantPublicAccess();
    const bucketPolicy = new s3.BucketPolicy(this, 'BucketPolicy', {
      bucket,
    });

    // Grant public access through the bucket policy
    bucketPolicy.document.addStatements(
      new iam.PolicyStatement({
        actions: ['s3:GetObject'],
        resources: [bucket.arnForObjects('*')],
        principals: [
          new iam.CanonicalUserPrincipal(
            oai.cloudFrontOriginAccessIdentityS3CanonicalUserId
          ),
        ],
      })
    );
    new cdk.CfnOutput(this, 'SiteBucketOutput', { value: bucket.bucketName });

    /**
     * Cloudfront OAI
     */

    /**
     * CloudFront distribution that provides HTTPS
     */
    this.distribution = new cloudfront.Distribution(this, 'myDist', {
      defaultRootObject: 'index.html',
      minimumProtocolVersion: cloudfront.SecurityPolicyProtocol.TLS_V1_2_2021,
      defaultBehavior: {
        origin: new cloudfront_origins.S3Origin(bucket, {
          originAccessIdentity: oai,
        }),
        compress: true,
        allowedMethods: cloudfront.AllowedMethods.ALLOW_GET_HEAD_OPTIONS,
        viewerProtocolPolicy: cloudfront.ViewerProtocolPolicy.REDIRECT_TO_HTTPS,
      },
      errorResponses: [
        {
          httpStatus: 403,
          responseHttpStatus: 403,
          responsePagePath: '/index.html',
          ttl: cdk.Duration.minutes(30),
        },
      ],
      domainNames: [`${util.buildDomain(props.domainSegments)}`],
      certificate: certificate,
    });
    new cdk.CfnOutput(this, 'DistributionIdOutput', {
      value: this.distribution.distributionId,
    });

    /**
     * Route53 alias record for the CloudFront distribution
     */
    new route53.ARecord(this, 'SiteAliasRecordOutput', {
      recordName: `${util.buildDomain(props.domainSegments)}`,
      target: route53.RecordTarget.fromAlias(
        new route53_targets.CloudFrontTarget(this.distribution)
      ),
      zone,
    });

    /**
     * Build sources depending on if there are more things that need to be added
     * Take the strings in extraSources and map them to extra sources
     */
    const sources = props.extraSources
      ? [
          ...props.extraSources.map((path) => s3_deployment.Source.asset(path)),
          s3_deployment.Source.asset(props.pathToAssets),
        ]
      : [s3_deployment.Source.asset(props.pathToAssets)];

    /**
     * Automated s3 deployment
     */
    new s3_deployment.BucketDeployment(this, 'DeployWithInvalidation', {
      sources: [...sources],
      destinationBucket: bucket,
      distribution: this.distribution,
      distributionPaths: ['/*'],
    });

Also, pay me.

I opened a support ticket with the AWS CloudFormation team. They repeated to me the same thing they told biffgaut. They did say this was a high priority issue, so I’d like to think the resolution is imminent. Support tickets are not allowed to be left open for more than 10 days for known bugs, but the AWS support rep did tell me that I could contact my organization’s AWS account rep to ping me when the bug is fixed, or possibly the ticket might remain open until the fix is in because I asked for it to be. In any event, it looks like I will get notified somehow. When I do, I’ll update this issue.

I am having the same issue with just creating a bucket with an access policy as well.

 const logBucket = new Bucket(
            this,
            `${config.kitName}-alb-logs-bucket`,
            {
                blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
                removalPolicy:RemovalPolicy.DESTROY,
                autoDeleteObjects: true
            }

        )

Unable to retrieve Arn attribute for AWS::S3::Bucket, with error message Bucket not found

Talking to some coworkers, our theory is that the issue is not CDK per se - that a change in CloudFormation led to CloudFormation ceasing to recognize the dependency of the policy on the bucket from the context of the template (I’m running my tests using the generated template rather than the CDK program to confirm this).

If this is the case, then the issue is not necessarily within the CDK - but an update to the S3 Bucket construct to explicitly set the dependency would smooth over the CFN issue.
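Since I was testing with the generated template anyway, the explicit dependency can be tried out at the template level without touching the CDK. The sketch below is an assumption-laden illustration, not a confirmed fix: `addExplicitBucketDependencies` and the types are invented names, it only handles policies whose `Bucket` property is a plain `Ref`, and whether an explicit `DependsOn` actually avoids the CloudFormation problem is untested here.

```typescript
interface CfnResourceJson {
  Type: string;
  Properties?: { [key: string]: unknown };
  DependsOn?: string | string[];
}

interface CfnTemplate {
  Resources: { [logicalId: string]: CfnResourceJson };
}

// Returns a copy of the template in which every AWS::S3::BucketPolicy that
// points at a bucket via {"Ref": "<id>"} also lists that bucket in DependsOn.
function addExplicitBucketDependencies(template: CfnTemplate): CfnTemplate {
  const copy: CfnTemplate = JSON.parse(JSON.stringify(template));
  for (const resource of Object.values(copy.Resources)) {
    if (resource.Type !== 'AWS::S3::BucketPolicy') continue;
    const bucket = resource.Properties?.['Bucket'] as { Ref?: unknown } | string | undefined;
    if (typeof bucket !== 'object' || bucket === null || typeof bucket.Ref !== 'string') continue;
    const target = bucket.Ref;
    if (copy.Resources[target]?.Type !== 'AWS::S3::Bucket') continue;
    // Normalize DependsOn (absent, string, or array) to an array before appending.
    const existing = resource.DependsOn === undefined ? [] : ([] as string[]).concat(resource.DependsOn);
    if (existing.indexOf(target) === -1) resource.DependsOn = [...existing, target];
  }
  return copy;
}

// Hypothetical trimmed template, shaped like what the repro stack synthesizes.
const before: CfnTemplate = {
  Resources: {
    testBucket: { Type: 'AWS::S3::Bucket' },
    testPolicy: {
      Type: 'AWS::S3::BucketPolicy',
      Properties: { Bucket: { Ref: 'testBucket' }, PolicyDocument: {} },
    },
  },
};

console.log(JSON.stringify(addExplicitBucketDependencies(before).Resources.testPolicy.DependsOn));
```

The `Ref` already implies the same ordering, so this only makes explicit what CloudFormation is supposed to infer.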

I am also facing a similar issue. It seems to happen intermittently and started becoming an issue just before Christmas. Note that the buckets (and the stacks they are in) haven’t been changed for a few months, so this seems like a fairly new problem.

I am facing the exact same issue as well. It seems that CloudFormation tries to create the bucket policy before the bucket creation is complete. It’s inconsistent, but I saw it a few times in the last 2-3 weeks.