aws-cdk: aws-ecs: Cannot deploy fargate services with ECR images

  • I’m submitting a …

    • 🪲 bug report
    • 🚀 feature request
    • 📚 construct library gap
    • ☎️ security issue or vulnerability => Please see policy
    • ❓ support request => Please see note at the top of this template.
  • What is the current behavior? If the current behavior is a 🪲bug🪲: Please provide the steps to reproduce

  1. Create a VPC with one public subnet, and make sure that an Internet gateway is created.
  2. Create a Fargate cluster within the VPC.
  3. Try to create a LoadBalancedFargateService.

cdk deploy creates all constructs successfully, including ELB, SG, etc., but it fails to create the service, which is practically the final step:

Do you wish to deploy these changes (y/n)? y
Rbi5PatternsStagingStack: deploying...
Rbi5PatternsStagingStack: creating CloudFormation changeset...
  0/15 | 9:14:17 PM | CREATE_IN_PROGRESS   | AWS::Logs::LogGroup                       | website/TaskDef/web/LogGroup (websiteTaskDefwebLogGroupC8A8E4C5)
  0/15 | 9:14:17 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroup                   | website/Service/SecurityGroup (websiteServiceSecurityGroup997F4E46)
  0/15 | 9:14:17 PM | CREATE_IN_PROGRESS   | AWS::IAM::Role                            | website/TaskDef/TaskRole (websiteTaskDefTaskRoleC7AA0A74)
  0/15 | 9:14:17 PM | CREATE_IN_PROGRESS   | AWS::Logs::LogGroup                       | website/TaskDef/web/LogGroup (websiteTaskDefwebLogGroupC8A8E4C5) Resource creation Initiated
  1/15 | 9:14:17 PM | CREATE_COMPLETE      | AWS::Logs::LogGroup                       | website/TaskDef/web/LogGroup (websiteTaskDefwebLogGroupC8A8E4C5)
  1/15 | 9:14:17 PM | CREATE_IN_PROGRESS   | AWS::ElasticLoadBalancingV2::TargetGroup  | website/LB/PublicListener/ECSGroup (websiteLBPublicListenerECSGroup247DABD4)
  1/15 | 9:14:18 PM | CREATE_IN_PROGRESS   | AWS::IAM::Role                            | website/TaskDef/TaskRole (websiteTaskDefTaskRoleC7AA0A74) Resource creation Initiated
  1/15 | 9:14:18 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroup                   | website/LB/SecurityGroup (websiteLBSecurityGroup73701CCA)
  1/15 | 9:14:18 PM | CREATE_IN_PROGRESS   | AWS::ElasticLoadBalancingV2::TargetGroup  | website/LB/PublicListener/ECSGroup (websiteLBPublicListenerECSGroup247DABD4) Resource creation Initiated
  1/15 | 9:14:18 PM | CREATE_IN_PROGRESS   | AWS::ECS::Cluster                         | staging-cluster (stagingclusterDBEBD0C8)
  2/15 | 9:14:18 PM | CREATE_COMPLETE      | AWS::ElasticLoadBalancingV2::TargetGroup  | website/LB/PublicListener/ECSGroup (websiteLBPublicListenerECSGroup247DABD4)
  2/15 | 9:14:18 PM | CREATE_IN_PROGRESS   | AWS::ECS::Cluster                         | staging-cluster (stagingclusterDBEBD0C8) Resource creation Initiated
  3/15 | 9:14:19 PM | CREATE_COMPLETE      | AWS::ECS::Cluster                         | staging-cluster (stagingclusterDBEBD0C8)
  3/15 | 9:14:19 PM | CREATE_IN_PROGRESS   | AWS::CDK::Metadata                        | CDKMetadata
  3/15 | 9:14:21 PM | CREATE_IN_PROGRESS   | AWS::CDK::Metadata                        | CDKMetadata Resource creation Initiated
  4/15 | 9:14:21 PM | CREATE_COMPLETE      | AWS::CDK::Metadata                        | CDKMetadata
  4/15 | 9:14:22 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroup                   | website/Service/SecurityGroup (websiteServiceSecurityGroup997F4E46) Resource creation Initiated
  4/15 | 9:14:22 PM | CREATE_IN_PROGRESS   | AWS::IAM::Policy                          | website-execution-role/Policy (websiteexecutionrolePolicy67F459E7)
  4/15 | 9:14:22 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroup                   | website/LB/SecurityGroup (websiteLBSecurityGroup73701CCA) Resource creation Initiated
  5/15 | 9:14:23 PM | CREATE_COMPLETE      | AWS::EC2::SecurityGroup                   | website/Service/SecurityGroup (websiteServiceSecurityGroup997F4E46)
  6/15 | 9:14:23 PM | CREATE_COMPLETE      | AWS::EC2::SecurityGroup                   | website/LB/SecurityGroup (websiteLBSecurityGroup73701CCA)
  6/15 | 9:14:23 PM | CREATE_IN_PROGRESS   | AWS::IAM::Policy                          | website-execution-role/Policy (websiteexecutionrolePolicy67F459E7) Resource creation Initiated
  6/15 | 9:14:26 PM | CREATE_IN_PROGRESS   | AWS::ElasticLoadBalancingV2::LoadBalancer | website/LB (websiteLB14D1FE30)
  6/15 | 9:14:27 PM | CREATE_IN_PROGRESS   | AWS::ElasticLoadBalancingV2::LoadBalancer | website/LB (websiteLB14D1FE30) Resource creation Initiated
  6/15 | 9:14:27 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroupEgress             | website/LB/SecurityGroup/to Rbi5PatternsStagingStackwebsiteServiceSecurityGroupE0482DA5:443 (websiteLBSecurityGrouptoRbi5PatternsStagingStackwebsiteServiceSecurityGroupE0482DA544358F061E7)
  6/15 | 9:14:27 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroupIngress            | website/Service/SecurityGroup/from Rbi5PatternsStagingStackwebsiteLBSecurityGroup1CC8D48A:443 (websiteServiceSecurityGroupfromRbi5PatternsStagingStackwebsiteLBSecurityGroup1CC8D48A443278970E2)
  6/15 | 9:14:28 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroupEgress             | website/LB/SecurityGroup/to Rbi5PatternsStagingStackwebsiteServiceSecurityGroupE0482DA5:443 (websiteLBSecurityGrouptoRbi5PatternsStagingStackwebsiteServiceSecurityGroupE0482DA544358F061E7) Resource creation Initiated
  6/15 | 9:14:28 PM | CREATE_IN_PROGRESS   | AWS::EC2::SecurityGroupIngress            | website/Service/SecurityGroup/from Rbi5PatternsStagingStackwebsiteLBSecurityGroup1CC8D48A:443 (websiteServiceSecurityGroupfromRbi5PatternsStagingStackwebsiteLBSecurityGroup1CC8D48A443278970E2) Resource creation Initiated
  7/15 | 9:14:28 PM | CREATE_COMPLETE      | AWS::EC2::SecurityGroupIngress            | website/Service/SecurityGroup/from Rbi5PatternsStagingStackwebsiteLBSecurityGroup1CC8D48A:443 (websiteServiceSecurityGroupfromRbi5PatternsStagingStackwebsiteLBSecurityGroup1CC8D48A443278970E2)
  8/15 | 9:14:28 PM | CREATE_COMPLETE      | AWS::EC2::SecurityGroupEgress             | website/LB/SecurityGroup/to Rbi5PatternsStagingStackwebsiteServiceSecurityGroupE0482DA5:443 (websiteLBSecurityGrouptoRbi5PatternsStagingStackwebsiteServiceSecurityGroupE0482DA544358F061E7)
  9/15 | 9:14:31 PM | CREATE_COMPLETE      | AWS::IAM::Policy                          | website-execution-role/Policy (websiteexecutionrolePolicy67F459E7)
 10/15 | 9:14:36 PM | CREATE_COMPLETE      | AWS::IAM::Role                            | website/TaskDef/TaskRole (websiteTaskDefTaskRoleC7AA0A74)
 10/15 | 9:14:41 PM | CREATE_IN_PROGRESS   | AWS::ECS::TaskDefinition                  | website/TaskDef (websiteTaskDef81484BC5)
 10/15 | 9:14:41 PM | CREATE_IN_PROGRESS   | AWS::ECS::TaskDefinition                  | website/TaskDef (websiteTaskDef81484BC5) Resource creation Initiated
 11/15 | 9:14:42 PM | CREATE_COMPLETE      | AWS::ECS::TaskDefinition                  | website/TaskDef (websiteTaskDef81484BC5)
11/15 Currently in progress: websiteLB14D1FE30
 12/15 | 9:16:28 PM | CREATE_COMPLETE      | AWS::ElasticLoadBalancingV2::LoadBalancer | website/LB (websiteLB14D1FE30)
 12/15 | 9:16:32 PM | CREATE_IN_PROGRESS   | AWS::ElasticLoadBalancingV2::Listener     | website/LB/PublicListener (websiteLBPublicListenerC5A4EA76)
 12/15 | 9:16:32 PM | CREATE_IN_PROGRESS   | AWS::ElasticLoadBalancingV2::Listener     | website/LB/PublicListener (websiteLBPublicListenerC5A4EA76) Resource creation Initiated
 13/15 | 9:16:32 PM | CREATE_COMPLETE      | AWS::ElasticLoadBalancingV2::Listener     | website/LB/PublicListener (websiteLBPublicListenerC5A4EA76)
 13/15 | 9:16:37 PM | CREATE_IN_PROGRESS   | AWS::ECS::Service                         | website/Service/Service (websiteService29B32E70)
 13/15 | 9:16:38 PM | CREATE_IN_PROGRESS   | AWS::ECS::Service                         | website/Service/Service (websiteService29B32E70) Resource creation Initiated
13/15 Currently in progress: websiteService29B32E70

It hangs at this point, and the deploy eventually times out.

  • What is the expected behavior (or behavior of feature suggested)?

The stack should be created successfully.

  • What is the motivation / use case for changing the behavior or adding this feature?

I’d like to use CDK to create a fargate service.

  • Please tell us about your environment:

    • CDK CLI Version: 1.3.0 (build bba9914)
    • Module Version: 1.3.0
    • OS: OSX Mojave
    • Language: TypeScript
    • Node: v10.15.3
    • Typescript: Version 3.5.3
  • Other information (e.g. detailed explanation, stacktraces, related issues, suggestions how to fix, links for us to have context, eg. associated pull-request, stackoverflow, gitter, etc)

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 23 (5 by maintainers)

Most upvoted comments

I am launching a stack for the very first time. It includes an ECR repository and a Fargate service that refers to that repository, which does not (yet) contain an image. The intent is to push images later using CI/CD. But my stack hangs, presumably trying to deploy the Docker image that does not yet exist!

    // Container image
    const zcloudRepository = new ecr.Repository( this, 'ZcloudRepository', {
      repositoryName: `zcloud-image-${props.env}`,
    });
    
    // Lifecycle
    zcloudRepository.addLifecycleRule({ maxImageAge: cdk.Duration.days(30) });  // delete older than 30 days

    // Grant push access to the deploy user
    zcloudRepository.grantPullPush(deployer);

    // The task definition
    const zcloudDefinition = new ecs.FargateTaskDefinition(this, 'ZcloudDefinition', {
      // CPU and MEMORY limits here need to be equal or greater than all container instances added up
      cpu: zCPU * 4,
      memoryLimitMiB: zMemory * 4,
    });

    const container = zcloudDefinition.addContainer('zcloud', {
      image: ecs.ContainerImage.fromEcrRepository(zcloudRepository),
      memoryMiB: zMemory,
      cpu: zCPU,
      environment: {
        NODE_ENV: props.env,   // set up NODE_ENV from the props passed in
      },
      logging: new ecs.AwsLogDriver({
        logGroup: props.global.paperwatchLogGroup,
        streamPrefix: `zcloud-${props.env}`
      })
    });
    container.addPortMappings({
      containerPort: 3000 
    });
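
The sizing rule in the code comment above (task-level CPU and memory must be at least the sum of the container-level values) can be checked before deploying. What follows is only a minimal sketch in plain TypeScript, not part of CDK: `ContainerSizing` and `checkTaskSizing` are hypothetical names, and the example numbers stand in for `zCPU` and `zMemory`, whose real values are not given in the thread.

```typescript
// Hypothetical shape for illustration; these are not CDK types.
interface ContainerSizing {
  name: string;
  cpu: number;       // CPU units reserved by the container
  memoryMiB: number; // memory reserved by the container, in MiB
}

// Returns a list of problems. An empty list means the task-level limits
// cover the sum of the container-level values.
function checkTaskSizing(
  taskCpu: number,
  taskMemoryMiB: number,
  containers: ContainerSizing[],
): string[] {
  const totalCpu = containers.reduce((sum, c) => sum + c.cpu, 0);
  const totalMemory = containers.reduce((sum, c) => sum + c.memoryMiB, 0);
  const problems: string[] = [];
  if (totalCpu > taskCpu) {
    problems.push(`containers request ${totalCpu} CPU units, task allows ${taskCpu}`);
  }
  if (totalMemory > taskMemoryMiB) {
    problems.push(`containers request ${totalMemory} MiB, task allows ${taskMemoryMiB} MiB`);
  }
  return problems;
}

// Assumed example values: zCPU = 256 and zMemory = 512, with the task sized at 4x.
const sizingProblems = checkTaskSizing(1024, 2048, [
  { name: 'zcloud', cpu: 256, memoryMiB: 512 },
]);
console.log(sizingProblems.length === 0 ? 'sizing ok' : sizingProblems.join('\n'));
```

Note that this only checks the arithmetic. It does not check whether the task-level CPU and memory pair is a combination that Fargate accepts.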

Hey AWS team,

I’m also having this issue. How do you solve the failed service deploy when there’s no image during the first cdk deploy?

...VPC code above...

const repository = new ecr.Repository(this, 'app-ecr', {
  repositoryName: 'app-v2',
});

const cluster = new ecs.Cluster(this, 'app-cluster', {
  vpc,
  clusterName: 'app-v2',
});

const loadBalancedFargateService = new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'app-service', {
  cluster,
  serviceName: 'app-v2',
  desiredCount: 1,
  cpu: 512,
  memoryLimitMiB: 1024,
  minHealthyPercent: 100,
  maxHealthyPercent: 200,
  publicLoadBalancer: true,
  taskImageOptions:{
    enableLogging: true,
    image: ecs.ContainerImage.fromEcrRepository(repository),
    containerPort: 3000,
  }
});

The above code hangs forever during cdk deploy. There’s no image in the ECR repository yet, but that’s to be expected when you first spin up infra.

Thanks for the help @drakir, your workaround helped me get this spun up.

Is there any official solution to this problem @skinny85?

I looked at the thread you mentioned above but it didn’t seem like the same thing.

Let me know if this could use a new issue, thanks!

There is a solution. It’s not pretty, but you can find the low-level construct CfnService, override the default behavior, and set desiredCount to 0. Then, in your app’s pipeline, you must remember to update desiredCount after an image has been pushed to the repository. This can be done with the AWS CLI.

A little example:

// Imports assume CDK v1 package names, matching the 1.x version reported in this issue.
import { CfnService, ContainerImage } from '@aws-cdk/aws-ecs';
import { ApplicationLoadBalancedFargateService } from '@aws-cdk/aws-ecs-patterns';

const loadBalancedFargateService = new ApplicationLoadBalancedFargateService(this, 'MyFargateService', {
  cluster,
  memoryLimitMiB: 1024,
  cpu: 512,
  taskImageOptions: {
    image: ContainerImage.fromEcrRepository(repository), // no image exists in the repository at this point
  },
  desiredCount: 1, // cannot be set any lower here
});

const node = loadBalancedFargateService.service.node;

// Fetch the underlying low-level CfnService construct and override desiredCount to 0,
// so the stack does not time out if no image is in place.
// The child just happens to be named 'Service' if you check the source code.
const cfnService = node.findChild('Service') as CfnService;
cfnService.desiredCount = 0;
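
The follow-up step described above (raising desiredCount from the pipeline once an image has been pushed) could be scripted roughly as follows. This is a sketch under stated assumptions, not an official recipe: `buildScaleUpArgs` is a hypothetical helper that only assembles the arguments for the real `aws ecs update-service` command, and the cluster and service names are borrowed from the `app-v2` example earlier in the thread.

```typescript
// Builds the argument list for `aws ecs update-service`.
// Hypothetical helper: it only assembles arguments and does not call AWS.
function buildScaleUpArgs(cluster: string, service: string, desiredCount: number): string[] {
  if (desiredCount < 0 || Math.floor(desiredCount) !== desiredCount) {
    throw new Error(`desiredCount must be a non-negative integer, got ${desiredCount}`);
  }
  return [
    'ecs', 'update-service',
    '--cluster', cluster,
    '--service', service,
    '--desired-count', String(desiredCount),
  ];
}

// Assumed names, taken from the earlier example in this thread.
const scaleArgs = buildScaleUpArgs('app-v2', 'app-v2', 1);
console.log(['aws'].concat(scaleArgs).join(' '));
```

In a pipeline, the resulting arguments would be passed to the AWS CLI after the image push succeeds, for example through Node’s `child_process.execFileSync('aws', scaleArgs)`.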

Found my solution: I hadn’t specified an image tag when I called

ecs.ContainerImage.fromEcrRepository

so it defaulted to “latest”. Our repository didn’t have a latest tag, so the image pull failed.

Also, @realharry, if you dig into the Service and check under details, the rest of the truncated STOPPED (CannotPullContainerError: Error response from daem) message becomes available. Mine was: CannotPullContainerError: Error response from daemon: manifest for <URI to my image in ecr>:latest not found
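
Once the full stopped reason is available, the missing repository and tag can be pulled out of it mechanically. The sketch below assumes the message has the same shape as the one quoted above; `parseMissingImage` is a hypothetical helper, and the repository URI in the example is made up for illustration.

```typescript
interface MissingImage {
  repositoryUri: string;
  tag: string;
}

// Returns the repository URI and tag when the stopped reason reports a
// missing manifest, or undefined for any other message.
function parseMissingImage(stoppedReason: string): MissingImage | undefined {
  if (stoppedReason.indexOf('CannotPullContainerError') === -1) {
    return undefined;
  }
  // The tag is whatever follows the last colon before " not found".
  const match = /manifest for (\S+):([^\s:]+) not found/.exec(stoppedReason);
  return match ? { repositoryUri: match[1], tag: match[2] } : undefined;
}

// Made-up URI for illustration; the message shape follows the comment above.
const exampleReason =
  'CannotPullContainerError: Error response from daemon: manifest for ' +
  '123456789012.dkr.ecr.us-east-1.amazonaws.com/app-v2:latest not found';
const missingImage = parseMissingImage(exampleReason);
if (missingImage) {
  console.log(`push or reference a tag other than "${missingImage.tag}" in ${missingImage.repositoryUri}`);
}
```

If the reported tag is `latest` and the repository only holds versioned tags, passing the tag explicitly as the second argument of `fromEcrRepository` is one way to avoid the default.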

I’m having a similar problem. Is there any way to define an ECR repository and Fargate service with taskImageOptions: { image: ecs.ContainerImage.fromEcrRepository(myRepo) } in the same stack?

@djkirby I ended up working around this by manually creating the ECR repo outside of CloudFormation.

I think fromEcrRepository would be a lot more useful if it was possible to define a repository and service in the same stack, using the repository as the parameter. Maybe it’d be worth opening a new issue based on the report from @peebles. Any thoughts, @NGL321?

@NGL321 Thanks for following up. I no longer work on the deployment task at this point (a colleague on the team took it over), but my understanding is that it was resolved by following an approach similar to the one suggested by @ConradMearns. Thanks! I’ll close this ticket.

I ended up spending a little over a week (or, more like 2 weeks) with CDK, and I think I achieved quite a bit with the help of CDK, although I failed to build the whole stack (not a big architecture by any measure) using CDK within that 1~2 week time frame. My primary problem with CDK, not necessarily specific to this particular issue, was essentially my lack of knowledge across AWS service stacks. When the deployment failed, the error happened somewhere in AWS services, and CDK (being a high-level tool) wasn’t entirely helpful in pinpointing what exactly went wrong. But, over time, with more experience, it got easier and easier. As a general comment, I don’t know how easy it would be, but if CDK could propagate the error messages up to the user from the low-level stack where the error occurred, I think it would be very useful. Thanks!