aws-cdk: Unable to use ECS Ec2Service construct when ECS cluster uses manually added EC2 capacity

The Ec2Service class in the @aws-cdk/aws-ecs module has a check in its validate() method to verify that the cluster has capacity. While this seems like a reasonable check in many situations it assume that the user is using AWS CDK constructs to create ECS EC2 instances. However, if one where to manually create an ECS instance in a cluster that doesn’t use EC2 capacity this check would prevent the user from using the Ec2Service construct and there is no way around it (except for the ugly workaround mentioned below).

An example use-case is an ECS cluster that is 100% Fargate (i.e. no EC2 capacity required). Then a need arises when you need a very specific use-case that Fargate does not support and thus require an EC2 ECS instance. Further, the EC2 instance is heavily customized in a way that is not supported from cluster.addCapacity() and requires the ECS instance to be created outside of addCapacty(). In this situation, the ECS cluster construct does not know the capacity you’ve manually added.

My recommendation is to remove this hasCapacity check in Ec2Service as I don’t think it really services a useful purpose. At best it can catch a situation where someone accidentally forgot to add capacity but it cannot actually usefully tell you if you have enough capacity to actually run your service. What if your service requires 4xCPU but you only have enough capacity for 2xCPU. There is a myriad of cases where you can have enough capacity but ECS still won’t be able to run your service (another example would be placement constraints).

Reproduction Steps

  1. Create an ECS Cluster with a single Fargate service.
  2. Create an ECS instance manually (not using addCapacity())
  3. Create an ECS service using the Ec2Service construct

Error Log

Error: Validation failed with the following errors: [path/to/myservice] Cluster for this service needs Ec2 capacity. Call addXxxCapacity() on the cluster.

Environment

  • CLI Version :
  • Framework Version:
  • OS :
  • Language :

Other


This is 🐛 Bug Report

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 16 (5 by maintainers)

Most upvoted comments

I have another use case for bypassing the capacity check of the Ec2Service construct (and by extension the ECS EC2 patterns). We’re using Spot Fleet and Application AutoScaling Policies to handle instance capacity, so we’re not using AutoScaling groups. It doesn’t appear that addCapacity() supports this method of capacity management, so I’m unable to use our cluster with the Ec2Service construct. I get the same error that @cbeattie-tl mentioned in his issue: Cluster for this service needs Ec2 capacity. Call addXxxCapacity() on the cluster.

To all who may follow, I think @RodrigoAS28’s solution here is the best possible work-around since the gate keepers don’t seem to agree that there are valid use-cases to attach EC2 instances an ECS cluster where said EC2 instances cannot be created by ASGs (for any number of reasons).

I had a similar issue. In my case the EC2 instances were actually created via CDK, but since they are not in an ASG that can be added as capacity, I would get the same error. Found a workaround though: Just created the capacity 🙂

  const cluster = new ecs.Cluster(this, 'cluster', {
    clusterName: 'cluster',
    vpc: vpc,
    containerInsights: true,
    capacity: {
      autoScalingGroupName: "dummy",
      instanceType: new ec2.InstanceType("t2.nano"),
      maxCapacity: 0,
      minCapacity: 0,
    }
  })

Not beautiful, but it worked for me. CDK passed and the service runs on the registered EC2 instances.

Alternative workaround:

const cluster = new (class extends Cluster { get hasEc2Capacity(): boolean { return true; } })(this, "Cluster", { vpc, });

Hey @cbeattie-tl

Thanks for the response. After some discussions, we deem the use case (i.e. support adding manually-provisioned instance) is not one of the intended use cases for CDK, which meant to create reproducible infrastructure. Please let us know if you have other questions.

@hencrice What you are really saying is that AWS CDK will not support the use of the L2 Ec2Service construct if the ECS cluster is created in CDK but the “capacity” is provided outside of CDK (outside of the same stack actually). I believe this unnecessarily limiting.

not one of the intended use cases for CDK, which meant to create reproducible infrastructure

You’ll get no disagreement from me. The whole notion of this use case not being about reproducible infrastructure is moot. This use-case I am presenting here has nothing to do with infrastructure being “not reproducible”. In fact, in my scenario the EC2 instance I created was actually created within CDK itself!!! Its all CDK - it’s all CloudFormation. The situation at hand arises when you need to create EC2 VMs that cannot be created as part of an auto-scaling group. In my use-case, I needed to get the IP addresses of 3 nodes that form a quorum for bootstrapping membership. I needed to get the actual IP addresses of these nodes but I could no pre-assign them because I am running more than one “environment” in the same VPC so the IP’s may conflict. I need “random” IPs but I need the private IP address to bootstrap into other CDK assets. I’m thrilled if you can indicate a way of doing this an autoscaling group but without writing a bunch of custom resources an easier approach was to just create 3 EC2 instances “manually” (in my CDK stack). Without addressing this defect (as I see it), we are unable to use the Ec2Service construct and I had to write a very hacky work-around to overcome this short-coming.

By way of recommendation, perhaps CDK could support a way of addition “capacity” to an ECS cluster that is provided by creating EC2 instances instead of auto-scaling groups. For example, what if ecs.Cluster supported a method like cluster.addEc2Instance() (in contrast to cluster.addAutoScalingGroup()).

Not attempting to belabor the point, but the whole rationale behind this hasCapacity validation is misguided. You might “have capacity” in that you created an autoscaling group but you may be over subscribed to it and your services won’t fully start. This check doesn’t really prevent you from anything other than maybe accidentally forgetting to add some capacity to your cluster. Another situation is that you “have capacity” but you have a placement constraint that prevents your services from running on those nodes. At best it should be a warning.

I thought I’d just drop this here in case anyone is researching on how to provide capacity for ECS Clusters.

This statement is inaccurate:

Regarding 2, ECS Clusters only support adding capacity via autoscaling groups. So the recommended approach for that use -case would be to create and autoscaling group containing just the EC2 instance in question, then use the group as the capacity.

As pointed out in other comments above, it’s true that CDK doesn’t allow you to inform / provision other capacity sources for ECS Clusters, but ECS does support adding capacity to clusters without using autoscaling groups, as officially documented here. And there are several use cases for it (as seen in the comments).

Hopefully this helps convincing the team to move towards this direction.

@TheRealAmazonKendra