aws-cdk: API Gateway: Too Many Requests on API creation
Hello,
When creating an API that contains a lot of endpoints, we hit the API Gateway limit on resource creation and get this error:
Too Many Requests (Service: ApiGateway, Status Code: 429, ...
The limits: https://docs.amazonaws.cn/en_us/batch/latest/userguide/service_limits.html
Reproduction Steps
Create a REST API with a lot of resources.
What did you expect to happen?
I expected CDK to account for this and add a ‘sleep’ between calls if necessary.
Right now I’m just commenting out some of the nested stacks that contain my resources and uncommenting them in batches.
I think this is linked to: https://github.com/aws-cloudformation/cloudformation-coverage-roadmap/issues/589
What actually happened?
Got the 429 error
Environment
- CDK CLI Version: 1.109.0 (build c647e38)
- Node.js Version: v12.18.1
This is 🐛 Bug Report
About this issue
- State: open
- Created 3 years ago
- Reactions: 3
- Comments: 20 (7 by maintainers)
Hello @peterwoodworth @nija-at I’ve been struggling with this issue for quite some time and I’d like to re-open it.
I’ve created this repository, which consistently reproduces the issue. Please feel free to check it out.
The code creates a REST API with a large number of endpoints and methods, all created in multiple nested stacks. Deployment of those stacks always fails with a Too Many Requests error, and the rollback fails for the same reason.
I could temporarily avoid this problem by decreasing the number of resources in each nested stack and making the stacks depend on each other so they don’t get deployed in parallel, but that is much slower and less efficient.
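The workaround described above (smaller groups, deployed one after another) could be sketched roughly as follows. This is only an illustration, not code from the reproduction repository: `chainInGroups` and `fakeStack` are hypothetical names, and the local `Dependable` interface merely mirrors the `node.addDependency()` method that CDK constructs expose, so the sketch runs without the CDK installed.

```typescript
// Minimal structural type: anything exposing node.addDependency().
// CDK constructs have this method; it is redeclared here (an assumption for
// illustration) so the sketch compiles on its own.
interface Dependable {
  node: { addDependency(...deps: unknown[]): void };
}

// Hypothetical helper: split `items` into groups of `groupSize` and make every
// member of a group depend on all members of the previous group. Members of
// one group may still deploy in parallel; groups deploy one after another.
function chainInGroups<T extends Dependable>(items: T[], groupSize: number): T[][] {
  if (!Number.isInteger(groupSize) || groupSize < 1) {
    throw new Error("groupSize must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += groupSize) {
    groups.push(items.slice(i, i + groupSize));
  }
  let previous: T[] = [];
  for (const group of groups) {
    if (previous.length > 0) {
      for (const item of group) {
        item.node.addDependency(...previous);
      }
    }
    previous = group;
  }
  return groups;
}

// Stand-in for a nested stack (hypothetical) that records what it depends on.
function fakeStack(name: string) {
  const dependsOn: string[] = [];
  return {
    name,
    dependsOn,
    node: {
      addDependency(...deps: unknown[]): void {
        for (const d of deps) {
          dependsOn.push((d as { name: string }).name);
        }
      },
    },
  };
}

const stacks = ["A", "B", "C", "D", "E"].map((n) => fakeStack(n));
chainInGroups(stacks, 2);
for (const s of stacks) {
  console.log(`${s.name} depends on [${s.dependsOn.join(", ")}]`);
}
```

In a real app the items would be the nested stacks themselves rather than `fakeStack` objects. A larger `groupSize` deploys faster but is more likely to hit the 429 again, so the value has to be found by trial.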
@nija-at I’m pretty sure what @tuanardouin is reporting is not that CDK invokes the API Gateway endpoint directly but that when the CloudFormation template is deployed, the service-to-service communication between CloudFormation and API Gateway gets rate-limited and the CloudFormation deploy fails (and reflects that rate-limit error). It’s unclear to the user that this is the source of the error they see. If I’m correct about the source of this error (and it’s not possible for the user to get any more information), this is actually an upstream bug in CloudFormation (since CDK can’t control the behavior of CloudFormation), and I imagine it would be SUPER-HELPFUL if the CDK team could bubble this bug up to the CloudFormation team. It’s not the first time they’ve heard about this long-standing bug https://forums.aws.amazon.com/thread.jspa?threadID=100414, and they don’t appear to have taken any steps to solve it.
As someone already said, the problem is the rate limit on API Gateway’s own management APIs. The CreateResource operation is limited to 5 requests per second per account.
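To get a feel for what that limit means for a large API, here is a rough back-of-the-envelope sketch. It takes the 5 requests per second figure from the comment above as given and ignores bursts, retries, and every other API Gateway call CloudFormation makes, so treat the result as an order of magnitude only. `minSecondsToCreate` is a hypothetical name.

```typescript
// Assumption: CreateResource is throttled at 5 requests per second per
// account, as quoted in the thread. Adjust if your account differs.
const CREATE_RESOURCE_LIMIT_PER_SECOND = 5;

// Rough estimate of how long creating `resourceCount` resources takes when
// requests are spread evenly at exactly the throttling limit.
function minSecondsToCreate(
  resourceCount: number,
  limitPerSecond: number = CREATE_RESOURCE_LIMIT_PER_SECOND,
): number {
  if (resourceCount < 0 || limitPerSecond <= 0) {
    throw new Error("resourceCount must be >= 0 and limitPerSecond > 0");
  }
  return Math.ceil(resourceCount / limitPerSecond);
}

console.log(minSecondsToCreate(600)); // 120
```

Under these assumptions, several nested stacks creating resources at the same time share the same per-account budget, which is consistent with parallel deployments failing while sequential ones succeed.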
We’re facing the same problem with the Serverless Framework, and nobody has solved it properly. AWS Premium Support suggests introducing DependsOn, but that is certainly not a definitive solution. The 2nd link below shows that AWS published a private resource type, Community::CloudFormation::Delay, which also doesn’t feel like a definitive solution on its own. We thought of using WaitCondition, but it amounts to about the same thing. I believe AWS should be able to handle the throttling between its service calls transparently. The “user” is not making these service-to-service calls. IMHO, since the “user” is providing a valid template that could be fully deployed if we ignored the rate limit for APIs that the “user” is not calling itself, it should work flawlessly. However, it may become a hard optimization problem to solve.
This issue is related to:
We’re not the CloudFormation team, so we cannot answer these questions. There’s no action CDK can take here in our construct library. While this bug persists, it will be up to customers to configure dependencies between the resources they create so that they deploy sequentially rather than in parallel. See this comment for an example.
I’ve created a ticket internally (P88246032) to make sure the right team sees this. I’ll provide updates when they become available.
@oanhhuynhpositive oh, my bad. I mixed both repos (cdk and serverless) as we’ve been dealing with the same issue.
@nija-at Oh, we definitely have example stacks where this happens consistently. We are a pretty small team, so there’s usually max 1 stack being deployed at any time, and we don’t have any custom resources on the stacks where this happens – as you suggest, it is all about the number of API Gateway resources. But like I said, when we use CloudFormation to work with these resources, we have no ability to adapt to API Gateway rate limits for resources that CloudFormation is managing.
This would be really helpful. I would be happy to work with the team (I think it would be CloudFormation) to help isolate this issue. Please let me know what you need from me. You can reach me by email at my GH username at gmail.