aws-cdk: Throttling: Rate exceeded
When deploying multiple CDK stacks simultaneously, a throttling error occurs when trying to check the status of the stack. The CloudFormation runs just fine, but CDK returns an error because the rate limit was exceeded.
We’re using TypeScript.
The issue #1647 says that this error was resolved, but looking at the fix (#2053), it only increased the default number of retries, just making it less likely to happen.
Is there at least a way to override the base retryOptions in a CDK project? If there is, I can just override it on my side so the error does not occur.
Even if there is, I think that this should be solved in the base project. I don’t think CDK should ever fail because of rate limiting while trying to check the stack status in CloudFormation, as it does not affect the end-result (the deployment of the stack).
Use Case
One of our applications has one CDK stack per customer (27 in total). When there’s an important fix that needs to be sent to every customer, we run the cdk deploy command for each stack, simultaneously, via a Jenkins pipeline.
Error Log
00:03:13 ❌ MyStackName failed: Throttling: Rate exceeded
00:03:13 Rate exceeded
About this issue
- State: open
- Created 4 years ago
- Reactions: 74
- Comments: 52 (13 by maintainers)
Commits related to this issue
- fix(toolkit): CLI tool fails on CloudFormation Throttling The CDK (particularly, `cdk deploy`) might crash after getting throttled by CloudFormation, after the default configured 6 retries has been r... — committed to aws/aws-cdk by RomainMuller 4 years ago
- fix(toolkit): CLI tool fails on CloudFormation Throttling (#8711) The CDK (particularly, `cdk deploy`) might crash after getting throttled by CloudFormation, after the default configured 6 retries h... — committed to aws/aws-cdk by RomainMuller 4 years ago
We hit this issue regularly and it is getting really annoying 🤨
Last build 2 of 10 stacks failed with the “Throttling: Rate exceeded” error …a retrigger of the CICD pipeline will most likely succeed!
Just to add another voice to this: it is affecting my team as well.
In particular, we have several CDK apps that create over 100 stacks each. If more than one of these apps is deploying at the same time, they fail with the rate exceeded message and just exit, failing our CI build with no apparent retries.
Can we please get some attention on this issue? My team is suffering from this.
Ran into something very similar as @calebpalmer, where I was creating ~20 lambdas with a custom log retention (`logRetention` prop of `Function` construct). Similarly, adding a few explicit dependencies remediated the issue.
In my case, CDK created a `Custom::LogRetention` resource for each function, and it failed with `Rate exceeded` errors in CloudFormation on the 16th resource and onwards.
Coincidentally there’s a non-adjustable Lambda quota limit (`Rate of control plane API requests`) of 15 API operations (per second?). Perhaps it’s related.
As this is clearly persisting, I am reopening the issue for further discussion
We love the CDK, but throttling from CloudFormation when using the CDK is still an issue for our team. We often have CDK builds/deploys running for different stages/stacks in our AWS Account at once and we run into this quite often.
We’re currently on CDK version `1.54.0`
@Silverwolf90 that does not sound ideal at all and we should be providing a better experience natively.
bumping this up to a `p1` as it’s affecting a lot of our users.
We are also seeing this issue, hoping we could have a way to configure the retries.
Is the reason for not having a configurable retry (via environment variables or a config file) that CDK is using the aws-sdk v2?
Hopefully, this could also be fixed in CDK v2
We also have the same issue in our team; #8711 did not fix the issue for us. I would like to see a simple option to control the poll interval for the cdk cli command in order to avoid exceeding the rate.
I don’t know if this is related, but I’ve started seeing similar throttling errors in a single stack when trying to create an IAM role within a stack:
picking this task up
Also seeing this issue. Like @clifflaschet, my problem seems related to introducing a log retention policy to an existing stack with lambdas.
What about avoiding polling altogether while scaling to a large number of stacks in parallel - have a CDK service endpoint which CDK clients would subscribe to. Once a stack is finished deploying, the client will get (event-driven) notification and continue to the next stack.
I originally thought my rate limiting issue was from this, but it actually ended up being rate limits reached between CloudFormation itself and the services it was interacting with. For example, I had a ton of independent Lambda functions being created at the same time, which caused rate limit errors between CloudFormation and Lambda. Adding some explicit dependencies between the lambdas reduced the number of them being created in parallel and eliminated the rate limiting issues. There were some other resources I had to do the same with, like API Gateway models and methods. I’m mentioning this in case someone else in here might have come to the wrong conclusion like I did.
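For anyone wanting to try the explicit-dependency workaround described above, here is a minimal TypeScript sketch. It assumes the resources expose `node.addDependency(...)` the way CDK constructs do; the `Dependable` type and the `chainInLanes` helper are hypothetical names made up for illustration and are not part of the CDK. Each resource is made to depend on the one `lanes` positions before it, so the chains run side by side and at most `lanes` of the listed resources should be in progress at once.

```typescript
// Structural stand-in for a CDK construct (assumption: real constructs
// satisfy this through their `node.addDependency(...)` method).
interface Dependable {
  node: { addDependency(...deps: Dependable[]): void };
}

// Hypothetical helper: split `resources` into `lanes` sequential chains.
// Resource i waits for resource i - lanes, so each chain creates one
// resource at a time and at most `lanes` resources are created in parallel.
function chainInLanes<T extends Dependable>(resources: T[], lanes: number): void {
  if (!Number.isInteger(lanes) || lanes < 1) {
    throw new RangeError("lanes must be a positive integer");
  }
  for (let i = lanes; i < resources.length; i++) {
    resources[i].node.addDependency(resources[i - lanes]);
  }
}
```

In a stack, this would be called once with the array of functions (or API Gateway methods, and so on) after they have been created, for example `chainInLanes(myFunctions, 5)`. The right number of lanes depends on the quota of the service being hit, so it is something to tune rather than a known value.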
@RomainMuller @SomayaB @shivlaks Can you please reopen this issue? It clearly is not fixed yet. I’m using an updated version but it still happens the same way it did when I first opened this issue. You can also see a lot of people seeing the same error after this was closed.
There was an IAM issue overnight but it appears to be resolved or is in the process of resolving now
Hi @danielfariati, thanks for reporting this. We will update this issue when there is movement.
I’m getting this error in CDK v2
Given that the error is retryable, maybe retry it before blowing up?
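To illustrate what "retry it before blowing up" could look like, here is a rough TypeScript sketch of retrying a throttled call with exponential backoff and jitter. This is not the CDK's actual implementation. `retryOnThrottle` and `isThrottlingError` are hypothetical names, and the check for a `code` or `name` of `Throttling` / `ThrottlingException` is an assumption based on the error log in this issue.

```typescript
// Hypothetical helper, not part of the CDK or the AWS SDK.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: throttling errors carry a `code` or `name` of "Throttling"
// (as in the log above) or "ThrottlingException".
function isThrottlingError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { code?: unknown; name?: unknown };
  return [e.code, e.name].some(
    (v) => v === "Throttling" || v === "ThrottlingException",
  );
}

async function retryOnThrottle<T>(
  call: () => Promise<T>,
  maxRetries = 8,
  baseDelayMs = 500,
  sleep: Sleep = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-throttling errors or once the retry budget is spent.
      if (!isThrottlingError(err) || attempt >= maxRetries) throw err;
      // Full jitter: wait a random time between 0 and baseDelayMs * 2^attempt.
      await sleep(Math.random() * baseDelayMs * 2 ** attempt);
    }
  }
}
```

The jitter matters when many deployments poll at once: without it, all of the clients that were throttled together would retry together and get throttled again.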
Please bump in priority. This issue is blocking us as well.
Same here 😦 I’m just trying to describe the cloudformation stacks and loop through them but keep encountering this error often.
Throttling: Rate exceeded
at Request.extractError
at Request.callListeners
at Request.emit
at Request.emit
at Request.transition
at AcceptorStateMachine.runTo
I’m also seeing this. It’s driving me nuts. I have to comment stuff out and do multiple deploys to get my stack up which also means I can’t do a full deploy from a CI/CD system.
@richardhboyd
I’m still exploring the options, but some of the things we are considering include:
The downside of bailing on the stack monitoring is subsequent deploys will not be initiated by the CDK. i.e. if stack B had a dependency that required stack A to be deployed. We can’t start that deployment until A has completed. That would not be possible if we stopped monitoring.
This would affect wildcard deployments and any scenario where we can’t reason about the status of the stack without polling.
Handling rate limiting more gracefully is a precursor to attempting parallel deployments. Let me know if you had any additional thoughts, and I’ll work that in as I’m trying to prototype a proof of concept and test out the tradeoffs.
This is my understanding, so please don’t 🔥 me 😉
This issue has to do with CDK CLI being throttled because it hits the CloudFormation API too often and there is no way to override the defaults. This happens more often if you deploy multiple stacks in parallel.
The other rate limiting problem that folks are seeing is related to Custom resources in the stack itself. Primarily, with log retention. CloudWatch Logs has really low API limits (like 5 reqs/sec). You can fix this by using the `logRetentionRetryOptions` (docs). Lambda function construct has this option as well.
Currently, we are running 11 deployments in parallel. We hit the Rate exceeded error for the first time, and the number of parallel deployments has been increasing day by day.
Is there any alternative solution right now? We don’t want our pipeline to fail because of one or two deployments.
Thanks.
Just tried again updating 35 stacks in parallel and the issue still persists. Our end goal is to be able to deploy all stacks at once (~80, and growing… it was 27 when I first opened this ticket). In our case, the error is not on CloudFormation, like some users are reporting, but on CDK itself (probably calling CloudFormation to get the stack status). The stack on CloudFormation updates successfully, but CDK breaks (Rate Limit Exception), so the person running it loses track of what is happening. We are using CDK v2.
The error looks like this:
Any news on this? The issue is tagged as p1, but it seems that nobody is looking into it. The issue has been affecting us for more than 2 years now, and no workaround is possible (that I know of). 😢
What we’re currently doing is limiting deployments to 15 stacks in parallel, but this is becoming a huge problem, as our number of stacks is growing…
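For reference, capping the number of parallel deployments can be sketched in TypeScript roughly like this. `runWithLimit` and `Outcome` are hypothetical names, and each task is assumed to be a function that starts one deployment (for example by spawning `cdk deploy` for a single stack) and returns a promise. A failed task is recorded rather than aborting the others, so one throttled stack does not hide the status of the rest.

```typescript
// Result of one task: either its value or the error it failed with.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Hypothetical helper: run `tasks` with at most `limit` of them in flight.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<Outcome<T>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: Outcome<T>[] = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each worker keeps pulling the next unstarted task until none are left.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await tasks[index]() };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const workers = Array.from(
    { length: Math.min(limit, tasks.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```

This only spreads the load out over time. It does not remove the underlying throttling, so total deployment time still grows with the number of stacks.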
This has been plaguing our deploys for ages! Thank you for following up about #8257!
On Mon, Aug 7, 2023 at 12:24 AM Shane Argo @.***> wrote:
I guess one way to solve this for good would be to create a utility API Gateway WebSocket (the ones that API Gateway supports natively) as part of the Bootstrap stack and subscribe to events on that one through the CDK CLI. That would help CDK drop the polling approach and would help the CLI get immediate feedback when a stack is deployed or fails to be deployed, plus a bonus: no throttling (since there are no longer any direct AWS API calls involved).
Side note: A third party library cdk-watch already does something similar under the hood. Talking about the CLI + Websocket API integration for realtime updates.