yet-another-cloudwatch-exporter: Throttling: Rate exceeded

Hello,

faced with a panic in one of the regions:

```
panic: Throttling: Rate exceeded
	status code: 400, request id: 2eac2d08-4014-11e9-8a89-991f83fb4554

goroutine 8238 [running]:
main.cloudwatchInterface.get(0xa3d040, 0xc420276018, 0xc4215abe80, 0xc422c00c60, 0xc42009f150, 0xe)
	/go/src/yace/aws_cloudwatch.go:112 +0x17d
main.scrapeDiscoveryJob.func1.1(0xc42025d680, 0xc420d00410, 0xc4259bba00, 0x2, 0x2, 0xc422c3d6d0, 0x1, 0x1, 0xa3d040, 0xc420276018, …)
	/go/src/yace/abstract.go:140 +0x1dd
created by main.scrapeDiscoveryJob.func1
	/go/src/yace/abstract.go:123 +0x15d
```

It probably makes sense to handle this via a configuration option, something like “max concurrent requests”.
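As a minimal sketch of the “max concurrent requests” idea (this is not yace's actual code; `runLimited` and `observedPeak` are illustrative names), a buffered channel works as a counting semaphore in Go, capping how many CloudWatch calls can be in flight at once:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs n tasks, allowing at most `limit` goroutines at once.
// A buffered channel acts as a counting semaphore: sends block once the
// buffer is full, so no more than `limit` tasks ever run concurrently.
func runLimited(n, limit int, fn func(i int)) {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		sem <- struct{}{} // blocks once `limit` tasks are in flight
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			fn(i)
		}(i)
	}
	wg.Wait()
}

// observedPeak reports the highest concurrency seen while running n tasks,
// so we can verify the semaphore actually enforces the limit.
func observedPeak(n, limit int) int64 {
	var cur, peak int64
	runLimited(n, limit, func(int) {
		c := atomic.AddInt64(&cur, 1)
		for { // CAS loop: record a new peak if we exceeded the old one
			p := atomic.LoadInt64(&peak)
			if c <= p || atomic.CompareAndSwapInt64(&peak, p, c) {
				break
			}
		}
		atomic.AddInt64(&cur, -1)
	})
	return peak
}

func main() {
	fmt.Println("peak concurrency:", observedPeak(100, 5))
}
```

In an exporter, `fn` would wrap the per-resource CloudWatch request, and `limit` would come from the proposed config option.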

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 2
  • Comments: 27 (21 by maintainers)

Most upvoted comments

Thanks a lot, @thomaspeitz. Unfortunately, I no longer have such a big environment. We previously had hundreds of Jenkins agents, and fetching metrics for them triggered this issue. We have since moved Jenkins and its agents into Kubernetes, so I don’t have a good test field for the experiment. But I will definitely install the latest version across all my environments.

@tsupertramp I am catching up on your updates; I’m already several versions behind. Waiting on our internal “Open Source Contribution” policy 😃 and then a PR.

Will need to push this to the weekend. Already spent two hours on ALB research…

@tsupertramp I used an extra parameter called maxConcurrency to limit the number of goroutines. Here are the notes from my implementation: https://gitlab.com/wutianchen/ds-codebase/blob/master/Golang/patterns.md. If you need me, we can also talk briefly on Friday 😃 Best

@tsupertramp @arnitolog I encountered the same problem and solved it by limiting the number of goroutines.

Thanks for all the data!

Okay, I will think about how we can solve this. We need to reduce the number of API calls to fix it; perhaps by grouping more API calls into one.
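For the grouping idea, a hedged sketch: CloudWatch’s GetMetricData API accepts many metric queries per call (up to 500 at the time of writing), so batching per-metric requests can shrink the call count dramatically. The `chunk` helper below is illustrative, not yace’s implementation; each batch would become one GetMetricData request instead of one call per metric:

```go
package main

import "fmt"

// chunk splits queries into batches of at most `size` elements. Each
// batch could then be sent as a single GetMetricData call rather than
// issuing one API request per metric.
func chunk(queries []string, size int) [][]string {
	var batches [][]string
	for size < len(queries) {
		batches = append(batches, queries[:size])
		queries = queries[size:]
	}
	return append(batches, queries)
}

func main() {
	metrics := make([]string, 1200)      // e.g. 1200 per-metric queries
	batches := chunk(metrics, 500)       // 500-query cap per call (assumed)
	fmt.Println(len(batches), "API calls instead of", len(metrics))
}
```

With 1200 metrics and a 500-query cap, this yields 3 calls instead of 1200, which is exactly the kind of reduction needed to stay under the throttling limits.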

Will take some time but challenge accepted to solve this.

And great to have someone using it in such a big setup! 😃

@tsupertramp a 1m window will show nothing because my scrape interval is 1m. But here are the values for 2- and 5-minute windows:

```
delta(yace_cloudwatch_requests_total{aws_cluster="us-east-1-cicd",job="yace_cloudwatch"}[2m])  => 4890
delta(yace_cloudwatch_requests_total{aws_cluster="us-east-1-cicd",job="yace_cloudwatch"}[5m])  => 12247
```