external-dns: Performance regression with the AWS provider in 0.5.9
Hello
We tried to update our deployment to 0.5.9 but we noticed a significant performance regression due to https://github.com/kubernetes-incubator/external-dns/pull/742
After this PR, we call Records() for each change instead of once per plan (see https://github.com/kubernetes-incubator/external-dns/blob/v0.5.9/provider/aws.go#L367), and each call retrieves all records for the zone.
A simple fix could be to store the result of the Records() call in the AWSProvider. I can PR this if you think it makes sense.
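To make the caching idea concrete, here is a minimal sketch. It does not use the real external-dns types: Record, recordLister, cachingProvider and countingLister are hypothetical names made up for illustration, and the actual AWSProvider would need its own invalidation point (for example at the start of each plan).

```go
package main

import "sync"

// Record is a simplified, hypothetical stand-in for a DNS record/endpoint.
type Record struct {
	Name   string
	Target string
}

// recordLister is a hypothetical interface for anything that can list
// every record in a zone (the expensive call we want to avoid repeating).
type recordLister interface {
	Records() ([]Record, error)
}

// cachingProvider wraps a recordLister and remembers the last successful
// result until Invalidate is called.
type cachingProvider struct {
	mu     sync.Mutex
	inner  recordLister
	cached []Record
	valid  bool
}

// Records returns the cached records if present, otherwise it calls the
// wrapped lister once and stores the result. Errors are never cached.
func (c *cachingProvider) Records() ([]Record, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.valid {
		return c.cached, nil
	}
	recs, err := c.inner.Records()
	if err != nil {
		return nil, err
	}
	c.cached, c.valid = recs, true
	return recs, nil
}

// Invalidate drops the cached records so the next Records call goes
// back to the wrapped lister. Assumed to be called once per plan.
func (c *cachingProvider) Invalidate() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cached, c.valid = nil, false
}

// countingLister is a fake lister that counts how often it is called.
type countingLister struct{ calls int }

func (l *countingLister) Records() ([]Record, error) {
	l.calls++
	return []Record{{Name: "a.example.org", Target: "1.2.3.4"}}, nil
}
```

With this shape, applying N changes within one plan would cost a single upstream listing instead of N, as long as Invalidate is called between plans so stale records are not reused.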
In addition, the call to ListResourceRecordSetsPages in the Records() function is paginated, but the AWS Go SDK does not take rate limits into account. My understanding is that each page holds 100 records, which means that on a large zone we can quickly hit the rate limit of 5 requests per second (from https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/DNSLimitations.html#limits-api-requests).
We could retry with backoff on rate-limit errors, but given how low the limit is and how simple the call pattern is, we could also simply sleep for 200ms (or 250ms to be safe, or even make it configurable) at the end of the callback.
Laurent
About this issue
- State: closed
- Created 5 years ago
- Reactions: 4
- Comments: 17 (12 by maintainers)
I think this is affecting us as well. However, we didn’t experience this until 0.5.10. Upon upgrading to 0.5.10, ALL domains in the cluster were updated. I saw this happen in both our development and staging clusters. However, we didn’t experience any issues until I deployed this in staging. Now both clusters are complaining of "getting records failed: Throttling: Rate exceeded".
I’m not sure why any of the domains needed to be updated anyways given there were no changes other than upgrading from 0.5.9 to 0.5.10. In 0.5.9 we only really saw "All records are already up to date" as nothing was changed.
Update: downgrading to 0.5.9 resolves the issue I am experiencing. Downgrading returned us back to the "All records are already up to date" message instead of it trying to update every record.