aws-sdk-ruby: aws-sdk-core > 3.78.0 slows down the fetching of credentials

Issue description

Ever since moving from aws-sdk-core version 3.78.0 to 3.79.0, the way credentials are fetched has changed from:

curl -H "User-Agent: aws-sdk-ruby3/3.78.0" "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
curl -H "User-Agent: aws-sdk-ruby3/3.78.0" "http://169.254.169.254/latest/meta-data/iam/security-credentials/<response-from-first-get>"

to:

curl -v -H "User-Agent: aws-sdk-ruby3/3.79.0" -H "x-aws-ec2-metadata-token-ttl-seconds: 21600" -X PUT http://169.254.169.254/latest/api/token

I guess that has its reasons, but this PUT request returns a 400 at my end, which in turn causes it to retry 5 times (?). After that unsuccessful chain of events it falls back to the old way of fetching credentials. This is of course painfully slow. Is there some special magic required to make it not return a 400?
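For readers unfamiliar with the two flows, here is a minimal Ruby sketch of the request shapes involved, using only the standard library. The host, paths, and TTL header are taken from the curl commands above; the helper names (build_token_request, build_metadata_request, fetch_role_name) are hypothetical and are not the SDK's own code. The x-aws-ec2-metadata-token header is the one IMDSv2 expects on follow-up GETs.

```ruby
require 'net/http'

# Address and paths as they appear in the curl commands above.
IMDS_HOST = '169.254.169.254'
TOKEN_PATH = '/latest/api/token'
CREDENTIALS_PATH = '/latest/meta-data/iam/security-credentials/'

# Step 1 of the newer flow: a PUT that asks for a session token.
# This only builds the request; nothing is sent.
def build_token_request(ttl_seconds = 21_600)
  request = Net::HTTP::Put.new(TOKEN_PATH)
  request['x-aws-ec2-metadata-token-ttl-seconds'] = ttl_seconds.to_s
  request
end

# A metadata GET. With a token it is an IMDSv2 request,
# with token = nil it is the older, token-less request.
def build_metadata_request(path, token = nil)
  request = Net::HTTP::Get.new(path)
  request['x-aws-ec2-metadata-token'] = token if token
  request
end

# Hypothetical helper: try the token request once and fall back to the
# token-less GET when it is rejected (for example with a 400).
# `http` is anything that responds to #request, such as a Net::HTTP session.
def fetch_role_name(http)
  token_response = http.request(build_token_request)
  token = token_response.is_a?(Net::HTTPSuccess) ? token_response.body : nil
  http.request(build_metadata_request(CREDENTIALS_PATH, token)).body
end
```

On a machine that can reach the metadata address, something like Net::HTTP.start(IMDS_HOST, 80, open_timeout: 1, read_timeout: 1) { |http| puts fetch_role_name(http) } should exercise it. Unlike the SDK behaviour described above, this sketch does not retry the PUT.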

Gem name (‘aws-sdk’, ‘aws-sdk-resources’ or service gems like ‘aws-sdk-s3’) and its version

aws-sdk-core 3.79.0 and up

Version of Ruby, OS environment

Ruby 2.6.0 Kubernetes/Docker

Code snippets / steps to reproduce

See curl examples

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 2
  • Comments: 27 (12 by maintainers)

Most upvoted comments

If you want to pin aws-sdk-core to 3.78.0 to avoid IMDSv2, then you’ll also need to pin aws-sdk-s3 to 1.57.0 to avoid #s3_use_arn_region which was introduced in aws-sdk-core after 3.78.0
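As a sketch, the pin described above could look like this in a Gemfile. Both version numbers come from this comment; the source line is an assumption copied from the Gemfiles shown elsewhere in this thread.

```ruby
source 'https://rubygems.org'

# Last aws-sdk-core release before the IMDSv2 token request was introduced.
gem 'aws-sdk-core', '3.78.0'
# Pinned alongside it so the S3 gem does not rely on s3_use_arn_region,
# which aws-sdk-core 3.78.0 does not provide.
gem 'aws-sdk-s3', '1.57.0'
```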

I’m experiencing the same delays when using aws-sdk-core with aws-sdk-s3 to download files from S3. When I set ENV['AWS_EC2_METADATA_DISABLED'] = "true", I get Aws::Sigv4::Errors::MissingCredentialsError when initializing an S3 client.

We’re having to downgrade to 3.78 to use our API without significant latency, but that breaks downloading from S3 in this way: Undefined method 's3_use_arn_region' for #<Aws::SharedConfig:x0...>

3.85.1 did not help.

I recently upgraded to 3.85.1. According to our performance monitoring tool, requests to AWS are still slow. Above all, it still retries 5 times:

[Screenshot 2019-12-12 at 12 18 23]

So I released version 1.1.28 of our app, which only includes the aws-sdk-core 3.85.1 update, and the request time goes from 39ms to 7.5s.

For the 39ms request, this is the number of requests to AWS (GET http://169.254.169.254):

[Screenshot 2019-12-12 at 12 19 29]

For the 7.5s request this is what happens:

[Screenshot 2019-12-12 at 12 20 35]

Again 6 PUT http://169.254.169.254 and 2 GET http://169.254.169.254

The main thing I can conclude from this is that this Ruby gem still retries 5 times and then proceeds to fall back to the old way. I’m not sure exactly how or why, but that’s the way it is.
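To put rough numbers on that conclusion, here is a small Ruby sketch of a retry loop with exponential backoff. It is not the gem's code. The assumption, not confirmed anywhere in this thread, is that the wait before retry n (counting from 0) is 1.2**n seconds; the method names and the injectable sleeper are hypothetical and exist only so the loop can be exercised without actually waiting.

```ruby
# Total seconds spent sleeping if every attempt fails.
# Assumption: the wait before retry n (0-based) is base**n seconds.
def total_backoff_seconds(retries, base = 1.2)
  (0...retries).sum { |n| base**n }
end

# Generic retry loop: one initial attempt plus up to max_retries retries.
# The sleeper is injectable so the loop can run without real delays.
def with_retries(max_retries, base: 1.2, sleeper: ->(seconds) { sleep(seconds) })
  failures = 0
  begin
    yield
  rescue StandardError
    # Give up and re-raise once the retry budget is used up.
    raise if failures >= max_retries

    sleeper.call(base**failures)
    failures += 1
    retry
  end
end

puts total_backoff_seconds(5).round(4)
```

Under that assumption, 5 retries add up to 1 + 1.2 + 1.44 + 1.728 + 2.0736 = 7.4416 seconds of waiting, and the loop makes 6 attempts in total. That would be consistent with the 6 PUT requests and the roughly 7.5s reported above, but it is an inference, not something verified against the gem.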

It’s pretty trivial to reproduce, assuming you have a k8s cluster with kube2iam, kiam, or a similar IMDS proxy enabled. Run a new container and start a shell which has enough Ruby tooling to run bundler and install aws-sdk-core. I’m using ./kubectl run bryan --rm -i --tty --image=fingershock/ruby:2.5.7-builder --restart=Never sh but there are many ways to do this.

IMDSv1 sdk 3.78.0 works fast … just a few milliseconds

$> cat Gemfile
source "https://rubygems.org"
gem 'json'
gem 'aws-sdk-core', '3.78.0'

$> bundle install
... output elided ...

$> bundle exec ruby -e 'require "benchmark"; require "aws-sdk-core"; puts Benchmark.measure { pp Aws::InstanceProfileCredentials.new }'
#<Aws::InstanceProfileCredentials:0x000055c9f525a480
 @backoff=
  #<Proc:0x000055c9f525a3e0@/usr/lib/ruby/gems/2.5.0/gems/aws-sdk-core-3.78.0/lib/aws-sdk-core/instance_profile_credentials.rb:65 (lambda)>,
 @credentials=#<Aws::Credentials access_key_id="ASIA6D4HKVYTA5JSVVGB">,
 @expiration=2019-12-06 23:21:28 UTC,
 @http_debug_output=nil,
 @http_open_timeout=5,
 @http_read_timeout=5,
 @ip_address="169.254.169.254",
 @mutex=#<Thread::Mutex:0x000055c9f525a340>,
 @port=80,
 @retries=5>
  0.005531   0.000000   0.005531 (  0.034940)

Newer sdk which forces use of IMDSv2

$> cat Gemfile
source "https://rubygems.org"
gem 'json'
gem 'aws-sdk-core', '3.79.0'

$> bundle install
... output elided ...

$> bundle exec ruby -e 'require "benchmark"; require "aws-sdk-core"; puts Benchmark.measure { pp Aws::InstanceProfileCredentials.new }'
#<Aws::InstanceProfileCredentials:0x0000555ad9fa35c8
 @backoff=
  #<Proc:0x0000555ad9fa3488@/usr/lib/ruby/gems/2.5.0/gems/aws-sdk-core-3.79.0/lib/aws-sdk-core/instance_profile_credentials.rb:83 (lambda)>,
 @credentials=#<Aws::Credentials access_key_id="ASIA6D4HKVYTA5JSVVGB">,
 @expiration=2019-12-06 23:21:28 UTC,
 @http_debug_output=nil,
 @http_open_timeout=5,
 @http_read_timeout=5,
 @ip_address="169.254.169.254",
 @mutex=#<Thread::Mutex:0x0000555ad9fa33e8>,
 @port=80,
 @retries=5,
 @token=nil,
 @token_ttl=21600>
  0.008241   0.000839   0.009080 (  7.457265)

Thanks for your answer. I’m not sure how and if that is configured:

Are you by chance setting instance_profile_credentials_retries in your configuration? You could also try configuring instance_profile_credentials_timeout. Does my example best verify your case?

I could give it a try and see if it fixes the speed problems. I’m okay if it does one extra request I guess 💭 . I’ll get back to you.


Does my example best verify your case?

Yes, this does verify my case. However, in my case retries is left at its default of 5 (as it reads here)
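For anyone who wants to try the two options named in the question above, a minimal sketch follows. The option names are the ones quoted in this thread; the values 0 and 1 are arbitrary examples, and whether they actually remove the delay behind an IMDS proxy is not confirmed here. The LoadError guard and the IMDS_SETTINGS constant are illustrative only.

```ruby
# Example values only: no retries and a 1 second timeout.
IMDS_SETTINGS = {
  instance_profile_credentials_retries: 0,
  instance_profile_credentials_timeout: 1
}.freeze

begin
  require 'aws-sdk-core'
  # Aws.config holds global defaults that are picked up by clients created afterwards.
  Aws.config.update(IMDS_SETTINGS)
rescue LoadError
  warn 'aws-sdk-core is not installed; settings were not applied'
end
```

Setting these once at boot, before any client is created, keeps the change in one place and makes it easy to compare timings with and without it.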