aws-sdk-ruby: Lightsail credentials fetching from IMDS is slow

Describe the bug AWS STS takes seconds to respond to credential requests. This is only my assumption; see below for the observed behavior.

Gem name aws-sdk-s3 (1.111.2)

Version of Ruby, OS environment ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x86_64-linux]

To Reproduce (observed behavior) I am running a Docker container on Lightsail that uses Lightsail object storage. I granted access to the storage via "Resource access" by attaching the Lightsail instance to the bucket. I reuse a single S3 client via a singleton:

require 'singleton'
require 'aws-sdk-s3'

class S3ClientCache
  include ::Singleton

  def s3_client
    # Memoize a single client so cached credentials are shared across requests
    @_s3_client ||= ::Aws::S3::Client.new(
      region: ENV['S3_REGION'],
    )
  end
end

S3ClientCache.instance.s3_client

In my environment, the very first S3 request takes 2 seconds to presign an S3 URL, and the same is true for every subsequent credential refresh. Below I capture the moment when X-Amz-Credential changes:

I, [2022-03-08T22:45:12.250483 #80]  INFO -- : [4f8ad353-b102-42c1-ac04-c366db624dc6] Processing by Tenant::APIController#entry as HTML
I, [2022-03-08T22:45:12.250549 #80]  INFO -- : [4f8ad353-b102-42c1-ac04-c366db624dc6] Parameters: {"tenant"=>"MamyPoko"}
I, [2022-03-08T22:45:12.258026 #80]  INFO -- : [4f8ad353-b102-42c1-ac04-c366db624dc6] Redirected to https://bucket-example.s3.ap-southeast-1.amazonaws.com/path/to/file/cfb936de1799956e43497e24143112d280ee0a37542860a2.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAQ2SOE3UGR6ERY4BJ%2F20220308%2Fap-southeast-1%2Fs3%2Faws4_request&X-Amz-Date=20220308T224512Z&X-Amz-Expires=180&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEFkaDmFwLXNvdXRoZWFzdC0xIkgwRgIhAMfBG5GLgaN%2BgW11Kmw%2FXCoJ1FExxRw4SnVweJgdDnHSAiEAo%2Bl9MeG6kBirNBxakZdbk%2BzZX2uDIvv8fYC1CuGD5qkqjQQIwv%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FARACGgwwNTcwNzIxNjYxNTciDPvzW5DkJutlREbflCrhAx%2FO%2F3I7KGHpNBYBqpsq%2B8a4L%2B7kBimTanjBCJdBy%2Bvjo3%2B73D32qhV6A71wCVz1tCQJBVS7oSLcle%2FFMXTA7lfBU%2FOhaxU%2FXJiVukJjTh4KpTZcwvgichUVxALagWFQAfDk2ks2%2FUzpYCD1T76%2Fqfow4PxrtlBNazH17%2BJoJ6kmlfgUxraCnhfX%2F%2Bg3%2BorcOFS5fl5Urj%2F94tPWBINwcOiS5K239NazDMOGsDopfFsE3hl5hm8084kUGQSI1PFSeX2pXqiuGsNiN%2FqEPuKz2z7gAIr7VFVZqzJ9tzuHQkWsXLsd6Ug8D7QO%2B3dLEROo3Kb1iRq%2BnP7VGW4YIN7pV2E6hehm0%2BoCdY8xS8DJ20%2BZf0EnuMPU%2BVcv%2F12%2Bc8BJNHpUDxMVLICC9cG2r%2Ff%2BvuRcB50xKlQ87LGEbbqTSgxh8qELIqk9xm%2FqqO%2BOk1uoAYV2tUQ%2B0JX2%2BLeSKz5zc8j6xUqXHY7OP%2B%2BLeGKQhuLicLVFqxWLSUuGCB7sZ8TbYW0fMdLJqCtgp1Kf9HkIO8N4lwH3QU%2FnMOzQRIpT42BvZ4FA6TF%2FpbCrlFIXFkFONMuvxwUAmCjymTuSjzfJ0lHipUtawVbF0y5X4Vi0dzfTZpCqodHto45xbuKsWWYT7F4wr4yekQY6pAFVWYuE7ZIXKCT8c8n3TwLtV%2BB%2BvrCq3n%2FlpJicdeE%2FdcUxftOqnlEWPFpwP%2FZBEPCi%2FLduFlt8g1mHdnfe6tDt1jPlrlsI0VkQHK3hLFjJKnkdxe1MV8RsgF85x39gXOsTZ9%2B%2FfxIuLiNaCxAKQG40W9wfs1eevF3CbZsqoWeE8S%2B7AtNVBbgNl7b54voduEyZ1zcngI%2FmlbR29gcW%2FBgA%2Fc4cOA%3D%3D&X-Amz-SignedHeaders=host&X-Amz-Signature=bddf6357fd9d119cec222e5d336d79c8e0ec3b26e5c41a9da9ea4636e1d2be11
I, [2022-03-08T22:45:12.258229 #80]  INFO -- : [4f8ad353-b102-42c1-ac04-c366db624dc6] Completed 302 Found in 8ms (MongoDB: 0.0ms | Allocations: 2463)

I, [2022-03-08T22:46:10.726935 #73]  INFO -- : [b1809e63-b1b6-4598-86ec-e03dfa84278d] Processing by Tenant::APIController#entry as HTML
I, [2022-03-08T22:46:10.727178 #73]  INFO -- : [b1809e63-b1b6-4598-86ec-e03dfa84278d] Parameters: {"tenant"=>"MamyPoko"}
I, [2022-03-08T22:46:12.750322 #73]  INFO -- : [b1809e63-b1b6-4598-86ec-e03dfa84278d] Redirected to https://bucket-example.s3.ap-southeast-1.amazonaws.com/path/to/file/eecba882f9292553e49cca4c1842600b8a020f5dfb78d4bf.jpg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIAQ2SOE3UGYS2UJBEZ%2F20220308%2Fap-southeast-1%2Fs3%2Faws4_request&X-Amz-Date=20220308T224612Z&X-Amz-Expires=180&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEF8aDmFwLXNvdXRoZWFzdC0xIkcwRQIhALO%2Bs19ZZLZ3N7N4p9uBRc2FStbeyhvUigHmb1lKWjoDAiB1iIGhcYdF0IvSu5KAHMRfO6t37tQ4lpTqIBU1MWEY6CqNBAjI%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F8BEAIaDDA1NzA3MjE2NjE1NyIMDC5KhdK%2F3kA6H6TrKuEDiLln50%2BdoQIby73nGAAa%2Fw4PQkPs6DJzZccOPkxxaPY7OhWTnwb4iNHm6HwP6u%2BBHHTg7qNqdatGsUGlU1Rm3p6NmICU49lShcXpB3rzxBHVjK56mTtnVhPjOHEjhkuTeucdtmLLtVSaaiw9CvuyNDnrFlyWOSM1TFskLT0hZ%2Fpqy548iHpFgOieCDgzpccrGoR3UjscBab7%2FM2sYJzHjQFGH3H%2BQSCjocA5rEGuPDnKucS1cJ7KZ2Yd6keGqEWWvkZjHg1s4zIdoJIuxGqXvmJafJlmMaR%2FOWFVI5y8Zz%2FOamwa%2FWDaEvPBT6yZixCt0gheXmfM%2BDe89xNjwdtuZPNmMfsQ07il6wcaRVB6YDU9aF%2F76l0V2wPupVGHpHhA%2Fnwir8xv%2BHB46jx8Y7aYx1E3KMfktndtCkZx%2FD3TPp%2Fu1YmJpZ0b4rK1qwNuMKwtfp7BOA8Bow53Ik05hdwwqQa42dDc92S8sQJl7%2F7haAD4wG6JGK%2BVOJVIeid%2F%2Bve5tvmX2ZBur2KHi2JjvKLAnTWFu8oToYIo9zoMP789L0ZBnTXgIBB8Pwc%2F9ZcpzvHEMdX74HF86N7ppoChaygIH30MxwZ9wNJfUvyXTt0PehXBU3LuwbRyWwjTkiX0p%2FKiwDDDs5%2BRBjqlARNwiIesLwp2EzwzCa1AhUl5fGG3dHF7bZ%2Fh8B8P6PmRhzzXaxVAX5ZtEX5IObQxzSMZ2GhnjEQuv0g2oJIHrUO6YbHumilPUZ%2BbO8HKruC5ANaRm1BX0P8jc%2BepKCvt6o5gyNUoguM5VuM0Nc%2BJeH%2FSBvj2%2FIMXPRGxnOqmOg%2FDwDcxBMdz5820WzCSobdKO7GDKDQtIj0rrkwKOmrjQHd5Qyps5A%3D%3D&X-Amz-SignedHeaders=host&X-Amz-Signature=dcb5668939995c17c148491dc879acd38a4c16d70dfbca0d2806a1b5e890e667
I, [2022-03-08T22:46:12.750548 #73]  INFO -- : [b1809e63-b1b6-4598-86ec-e03dfa84278d] Completed 302 Found in 2023ms (MongoDB: 0.0ms | Allocations: 4046)

The regular request takes only 8ms to presign and redirect; the one that triggers an STS credential refresh takes 2023ms. Note that both use the same S3 client instance to presign. Is this a performance bug? Two seconds seems too long to me, even if it only happens occasionally.

Expected behavior Faster response from aws-sdk-s3.

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 26 (24 by maintainers)

Most upvoted comments

Thanks for the offer to reconfigure Lightsail; I really do appreciate it. Still, I prefer regular instances over special treatment, so the people who maintain these servers after me will have an easier time figuring things out.

Some of our services run in a regular VPC; we still need IPsec peering and the like. However, we are moving in the opposite direction: we think EC2 is overly complicated for our workload and have started moving to Lightsail. In some respects, Lightsail actually does a better job than our self-managed VPC setup. Weighing the costs and benefits, I don't think migrating back to EC2 is the way to go for us.

Option 3 seems to be the way to go for most people. If you arrived at this thread from Google, that is probably what you are looking for. I also want to add that you probably need to configure http_read_timeout rather than http_open_timeout: the packet with the low hop limit is the one carrying the secret, not the connection setup or HTTP headers, so the connection opens fine and it is the read that times out.

Otherwise, TTL mangling is relatively safe too. The reason is that the condition -s 169.254.169.254 -d 172.26.0.0/16 ensures that at least one additional hop is required to form a loop, and that hop would decrement the TTL again, so --ttl-inc 1 is unlikely to flood the network. Double-check your setup if you need more hops.
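As a sketch, the mangle rule might look like the following. The Docker bridge subnet 172.26.0.0/16 is taken from the condition above; adjust it to your own routing, and note this assumes the standard iptables TTL target.

```sh
# Increment the TTL of IMDS responses headed into the container subnet,
# so the IMDSv2 token survives the extra Docker bridge hop.
# Subnet is an example; adjust for your routing. Requires root.
iptables -t mangle -A PREROUTING \
  -s 169.254.169.254 -d 172.26.0.0/16 \
  -j TTL --ttl-inc 1
```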

I understand every party here, including the Lightsail team's decision against granting more EC2 permissions. Kudos to the Ruby SDK team for sorting things out.

I’ve talked to some members of the Lightsail team and also poked around a bit. For background, per my understanding: Lightsail EC2 instances (like all EC2 instances) have a linked IAM role (AmazonLightsailInstanceRole/instance-id), which you can view by calling aws sts get-caller-identity from the instance. The credentials returned by IMDS (the instance metadata service) can be fetched with curl http://169.254.169.254/latest/meta-data/iam/security-credentials/AmazonLightsailInstanceRole, and the SDK tries to fetch and use these credentials. The issue is that the SDK cannot fetch them over IMDSv2 from inside a container unless you raise the hop limit. I tried to convince the Lightsail team to grant this role blanket permission for EC2's modify-metadata operation, but they declined. So that leaves us with a few options.
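For reference, the IMDSv2 flow the SDK attempts can be reproduced by hand (the token TTL value below is illustrative). It is the PUT response in the first step that gets dropped when the hop limit is too low:

```sh
# Fetch an IMDSv2 session token; this PUT response carries the low hop
# limit, so from a container it may never arrive.
TOKEN=$(curl -sS -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Use the token to read the role credentials (the IMDSv1 fallback is the
# same GET without the token header).
curl -sS -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/iam/security-credentials/AmazonLightsailInstanceRole"
```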

Option 1) Lightsail will modify the metadata on your behalf if you can provide a customer ID, region, instance ID, and the hop limit value (which depends on your container routing). The downside of this option is that it is manual, and new instances will not inherit your hop limit. If you want to do this, I can give you my email or some way to securely provide me that information.

Option 2, preferred IMO) Migrate your containers to EC2 instances without Lightsail. You can export snapshots, but you will have to wire up storage and the other features you expect. The Lightsail engineer said your use case may be more complex than what Lightsail aims to offer. If you are running your own containers, you may as well run them on vanilla EC2 instances, where you have full control. You can then use the AWS CLI to modify your instance metadata.
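On vanilla EC2 that is a one-line CLI call (the instance ID here is a placeholder, and a hop limit of 2 is an assumption that typically covers one Docker bridge hop):

```sh
# Raise the IMDSv2 PUT-response hop limit so containers behind one bridge
# hop can still receive the token. Instance ID is a placeholder.
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-put-response-hop-limit 2 \
  --http-endpoint enabled
```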

Option 3) No architecture changes: fail fast and fall back to V1 credential fetching. Since you do expect your credentials to come from IMDS on the EC2 host, you can initialize an Aws::InstanceProfileCredentials with 0 retries and a smaller http_open_timeout, and pass it to your S3 client: Aws::S3::Client.new(credentials: Aws::InstanceProfileCredentials.new(..)). This is "quick and dirty", though, and not necessarily a solution to the underlying problem.
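A minimal sketch of option 3 follows; the timeout values are illustrative and should be tuned to your network, and per the earlier comment http_read_timeout is likely the one that matters when the token response is dropped:

```ruby
require 'aws-sdk-s3'

# Fail fast on IMDSv2: no retries and short timeouts, so a dropped token
# response costs roughly a second instead of the default retry cycle,
# and the SDK can fall back to V1 credential fetching quickly.
credentials = Aws::InstanceProfileCredentials.new(
  retries: 0,            # do not retry the metadata endpoint
  http_open_timeout: 1,  # seconds to open the connection
  http_read_timeout: 1   # seconds to wait for the (dropped) token response
)

s3 = Aws::S3::Client.new(
  region: ENV['S3_REGION'],
  credentials: credentials
)
```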