aws-sdk-go: Get credentials from role has become slow

Version of AWS SDK for Go?

v1.25.41

Version of Go (go version)?

go version go1.11.1 linux/amd64

What issue did you see?

Getting credentials from an ec2 role takes 20 seconds.

We have seen this issue consistently on multiple different ec2 instances for the past couple of days.

To me it looks like an issue with the service, but I am not sure.

Steps to reproduce

package slow

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/credentials/ec2rolecreds"
	"github.com/aws/aws-sdk-go/aws/ec2metadata"
	"github.com/aws/aws-sdk-go/aws/session"
)

func getCredentialsFromRole() (*credentials.Credentials, error) {
	roleProvider := &ec2rolecreds.EC2RoleProvider{
		Client: ec2metadata.New(session.New()),
	}
	creds := credentials.NewCredentials(roleProvider)

	start := time.Now()
	if _, err := creds.Get(); err != nil { // this takes 20 seconds
		return nil, err
	}
	fmt.Printf("getting credentials from role took %s\n", time.Since(start))

	return creds, nil
}

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 30
  • Comments: 22 (4 by maintainers)

Most upvoted comments

Our problem is the same as described above.

aws ec2 modify-instance-metadata-options --instance-id i-34215432543254235 --http-endpoint enabled --http-put-response-hop-limit 2 is a temporary solution.

I really hope there will be a better solution in the future. Running Go with this sdk in a docker container on an EC2 instance is one of the most common use cases of this sdk that I can think of. This really should just work out of the box.

@alexd765's solution worked for me - thanks Alex!

You can implement his solution in terraform like so:

resource "aws_instance" "your_ec2_instance_name" {
  params_go_here                     = var.blablabla
  
  metadata_options {
    # So docker can access ec2 metadata
    # see https://github.com/aws/aws-sdk-go/issues/2972
    http_put_response_hop_limit = 2
  }
}

Also note that it’s not just Go that runs into this problem - even just using curl as described in https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html results in a long delay.

I’m also seeing this issue (I believe it’s the same) and I’ve been able to pinpoint v1.25.38 as the culprit release.

Before v1.25.38 I can see the following requests being made by the sdk (when running on a docker container on EC2):

DEBU[2019-12-02T15:53:48.399Z] aws-sdk-message="[DEBUG: Request ec2metadata/GetMetadata Details:\n---[ REQUEST POST-SIGN ]-----------------------------\nGET /latest/meta-data/iam/security-credentials/ HTTP/1.1\r\nHost: 169.254.169.254\r\nUser-Agent: aws-sdk-go/1.25.37 (go1.13.4; linux; amd64)\r\nAccept-Encoding: gzip\r\n\r\n\n-----------------------------------------------------]"
DEBU[2019-12-02T15:53:48.4Z] aws-sdk-message="[DEBUG: Response ec2metadata/GetMetadata Details:\n---[ RESPONSE ]--------------------------------------\nHTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 38\r\nAccept-Ranges: none\r\nContent-Type: text/plain\r\nDate: Mon, 02 Dec 2019 15:53:48 GMT\r\nLast-Modified: Mon, 02 Dec 2019 14:55:48 GMT\r\nServer: EC2ws\r\n\r\n\n-----------------------------------------------------]"

while on v1.25.38 and newer:

DEBU[2019-12-02T16:00:54.872Z] aws-sdk-message="[DEBUG: Request ec2metadata/GetToken Details:\n---[ REQUEST POST-SIGN ]-----------------------------\nPUT /latest/api/token HTTP/1.1\r\nHost: 169.254.169.254\r\nUser-Agent: aws-sdk-go/1.25.38 (go1.13.4; linux; amd64)\r\nContent-Length: 0\r\nX-Aws-Ec2-Metadata-Token-Ttl-Seconds: 21600\r\nAccept-Encoding: gzip\r\n\r\n\n-----------------------------------------------------]"
DEBU[2019-12-02T16:00:59.873Z] aws-sdk-message="[DEBUG: Send Request ec2metadata/GetToken failed, attempt 0/3, error RequestError: send request failed\ncaused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)]"
DEBU[2019-12-02T16:00:59.914Z] aws-sdk-message="[DEBUG: Request ec2metadata/GetToken Details:\n---[ REQUEST POST-SIGN ]-----------------------------\nPUT /latest/api/token HTTP/1.1\r\nHost: 169.254.169.254\r\nUser-Agent: aws-sdk-go/1.25.38 (go1.13.4; linux; amd64)\r\nContent-Length: 0\r\nX-Aws-Ec2-Metadata-Token-Ttl-Seconds: 21600\r\nAccept-Encoding: gzip\r\n\r\n\n-----------------------------------------------------]"
DEBU[2019-12-02T16:01:04.914Z] aws-sdk-message="[DEBUG: Send Request ec2metadata/GetToken failed, attempt 1/3, error RequestError: send request failed\ncaused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)]"
DEBU[2019-12-02T16:01:05.009Z] aws-sdk-message="[DEBUG: Request ec2metadata/GetToken Details:\n---[ REQUEST POST-SIGN ]-----------------------------\nPUT /latest/api/token HTTP/1.1\r\nHost: 169.254.169.254\r\nUser-Agent: aws-sdk-go/1.25.38 (go1.13.4; linux; amd64)\r\nContent-Length: 0\r\nX-Aws-Ec2-Metadata-Token-Ttl-Seconds: 21600\r\nAccept-Encoding: gzip\r\n\r\n\n-----------------------------------------------------]"
DEBU[2019-12-02T16:01:10.009Z] aws-sdk-message="[DEBUG: Send Request ec2metadata/GetToken failed, attempt 2/3, error RequestError: send request failed\ncaused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)]"
DEBU[2019-12-02T16:01:10.191Z] aws-sdk-message="[DEBUG: Request ec2metadata/GetToken Details:\n---[ REQUEST POST-SIGN ]-----------------------------\nPUT /latest/api/token HTTP/1.1\r\nHost: 169.254.169.254\r\nUser-Agent: aws-sdk-go/1.25.38 (go1.13.4; linux; amd64)\r\nContent-Length: 0\r\nX-Aws-Ec2-Metadata-Token-Ttl-Seconds: 21600\r\nAccept-Encoding: gzip\r\n\r\n\n-----------------------------------------------------]"
DEBU[2019-12-02T16:01:15.191Z] aws-sdk-message="[DEBUG: Send Request ec2metadata/GetToken failed, attempt 3/3, error RequestError: send request failed\ncaused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)]"
// Fallback to "/latest/meta-data/iam/security-credentials/"
DEBU[2019-12-02T16:01:15.191Z] aws-sdk-message="[DEBUG: Request ec2metadata/GetMetadata Details:\n---[ REQUEST POST-SIGN ]-----------------------------\nGET /latest/meta-data/iam/security-credentials/ HTTP/1.1\r\nHost: 169.254.169.254\r\nUser-Agent: aws-sdk-go/1.25.38 (go1.13.4; linux; amd64)\r\nAccept-Encoding: gzip\r\n\r\n\n-----------------------------------------------------]"
DEBU[2019-12-02T16:01:15.192Z] aws-sdk-message="[DEBUG: Response ec2metadata/GetMetadata Details:\n---[ RESPONSE ]--------------------------------------\nHTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 38\r\nAccept-Ranges: none\r\nContent-Type: text/plain\r\nDate: Mon, 02 Dec 2019 16:01:15 GMT\r\nLast-Modified: Mon, 02 Dec 2019 15:55:34 GMT\r\nServer: EC2ws\r\n\r\n\n-----------------------------------------------------]"

I run this app without AWS environment keys and without a shared profile.

@pmalekn - They introduced the new EC2 metadata feature in v1.25.38 - and a backwards compatibility issue by setting the default hop limit to 1 - which means the replies get dropped while transiting the docker bridge network.

You have to increase the hop limit yourself, or it just hits lengthy timeouts and retries before eventually falling back to the old method.

Can we remove the stale lifecycle on this one please @skmcgrail ? I’m presuming it’s still an issue and the reason people aren’t commenting on it is because of the workaround. A lot of people have run into this problem and it would be good to have explicit confirmation that it is fixed when it gets fixed.

Running on Fargate in ECS, this is not only slow but completely broken in the latest SDK (v1.33.4 at time of testing). It fails with the error “could not find credentials configuration”. Only downgrading to 1.25.37 resolves the issue, since the hop limit cannot be configured for Fargate.

Tested with this https://github.com/particleflux/s3fetch/blob/master/s3fetch.go run directly in docker entrypoint.

@vincentjorgensen This article summarizes very well how this TTL thing works: https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/ see the Protecting against open layer 3 firewalls and NATs section.
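
As a rough illustration of the TTL behaviour that article describes, here is a toy model in Go. It assumes the reply leaves the metadata service with an IP TTL equal to the configured hop limit, and that every forwarding step (such as the host passing the reply on to a container behind the Docker bridge) decrements the TTL and drops the packet when it reaches zero. The function name delivered and the one-forward model of a Docker bridge are simplifications made for this sketch.

```go
package main

import "fmt"

// delivered reports whether a reply that starts with the given hop limit
// (IP TTL) survives the given number of forwarding steps. Each step
// decrements the TTL, and a packet whose TTL reaches zero is dropped.
func delivered(hopLimit, forwards int) bool {
	ttl := hopLimit
	for i := 0; i < forwards; i++ {
		ttl--
		if ttl <= 0 {
			return false
		}
	}
	return ttl > 0
}

func main() {
	// forwards=0: a process running directly on the instance.
	// forwards=1: a container behind the Docker bridge (the host forwards once).
	for _, limit := range []int{1, 2} {
		fmt.Printf("hop limit %d: host=%t container=%t\n",
			limit, delivered(limit, 0), delivered(limit, 1))
	}
	// Output:
	// hop limit 1: host=true container=false
	// hop limit 2: host=true container=true
}
```

Under this model a hop limit of 1 is enough on the host itself but not inside a bridged container, which is consistent with the workaround of raising the limit to 2.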

We also saw similar behavior. We noticed issues with these operations:

  • S3 -> ListBuckets
  • ec2MetadataService -> Region()

Both took 20+ seconds to complete. When we went back to the previous version (for us it was v1.25.27), it was quick again. We saw the issue in both 1.25.40 and 1.25.41.

The above curl is also super-fast for us (only tested on t2.small)

Ran into the same problem using the AWS Go SDK in an Elastic Beanstalk (EB) Docker environment, but didn’t see a way to set http_put_response_hop_limit in EB without manipulating the EC2 instance directly.

Fixed (or rather sidestepped) it by setting AWS user credentials and default region as environment variables - hope it helps others in the same situation.

I have a delay of 3 seconds as of v1.29.24 when trying to create a new session using this code:

sess, err := session.NewSession(&aws.Config{
	Region: aws.String(region),
})

I am using:

Will revert to v1.25.19 until this is fixed, because manually increasing the hop limit is not an acceptable solution.

No global option I know of, though hoping AWS might see sense & implement something like https://github.com/aws/aws-sdk-go/issues/2980

In the meantime you can only stay on an old version of aws-sdk or update all EC2 instances to increase the hop limit, e.g. with:

aws ec2 modify-instance-metadata-options --instance-id i-34215432543254235 --http-endpoint enabled --http-put-response-hop-limit 2