external-dns: AWS API InvalidClientTokenId: The security token included in the request is invalid
I am trying to bring External-DNS up with the stable Helm chart. I have tried both creating and not creating the RBAC service account, and in either case the pod does not appear to be able to communicate with the Kubernetes API.
I entered the pod and the token does look fine. In fact, if I extract the token from the running pod and add it to my kubeconfig, I am able to execute kubectl commands and see pods and services.
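For the record, here is roughly the check I did: copy the token out of /var/run/secrets/kubernetes.io/serviceaccount/token in the pod and drop it into a kubeconfig user entry. A minimal sketch (the cluster address matches the one in the log below; the names are illustrative, and TLS verification is skipped only for this quick manual test):

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: test
    cluster:
      server: https://172.20.0.1:443
      insecure-skip-tls-verify: true   # quick manual test only
users:
  - name: external-dns-sa
    user:
      token: <token copied from the pod>
contexts:
  - name: test
    context:
      cluster: test
      user: external-dns-sa
current-context: test
```

With that context selected, kubectl can list pods and services just fine, so the service account token itself does not seem to be the problem.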
time="2018-05-22T20:55:09Z" level=info msg="config: {Master: KubeConfig: Sources:[service ingress] Namespace: AnnotationFilter: FQDNTemplate: CombineFQDNAndAnnotation:false Compatibility: PublishInternal:false Provider:aws GoogleProject: DomainFilter:[] ZoneIDFilter:[] AWSZoneType:public AWSAssumeRole: AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: CloudflareProxied:false InfobloxGridHost: InfobloxWapiPort:443 InfobloxWapiUsername:admin InfobloxWapiPassword: InfobloxWapiVersion:2.3.1 InfobloxSSLVerify:true DynCustomerName: DynUsername: DynPassword: DynMinTTLSeconds:0 InMemoryZones:[] PDNSServer:http://localhost:8081 PDNSAPIKey: Policy:upsert-only Registry:txt TXTOwnerID:default TXTPrefix: Interval:1m0s Once:false DryRun:false LogFormat:text MetricsAddress::7979 LogLevel:debug}"
time="2018-05-22T20:55:09Z" level=info msg="Connected to cluster at https://172.20.0.1:443"
time="2018-05-22T20:55:15Z" level=error msg="InvalidClientTokenId: The security token included in the request is invalid.\n\tstatus code: 403, request id: 6d439029-5e02-11e8-b984-c1a86fb37b37"
time="2018-05-22T20:56:25Z" level=error msg="InvalidClientTokenId: The security token included in the request is invalid.\n\tstatus code: 403, request id: 9749c5b3-5e02-11e8-9a05-e706f97ff7c1"
time="2018-05-22T20:57:26Z" level=error msg="InvalidClientTokenId: The security token included in the request is invalid.\n\tstatus code: 403, request id: bb533cb9-5e02-11e8-a805-33dea8fc8b7d"
time="2018-05-22T20:58:26Z" level=error msg="InvalidClientTokenId: The security token included in the request is invalid.\n\tstatus code: 403, request id: df5adf50-5e02-11e8-ae9c-9dffa2727d2c"
time="2018-05-22T20:59:31Z" level=error msg="InvalidClientTokenId: The security token included in the request is invalid.\n\tstatus code: 403, request id: 064ea3f5-5e03-11e8-9a05-e706f97ff7c1"
When looking at it from the API server side, I am seeing certificate errors. Yet the certificate in the pod is the same certificate that is used elsewhere and in other pods.
I0522 21:33:24.241320 1 logs.go:49] http: TLS handshake error from 10.20.0.109:50626: remote error: tls: bad certificate
I0522 21:33:24.269081 1 logs.go:49] http: TLS handshake error from 10.20.0.109:50627: remote error: tls: bad certificate
I0522 21:33:26.414586 1 logs.go:49] http: TLS handshake error from 10.20.0.109:50628: remote error: tls: bad certificate
I0522 21:33:26.441808 1 logs.go:49] http: TLS handshake error from 10.20.0.109:50629: remote error: tls: bad certificate
I0522 21:33:26.469308 1 logs.go:49] http: TLS handshake error from 10.20.0.109:50630: remote error: tls: bad certificate
I0522 21:33:26.497100 1 logs.go:49] http: TLS handshake error from 10.20.0.109:50631: remote error: tls: bad certificate
Hi @hickey, I had faced the same error as you, but later I figured out that I had made a mistake when putting in the AWS credentials; that is the reason for the InvalidClientTokenId error. The values in the Helm chart are in reverse order: secretKey appears first (and I had put the ACCESS KEY ID there instead). You might check whether you are hitting the same issue.
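To make the ordering pitfall concrete, here is a minimal sketch of the relevant part of values.yaml (the accessKey/secretKey field names are the ones discussed in this thread; the surrounding aws block and the placeholder credentials, which are AWS's documented examples, are assumptions on my part):

```yaml
aws:
  # secretKey is listed before accessKey, which makes it easy to paste the
  # two credentials into the wrong fields.
  secretKey: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"   # AWS secret access key goes here
  accessKey: "AKIAIOSFODNN7EXAMPLE"                       # AWS access key ID goes here
  region: "us-east-1"
```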
Just for fun, I started up External-DNS in another cluster (v1.9.6) using the inmemory provider and got results much more consistent with what I would expect.
I then re-deployed External-DNS with the AWS provider (even though it is an on-prem cluster) and got similar invalid-token results. Given this latest data, I retract my suggestion that this is a problem with communicating with the Kubernetes API server.
I also expanded my tcpdump filter to look at all packets going to port 443 (feasible since the cluster in question has virtually no other SSL traffic running across it) and found packets going to IPs similar to, but not the same as, the IP registered in DNS for route53.amazonaws.com. That would also tend to confirm that the Route53 API is being contacted, though how the IP is being determined is beyond me at the moment.
I will try to spend a bit of time tomorrow digging into the AWS provider code to see what can be done to provide better logging (and debugging). Maybe that will make it more apparent why the failures are being seen.
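For reference, a rough sketch of the container args for that inmemory run; the flag names match the config dump at the top of this issue, and the rest of the Deployment is whatever the chart renders:

```yaml
args:
  - --source=service
  - --source=ingress
  - --provider=inmemory   # swapped to --provider=aws for the Route53 runs
  - --registry=txt
  - --txt-owner-id=default
  - --log-level=debug
```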
Thanks @chrisduong. You are right, and I think this is quite confusing and easy to get wrong. It would be better to add comments for those two fields, or simply to use aws_access_key_id and aws_secret_access_key instead of accessKey and secretKey.

I have put some initial additional logging into the AWS provider code and cannot find how or where the settings from the chart's values.yaml file, specifically the AWS access key and secret key, get applied to the provider. From what I can tell, these values are never used by the code.
I have not been able to confirm this yet, but if I start external-dns without specifying the keys in the values.yaml file, everything seems to work as expected. I have an environment that I need to rebuild and will run an experiment: specify the keys again and see whether external-dns once more fails to create Route53 entries.
If the experiment is successful and I can show that specifying the keys does prevent external-dns from operating (which I do not understand, if they really are not being used by the code), then I would recommend that those entries be removed from the Helm chart until code is written to handle them properly.
Hopefully more results soon to confirm or deny my initial findings.
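In the meantime, one possible workaround sketch, under the assumption that the provider falls back to the AWS SDK's default credential chain (which reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment): supply the credentials to the pod directly through a Secret and leave the chart's accessKey/secretKey values unset. The manifest names and image tag below are illustrative, and the credential values are AWS's documented example placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: external-dns-aws
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: AKIAIOSFODNN7EXAMPLE
  AWS_SECRET_ACCESS_KEY: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
---
# Excerpt of an external-dns Deployment: the Secret is exposed as environment
# variables, which the SDK's default credential chain picks up automatically.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
spec:
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns
      containers:
        - name: external-dns
          image: registry.opensource.zalan.do/teapot/external-dns:v0.5.2  # tag is illustrative
          args:
            - --source=service
            - --source=ingress
            - --provider=aws
            - --registry=txt
            - --txt-owner-id=default
          envFrom:
            - secretRef:
                name: external-dns-aws
```

Note that a running pod does not pick up changes to the Secret; it has to be recreated for the new environment variables to take effect.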