terraform-provider-kubernetes: Unauthorized after update to v1.10 using token auth
Hi there! After upgrading to version 1.10, token authentication stopped working.
Terraform Version
0.12.11
Affected Resource(s)
- Seems to be a provider issue on version 1.10.0
Terraform Configuration Files
provider "kubernetes" {
version = "1.10"
cluster_ca_certificate = base64decode(
data.google_container_cluster.this.master_auth[0].cluster_ca_certificate,
)
host = data.google_container_cluster.this.endpoint
token = data.external.get_token.result["token"]
load_config_file = false
}
data "google_container_cluster" "this" {
name = var.cluster_name
location = var.cluster_location
project = var.project_id
}
data "external" "get_token" {
program = ["/bin/bash", "${path.module}/get_token.sh"]
query = {
cluster_name = data.google_container_cluster.this.name
cluster_location = var.cluster_location
project_id = var.project_id
credentials = var.credentials
}
}
resource "kubernetes_namespace" "this" {
metadata {
name = var.namespace
}
}
Expected Behavior
Successful authentication.
Actual Behavior
It fails with an "Unauthorized" error.
Important Factoids
Using version 1.9, there is no problem.
In our builds, we tiptoed around this issue by setting the KUBERNETES_SERVICE_HOST environment variable to an empty string before running terraform. This causes the following check in the Kubernetes REST client to conclude that it is not running in a cluster: https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/vendor/k8s.io/client-go/rest/config.go#L404-L407
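That check looks roughly like this (paraphrased from client-go's rest package, not an exact copy of the vendored file):

// Paraphrased from k8s.io/client-go/rest.InClusterConfig: when the
// well-known service environment variables are unset or empty, the
// client concludes it is not running inside a cluster.
host, port := os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT")
if len(host) == 0 || len(port) == 0 {
    return nil, ErrNotInCluster
}

In a pipeline, the workaround then boils down to something like KUBERNETES_SERVICE_HOST= terraform apply (adapt to your setup).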
For those encountering this issue when running terraform in Kubernetes pods, there's a work-around provided by @iciclespider: set KUBERNETES_SERVICE_HOST to an empty value right before invoking terraform; this breaks the in-cluster detection logic.
A fix is being worked on. I believe @iciclespider also nailed the root cause and the solution.
@alexsomesan Yes. Terraform is executed inside a different cluster.
In my case, I am running Jenkins in a private, on-premise Kubernetes cluster, and the build jobs are attempting to update an AWS EKS cluster. When the Kubernetes REST API is called, the correct host and CA certificate are used, but the token used is the one found in the /var/run/secrets/kubernetes.io/serviceaccount/token file, even though the kubernetes provider is properly configured with the AWS EKS token.
Thanks so much for this @iciclespider !!
I also just ran into this issue and have figured out what I believe is the exact cause.
When terraform is run in a Kubernetes pod, the Kubernetes configuration defaults to the in-cluster configuration, which sets all of these fields in the cfg: https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/vendor/k8s.io/client-go/rest/config.go#L422-L428
It is the BearerTokenFile field that was not accounted for in the PR "Support in-cluster configuration with service accounts". The BearerTokenFile value should be cleared when there is a token value in the kubernetes provider config, here: https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/kubernetes/provider.go#L242-L244
Instead, it should do:
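Something along these lines (a sketch only, assuming cfg is the client-go rest.Config being assembled in provider.go):

// Sketch: when an explicit token is configured on the provider,
// clear BearerTokenFile so the mounted in-cluster service account
// token cannot override it.
if v, ok := d.GetOk("token"); ok {
    cfg.BearerToken = v.(string)
    cfg.BearerTokenFile = ""
}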
From https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/vendor/k8s.io/client-go/rest/config.go#L74-L77
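Paraphrasing the field documentation in that range (not an exact copy of the vendored file), which explains why the file wins over the explicit token:

// Server requires Bearer authentication.
BearerToken string

// Path to a file containing a BearerToken.
// If set, the contents are periodically read.
// The last successfully read value takes precedence over BearerToken.
BearerTokenFile string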
We see this too. It seems like the smoking gun is this PR. If the change is still desired, should this release have been an x-release (2.0)?
There may be a way to reimplement that change using additive resource syntax, which would then only constitute a y-release like you have it now.
For now, we are telling our developers to pin to 1.9.0.
I am getting the same error, with the Kubernetes provider and EKS. I got it for:
The mentioned workaround fixed the issue for me. Happy to see that both the root cause and the fix are available. Thanks a lot @iciclespider @pdecat
@alexsomesan The problem only occurs when terraform is run on a different Kubernetes cluster than the cluster being updated by the terraform script.
When will we see a new release?
I’m running into Error: Failed to configure: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory issues in my pipeline (Bitbucket Pipelines) too when using load_config_file = false.
With 1.9 it works, so I downgraded for now, but something seems very off when telling the provider not to load the config file.
I’m curious to see if the PR merged to solve this issue will also solve mine.
good catch @pdecat and thanks for creating that PR!
Hi @houqp, after more investigation, it is not the single-line patch I expected, as that would only have corrected the case where a service account token is available at /var/run/secrets/kubernetes.io/serviceaccount/token (and the symptom is Error: Unauthorized), but not the case where no token is mounted in the pod at /var/run/secrets/kubernetes.io/serviceaccount/token (symptom: Error: Failed to configure: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory).
I’ve submitted #686, which should fix both cases. Feedback is welcome; you may even build the provider yourself to test the fix.
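For readers following along, the general shape of such a fix is something like this (a sketch only, not the actual #686 diff):

// Sketch: treat a missing in-cluster environment (not running in a
// pod, or no service account token mounted) as "no defaults" rather
// than a fatal error, so explicit provider credentials still apply.
cfg, err := rest.InClusterConfig()
if err != nil {
    cfg = &rest.Config{}
}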
I understand this is causing issues, but as there is a work-around, we should avoid rushing a fix that would only make things worse.
Ran into the exact same issue with v1.10. As @pdecat mentioned above, the fix provided by @iciclespider should address the root cause. Given that it’s just a one-line patch, @alexsomesan could you push a quick commit to have it fixed? Let us know if you think it’s better for any of us to submit a PR for it.
This is affecting all terraform runs within k8s pods that are using token auth, so it’s rather serious. At least it’s breaking all of our Jenkins pipelines.
@caquino Does setting the KUBERNETES_SERVICE_HOST environment variable to an empty string when running terraform still produce the error you are encountering?