terraform-provider-kubernetes: Unauthorized after update to v1.10 using token auth

Hi there. After upgrading to version 1.10, token authentication stopped working.

Terraform Version

0.12.11

Affected Resource(s)

  • Seems to be a provider issue in version 1.10.0

Terraform Configuration Files

provider "kubernetes" {
  version = "1.10"
  cluster_ca_certificate = base64decode(
    data.google_container_cluster.this.master_auth[0].cluster_ca_certificate,
  )
  host             = data.google_container_cluster.this.endpoint
  token            = data.external.get_token.result["token"]
  load_config_file = false
}

data "google_container_cluster" "this" {
  name     = var.cluster_name
  location = var.cluster_location
  project  = var.project_id
}

data "external" "get_token" {
  program = ["/bin/bash", "${path.module}/get_token.sh"]
  query = {
    cluster_name     = data.google_container_cluster.this.name
    cluster_location = var.cluster_location
    project_id       = var.project_id
    credentials      = var.credentials
  }
}

resource "kubernetes_namespace" "this" {
  metadata {
    name = var.namespace
  }
}
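
The get_token.sh script itself isn't included in the report. As context only, and not the reporter's script, here is a minimal sketch of what an external data source token helper has to do, written in Go and assuming the GCP service-account key arrives in the credentials query field: read the query JSON from stdin, exchange the key for an access token, and print a flat {"token": "..."} object on stdout.

package main

import (
	"context"
	"encoding/json"
	"log"
	"os"

	"golang.org/x/oauth2/google"
)

func main() {
	// Terraform's external data source passes the query block as a JSON
	// object on stdin.
	var query map[string]string
	if err := json.NewDecoder(os.Stdin).Decode(&query); err != nil {
		log.Fatal(err)
	}

	// Exchange the service-account key (the "credentials" query field) for a
	// short-lived OAuth2 access token, usable as a bearer token against GKE.
	creds, err := google.CredentialsFromJSON(context.Background(),
		[]byte(query["credentials"]),
		"https://www.googleapis.com/auth/cloud-platform")
	if err != nil {
		log.Fatal(err)
	}
	tok, err := creds.TokenSource.Token()
	if err != nil {
		log.Fatal(err)
	}

	// The external data source expects a flat map of strings on stdout.
	if err := json.NewEncoder(os.Stdout).Encode(map[string]string{"token": tok.AccessToken}); err != nil {
		log.Fatal(err)
	}
}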

Expected Behavior

Authentication succeeds.

Actual Behavior

It fails with an "Unauthorized" error.

Important Factoids

With version 1.9 there is no problem.

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 27
  • Comments: 29 (8 by maintainers)

Most upvoted comments

In our builds, we tiptoed around this issue by setting the KUBERNETES_SERVICE_HOST environment variable to an empty string before running terraform.

$ export KUBERNETES_SERVICE_HOST=
$ terraform init
$ terraform plan

This causes the following check in the Kubernetes REST client to conclude that it is not running in a cluster: https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/vendor/k8s.io/client-go/rest/config.go#L404-L407

	host, port := os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT")
	if len(host) == 0 || len(port) == 0 {
		return nil, ErrNotInCluster
	}

For those encountering this issue when running Terraform in Kubernetes pods, there’s a work-around provided by @iciclespider:

Set KUBERNETES_SERVICE_HOST to an empty value right before invoking terraform; this breaks the in-cluster detection logic.

A fix is being worked on. I believe @iciclespider also nailed the root cause and the solution.

@iciclespider Do you mean “inside a different cluster”?

@alexsomesan Yes. Terraform is executed inside a different cluster.

In my case, I am running Jenkins in a private, on-premises Kubernetes cluster, and the build jobs are attempting to update an AWS EKS cluster. When the Kubernetes REST API is called, the correct host and CA certificate are used, but the token used is the one found in the /var/run/secrets/kubernetes.io/serviceaccount/token file, even though the kubernetes provider is properly configured with the AWS EKS token.

Thanks so much for this @iciclespider !!

I also just ran into this issue and have figured out what I believe is the exact cause.

When Terraform is run in a Kubernetes pod, the Kubernetes configuration defaults to the in-cluster configuration, which sets all of these fields in the cfg: https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/vendor/k8s.io/client-go/rest/config.go#L422-L428

	return &Config{
		// TODO: switch to using cluster DNS.
		Host:            "https://" + net.JoinHostPort(host, port),
		TLSClientConfig: tlsClientConfig,
		BearerToken:     string(token),
		BearerTokenFile: tokenFile,
	}, nil

It is the BearerTokenFile field that was not accounted for in the PR "Support in-cluster configuration with service accounts". The BearerTokenFile value should be cleared when there is a token value in the kubernetes provider config, here: https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/kubernetes/provider.go#L242-L244

	if v, ok := d.GetOk("token"); ok {
		cfg.BearerToken = v.(string)
	}

Instead, it should do:

	if v, ok := d.GetOk("token"); ok {
		cfg.BearerToken = v.(string)
		// Clear the token file path so the in-cluster service account token
		// can no longer override the explicitly configured token.
		cfg.BearerTokenFile = ""
	}

From https://github.com/terraform-providers/terraform-provider-kubernetes/blob/master/vendor/k8s.io/client-go/rest/config.go#L74-L77

	// Path to a file containing a BearerToken.
	// If set, the contents are periodically read.
	// The last successfully read value takes precedence over BearerToken.
	BearerTokenFile string
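
To make that precedence concrete, here is a small standalone illustration, not client-go internals, of the behaviour described above: when BearerTokenFile points at a readable file, its contents win over an explicitly configured BearerToken, which is exactly why the token from the provider block is ignored inside a pod.

package main

import (
	"fmt"
	"io/ioutil"
)

// effectiveToken mirrors the documented precedence: the last successfully
// read value from bearerTokenFile beats the static bearerToken.
func effectiveToken(bearerToken, bearerTokenFile string) string {
	if bearerTokenFile != "" {
		if b, err := ioutil.ReadFile(bearerTokenFile); err == nil {
			return string(b)
		}
	}
	return bearerToken
}

func main() {
	// In a pod, the in-cluster defaults set BearerTokenFile to the service
	// account token path, so the EKS/GKE token handed to the provider loses.
	fmt.Println(effectiveToken(
		"token-from-the-kubernetes-provider-config",
		"/var/run/secrets/kubernetes.io/serviceaccount/token",
	))
}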

We see this too. It seems like the smoking gun is this PR. If the change is still desired, should this release have been an x-release (2.0)?

There could be a way to reimplement that change using resource syntax that is additive and would thus only constitute a y-release, like you have it now.

For now, we are telling our developers to pin to 1.9.0.

I am getting the same error too, with the Kubernetes provider and EKS. I got it for:

  • Terraform pod running in the same Kubernetes cluster
  • Terraform pod running in a different Kubernetes cluster

The mentioned workaround fixed the issue for me. Happy to see that both the root cause and a fix are available. Thanks a lot @iciclespider @pdecat

@alexsomesan The problem only occurs when terraform is run on a different Kubernetes cluster than the cluster being updated by the terraform script.

When will we see a new release?

I’m also running into Error: Failed to configure: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory in my pipeline (Bitbucket Pipelines) when using load_config_file = false

With 1.9 it works, so I downgraded for now, but something seems very off when telling the provider not to load the config file.

I’m curious to see if the PR merged to solve this issue will also solve mine.

Good catch, @pdecat, and thanks for creating that PR!

Hi @houqp, after more investigation, it is not the single-line patch I expected, as that would only have corrected the case where a service account token is available at /var/run/secrets/kubernetes.io/serviceaccount/token (where the symptom is Error: Unauthorized), but not the case where no token is mounted in the pod at /var/run/secrets/kubernetes.io/serviceaccount/token (where the symptom is Error: Failed to configure: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory).

I’ve submitted #686, which should fix both cases. Feedback is welcome, of course; you may even build the provider yourself to test the fix.
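
To illustrate the shape of such a guard, here is a rough sketch under stated assumptions, not the actual diff in #686: only fall back to the in-cluster service account credentials when no explicit token was supplied, so that neither the pod's own token (Unauthorized) nor a missing token file (no such file or directory) can affect an explicitly configured provider. The buildConfig helper and its parameters are hypothetical.

package main

import (
	"fmt"
	"log"

	"k8s.io/client-go/rest"
)

// buildConfig is a hypothetical helper, not provider code: it prefers the
// explicitly configured host/token and only falls back to the in-cluster
// service account when nothing explicit was given.
func buildConfig(host, explicitToken string) (*rest.Config, error) {
	if explicitToken != "" {
		return &rest.Config{
			Host:        host,
			BearerToken: explicitToken,
			// BearerTokenFile is deliberately left empty so the mounted (or
			// missing) service account token file is never read.
			// (CA certificate settings omitted for brevity.)
		}, nil
	}
	// Only this path touches /var/run/secrets/kubernetes.io/serviceaccount.
	return rest.InClusterConfig()
}

func main() {
	cfg, err := buildConfig("https://203.0.113.10", "token-from-aws-eks")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(cfg.Host, cfg.BearerToken)
}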

I understand this is causing issues, but since there is a work-around, we should avoid rushing a fix that could make things worse.

Ran into the exact same issue with v1.10. As @pdecat mentioned above, the fix provided by @iciclespider should address the root cause. Given that it’s just a one-line patch, @alexsomesan, could you push a quick commit to have it fixed? Let us know if you think it’s better for any of us to submit a PR for it.

This is affecting all terraform runs within k8s pods that are using token auth, so it’s rather serious. At least it’s breaking all of our Jenkins pipelines.

@caquino Does setting the KUBERNETES_SERVICE_HOST environment variable to an empty string when running terraform still produce the error you are encountering?