k9s: K9s can't connect to cluster in logs but curl to cluster endpoint works




Describe the bug

When opening up k9s to connect to a cluster, it fails with "Boom!! K9s can't connect to cluster."

Logs show that a GET request to the cluster version endpoint timed out:

11:16PM INF đŸ¶ K9s starting up...
11:16PM ERR K9s can't connect to cluster error="Get \"https://xxxxxxxxxxxxx.gr1.us-west-2.eks.amazonaws.com/version?timeout=5s\": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
11:16PM PNC K9s can't connect to cluster
Log file created at: 2020/11/13 23:16:49
Running on machine: MyMachine
Binary: Built with gc go1.15.3 for darwin/amd64
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
11:16PM ERR Boom! K9s can't connect to cluster
11:16PM ERR goroutine 1 [running]:
runtime/debug.Stack(0x3f27660, 0x2d5f303, 0x0)
	runtime/debug/stack.go:24 +0x9f
github.com/derailed/k9s/cmd.run.func1()
	github.com/derailed/k9s/cmd/root.go:75 +0x125
panic(0x2a01b00, 0xc000376460)
	runtime/panic.go:969 +0x1b9
github.com/rs/zerolog.(*Logger).Panic.func1(0xc000b82800, 0x1c)
	github.com/rs/zerolog@v1.18.0/log.go:338 +0x4f
github.com/rs/zerolog.(*Event).msg(0xc00082ee40, 0xc000b82800, 0x1c)
	github.com/rs/zerolog@v1.18.0/event.go:146 +0x202
github.com/rs/zerolog.(*Event).Msgf(0xc00082ee40, 0x2d835b0, 0x1c, 0x0, 0x0, 0x0)
	github.com/rs/zerolog@v1.18.0/event.go:126 +0x87
github.com/derailed/k9s/cmd.loadConfiguration(0xc000abfbc8)
	github.com/derailed/k9s/cmd/root.go:130 +0x733
github.com/derailed/k9s/cmd.run(0x3f07620, 0xc0004694c0, 0x0, 0x2)
	github.com/derailed/k9s/cmd/root.go:83 +0x8d
github.com/spf13/cobra.(*Command).execute(0x3f07620, 0xc00004c0d0, 0x2, 0x2, 0x3f07620, 0xc00004c0d0)
	github.com/spf13/cobra@v1.0.0/command.go:846 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0x3f07620, 0x0, 0x0, 0x0)
	github.com/spf13/cobra@v1.0.0/command.go:950 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.0.0/command.go:887
github.com/derailed/k9s/cmd.Execute()
	github.com/derailed/k9s/cmd/root.go:66 +0x2d
main.main()
	github.com/derailed/k9s/main.go:28 +0x1f8

To confirm that my machine can reach that endpoint:

$ curl -k https://xxxxxxxxxxxxx.gr1.us-west-2.eks.amazonaws.com/version?timeout=5s
{
  "major": "1",
  "minor": "16+",
  "gitVersion": "v1.16.13-eks-2ba888",
  "gitCommit": "2ba888155c7f8093a1bc06e3336333fbdb27b3da",
  "gitTreeState": "clean",
  "buildDate": "2020-07-17T18:48:53Z",
  "goVersion": "go1.13.9",
  "compiler": "gc",
  "platform": "linux/amd64"
}
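Note that curl -k only proves network reachability and skips client authentication entirely. A closer approximation of what k9s does is to hit the same endpoint through kubectl, so the request goes through the same kubeconfig credentials (a quick sketch using standard kubectl flags):

# Hit the same /version endpoint using the kubeconfig credentials:
$ kubectl get --raw /version --request-timeout=5s

# Verbose output shows which credentials/exec plugin get used and where the request stalls:
$ kubectl get nodes -v=6 --request-timeout=5s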

To Reproduce

Steps to reproduce the behavior:

  1. Try to connect to your EKS cluster with k9s

Expected behavior

Connect to the cluster and k9s opens.

Screenshots

(screenshot attached: Screen Shot 2020-11-13 at 11 27 39 PM)

Versions (please complete the following information):

  • OS: MacOS
  • K9s:
Version:    v0.23.10
Commit:     a952806ebaa316e2c7d0949ad605fb4c944f2cd0
Date:       2020-11-10T15:21:22Z
  • K8s: 1.16 on EKS


Most upvoted comments

I’m seeing this same “issue” against an EKS cluster. kubectl works fine, but k9s is timing out. If I set --request-timeout=20s then it works, albeit slowly.
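For reference, the same timeout can be passed straight to k9s on the command line (the context name below is a placeholder):

# Start k9s with a longer API request timeout against a specific context:
$ k9s --request-timeout=20s --context my-eks-context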

For those who are experiencing this problem with Google GKE clusters after updating to use the gke-gcloud-auth-plugin, I found the solution. It looks like the kubeconfig information has to change for k9s to be able to connect to the contexts properly. For some reason kubectl worked just fine, but k9s couldn’t talk to the cluster.

This is the change I made to get my context working again; edit (on Mac) ~/.kube/config:

users:
- name: [GKE_CLUSTER_NAME] # Needs to match cluster name from "clusters" section in yaml
  # THIS WORKS
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args: null
      command: gke-gcloud-auth-plugin
      env: null
      installHint: Install gke-gcloud-auth-plugin for use with kubectl by following
        https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
      interactiveMode: IfAvailable
      provideClusterInfo: true

  # OLD FORMAT PRIOR TO gke-gcloud-auth-plugin - NO LONGER WORKS IN K9S
  # user:
  #   auth-provider:
  #     config:
  #       access-token: !QTQGFAKFADFK@#Q$L#@KJ
  #       cmd-args: config config-helper --format=json
  #       cmd-path: /Users/username/google-cloud-sdk/bin/gcloud
  #       expiry: "2023-02-03T14:45:31Z"
  #       expiry-key: '{.credential.token_expiry}'
  #       token-key: '{.credential.access_token}'
  #     name: gcp

As you can see above, the new gke-gcloud-auth-plugin approach is much more generic than it used to be. If you have multiple contexts you should be able to just copy and paste the user: block above and put it under each one of the - name: cluster configs.

This got my k9s back into a good state.
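If you would rather not hand-edit the kubeconfig, installing the plugin and regenerating the credentials should produce an equivalent exec block (a sketch; the cluster, zone, and project names are placeholders):

# Install the auth plugin and regenerate the kubeconfig entry for the cluster:
$ gcloud components install gke-gcloud-auth-plugin
$ gcloud container clusters get-credentials my-cluster --zone europe-west1-b --project my-project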

Hey guys, I had a similar problem and accidentally found a cause and a solution for my case.

So, I am using oidc-login to connect to the cluster, and I also have multiple kubeconfig files for different contexts. The issue was that I had the same user name but different credentials (client id/secret) for each context. This can cause unexpected effects, especially when merging kubeconfigs. Once I started using a unique user name, k9s worked perfectly, also in combination with kubectx and kubens.

Here is the user name field in the kubeconfig I am talking about:

apiVersion: v1
clusters: ...
contexts: ...
users:
- name: my-user
  user: ...
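A quick way to check for the duplicate user names described above is to list them from the merged kubeconfig view (a sketch, assuming kubectl and standard coreutils):

# Print every user name in the merged kubeconfig and show duplicates, if any:
$ kubectl config view -o jsonpath='{range .users[*]}{.name}{"\n"}{end}' | sort | uniq -d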

My problem was the same as @kurtextrem’s.

This is the k9s log:

5:46PM INF đŸ¶ K9s starting up...
5:46PM WRN Unable to dial discovery API error="exec plugin: invalid apiVersion \"client.authentication.k8s.io/v1alpha1\""
5:46PM ERR Fail to locate metrics-server error="exec plugin: invalid apiVersion \"client.authentication.k8s.io/v1alpha1\""
5:46PM ERR failed to connect to cluster error="exec plugin: invalid apiVersion \"client.authentication.k8s.io/v1alpha1\""

It seems that the new version no longer allows apiVersion v1alpha1. So I changed the apiVersion in my kubeconfig to v1beta1, but that alone didn’t fix it. After I upgraded my aws-cli to the latest version, it worked. So this problem on EKS seems to be a combination of k9s deprecating v1alpha1 and an old aws-cli that doesn’t support v1beta1.
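To check both halves of that combination, you can inspect which apiVersion your setup actually produces and then refresh the EKS entry after upgrading (a sketch; the cluster name and region are placeholders):

# Which apiVersion does the exec plugin in the kubeconfig declare?
$ grep -B1 -A1 'client.authentication.k8s.io' ~/.kube/config

# Which apiVersion does the installed aws-cli emit for tokens?
$ aws --version
$ aws eks get-token --cluster-name my-cluster | grep apiVersion

# Regenerate the kubeconfig entry once aws-cli is up to date:
$ aws eks update-kubeconfig --name my-cluster --region us-west-2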

I just had a similar problem where KUBECONFIG was set to multiple kubeconfig paths: kubectl worked, but k9s only worked for one of the clusters.

It turns out I had a collision between user names in my configs, and the error in the k9s logs didn’t make it obvious that it was an auth failure.

Fixing that collision fixed my issue.
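To see where colliding names come from before the files get merged, you can list the user entries per kubeconfig file (a sketch; the second path is a placeholder):

# List user names per file instead of from the merged view:
$ for f in ~/.kube/config ~/.kube/other-config; do echo "== $f"; kubectl config view --kubeconfig "$f" -o jsonpath='{range .users[*]}{.name}{"\n"}{end}'; done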

In my case, an outdated aws cli caused the error (the kubeconfig did not update in that case). So if anyone comes to this issue from a Google search, this might be another thing to take a look at.

For those who might end up here for a similar reason:

I saw similar behavior when running the installation from Snap on Ubuntu 22.04 against GKE (Google), with their gke-gcloud-auth-plugin.

users:
- name: gke_my-cluster_europe-west1_example
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin

In this scenario, both kubectl and kubectx work as expected.

When installing from LinuxBrew (brew install derailed/k9s/k9s), k9s worked just fine (with the same .kube/config).

Running:

Version:    v0.25.18
Commit:     6085039f83cd5e8528c898cc1538f5b3287ce117
Date:       2021-12-28T16:53:21Z
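One plausible explanation (an assumption, not confirmed in this thread) is that the Snap's confinement keeps k9s from executing gke-gcloud-auth-plugin outside its sandbox, which the Homebrew build avoids. To confirm which build is actually being picked up (a sketch):

# Confirm which k9s binary is on PATH and its version:
$ which k9s
$ k9s version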

@dgoradia Hum… Can you actually connect to this cluster with kubectl? i.e. what does kubectl get no yield? curl -k turns off certs and just validates the endpoint. Both kubectl and k9s use certs to connect to the API server. So, wild guess here, but your creds are not quite set up to connect to AWS?
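A minimal check along those lines (using standard kubectl flags):

# Does the same kubeconfig work through kubectl with a tight timeout?
$ kubectl get no --request-timeout=5s
# If this also hangs, the credentials/exec plugin in the kubeconfig are the
# problem rather than k9s itself.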