fleet: fleet-agent cannot find secret in local cluster in Rancher single-install setup

Running rancher:master-0f691dc70f86bbda3d6563af11779300a6191584-head in single-install mode.


The following line floods the logs of the fleet-agent-7dfdfd5846-xjw96 pod:

time="2020-09-15T00:18:30Z" level=info msg="Waiting for secret fleet-clusters-system/c-09ea1d541bf704218ec6fc9ab2d60c0392543af636c1c3a90793946522685 for request-2vz49: secrets \"c-09ea1d541bf704218ec6fc9ab2d60c0392543af636c1c3a90793946522685\" not found"

gz#14319

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 30 (11 by maintainers)

Most upvoted comments

I finally managed to find a workaround for it:

  • First, create a new token under API & Keys (note down your bearer token)
  • Copy the kubeconfig that Rancher generates for you
  • Change the "token" value to the new bearer token
  • Create a new Secret in the "fleet-default" namespace of the "local" cluster (if you can't see the namespace directly, go to Namespaces in the System project and move "fleet-default" into the System project). The Secret must be named exactly as given by the error, e.g. for "secrets "c-xyz123-kubeconfig" not found" it should be called "c-xyz123-kubeconfig"; the key is "value" and the value is the modified kubeconfig (a kubectl sketch follows after the note below)

Note: you can use your generated kubeconfig directly, but as it is shared with others, it is safer to create a new token that can be revoked in case of compromise.
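
For reference, creating that Secret from the CLI could look like the following minimal sketch (the cluster ID c-xyz123 and the kubeconfig file name are placeholders):

# Create the kubeconfig secret fleet-controller expects, with the
# modified kubeconfig stored under the key "value"
kubectl -n fleet-default create secret generic c-xyz123-kubeconfig \
  --from-file=value=./modified-kubeconfig.yaml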

Now, fleet-controller should have the correct secret and start deploying fleet-agent on the cluster “c-xyz123”.

But there is definitely a bug that prevents the creation of this secret. For us, 2 out of 4 clusters were imported automatically, while the others were not.

The main difference is that the non-working clusters were created a long time ago (shortly after the Rancher 2 release).

The reason is the absence of the authn.management.cattle.io/kind=agent label on the agent tokens of old clusters. So, on the local cluster, execute kubectl label tokens agent-${agent user name} 'authn.management.cattle.io/kind=agent' and wait for rancher-operator to complete your cluster configuration (this may take around 20 minutes). The agent user name can be found in the output of kubectl get users -o custom-columns=NAME:.metadata.name,PrincipalIDs:.principalIds by looking for the principal system://${cluster name}.
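
Put together, the fix might look like this (a sketch; c-xyz123 and u-abcde are hypothetical cluster and user IDs):

# Find the agent user whose principal matches the cluster
kubectl get users -o custom-columns=NAME:.metadata.name,PrincipalIDs:.principalIds | grep 'system://c-xyz123'

# Label that user's agent token so rancher-operator can finish the
# cluster configuration (u-abcde is the user name printed above)
kubectl label tokens agent-u-abcde 'authn.management.cattle.io/kind=agent'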


@gofrolist Yeah, this has something to do with the kubeconfig that we generate for fleet to access the fleet management plane. I suspect that your custom ca-certs is a big blob of data that needs to fit into the secret. We encode the last-applied spec into an annotation, and that's probably why it is breaking the limit.
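
A quick way to gauge whether the generated secret is close to the limit (Kubernetes caps the total annotations on an object at 256 KiB) is to measure the serialized object; c-xyz123 below is a placeholder cluster ID:

# Rough size check of the generated kubeconfig secret, including the
# last-applied annotation
kubectl -n fleet-default get secret c-xyz123-kubeconfig -o yaml | wc -c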

I tested with a minified cacerts.pem file containing only the certs required for our infrastructure, and it works!

But I was initially confused, because the ca-bundle is the default that ships with the ca-certificates package, and no such size limitation is mentioned in the documentation.

# repoquery -l ca-certificates | grep ca-bundle.crt
/etc/pki/tls/certs/ca-bundle.crt
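
Minifying the bundle can be as simple as concatenating only the roots your infrastructure actually needs (a sketch; the PEM file names are hypothetical):

# Build a trimmed cacerts.pem from just the required CA certificates
cat corp-root-ca.pem corp-intermediate-ca.pem > cacerts.pem

# Sanity-check that the trimmed bundle is small
wc -c cacerts.pem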