kubernetes: [Failing Test] unable to create the kubeadm:cluster-admins ClusterRoleBinding (periodic-kubernetes-e2e-kind-kms)

Which jobs are failing?

master-informing:

  • periodic-kubernetes-e2e-kind-kms

Which tests are failing?

kubetest2.Up

Since when has it been failing?

https://github.com/kubernetes/kubernetes/pull/121305

Testgrid link

https://testgrid.k8s.io/sig-release-master-informing#periodic-kubernetes-e2e-kind-kms

Reason for failure (if possible)

I1029 07:10:01.667112     297 kubeconfig.go:682] creating the ClusterRoleBinding for the kubeadm:cluster-admins Group by using super-admin.conf
I1029 07:10:01.705456     297 round_trippers.go:553] POST https://kms-control-plane:6443/apis/rbac.authorization.k8s.io/v1/clusterrolebindings?timeout=10s 500 Internal Server Error in 38 milliseconds
Internal error occurred: got unexpected nil transformer
unable to create the kubeadm:cluster-admins ClusterRoleBinding
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.EnsureAdminClusterRoleBindingImpl
	cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:691
k8s.io/kubernetes/cmd/kubeadm/app/phases/kubeconfig.EnsureAdminClusterRoleBinding
	cmd/kubeadm/app/phases/kubeconfig/kubeconfig.go:595

Anything else we need to know?

No response

Relevant SIG(s)

/sig cluster-lifecycle

About this issue

  • State: closed
  • Created 8 months ago
  • Comments: 18 (17 by maintainers)

Most upvoted comments

/assign

but this test job is maintained by sig auth: /sig auth

looks like the API server is consistently failing an unpolled ClusterRoleBinding (CRB) create API call from the kubeadm client. this did not happen on the kind presubmit jobs here: https://github.com/kubernetes/kubernetes/pull/121305

or kubeadm periodic jobs using the kinder tool: https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm

or my local tests.

Internal error occurred: got unexpected nil transformer

so what is different here and why are we getting transformer errors? is this highlighting a bug in core k8s? cc @enj @liggitt

best we can do in kubeadm is make that API call polled. FTR, the API call exactly prior to this failing API call is already polled.

yes, seems to work now 😃 thank you all!!!

PR with an attempt to improve the kubeadm wait logic: https://github.com/kubernetes/kubernetes/pull/121743

@cpanato i do think CAPG has unrelated problems. the kubelet might be failing to start or starting but not creating the static pod for kube-apiserver.

we have a couple of kubeadm passes here: https://github.com/kubernetes/kubernetes/pull/121609

of the presubmit variant of this job: pull-kubernetes-e2e-kind-kms

this seems like a flake and not a failure, given the offending PR merged on the 27th and looking at the history of the periodic: https://testgrid.k8s.io/sig-release-master-informing#periodic-kubernetes-e2e-kind-kms