rancher: Intermittent errors seen while a cluster is provisioning; provisioning stalls for a long time before the cluster reaches the active state.
What kind of request is this (question/bug/enhancement/feature request): bug
Steps to reproduce (least amount of steps as possible):
- Deploy a 5 node custom cluster - 1 etcd/control and 4 worker nodes.
- Two intermittent errors are seen, after which the cluster comes up successfully
and
Cluster health check failed: Failed to communicate with API server: Get "https://52.15.106.195<>:6443/api/v1/namespaces/kube-system?timeout=45s": dial tcp 127.0.0.1:6443: connect: connection refused; Error while applying agent YAML, it will be retried automatically: exit status 1, Error from server (Forbidden): error when retrieving current configuration of: Resource: "rbac.authorization.k8s.io/v1, Resource=clusterroles", GroupVersionKind: "rbac.authorization.k8s.io/v1, Kind=ClusterRole" N
Note: this will NOT always be reproducible. The cluster was stuck for about 5-10 minutes and then recovered.
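Since the first error is a plain `connection refused` on port 6443, one way to time how long the stall lasts is to poll that port until it accepts a TCP connection. The following is only a minimal sketch: the function name `wait_for_port` and its default timeout and interval are assumptions made for illustration, not anything Rancher provides, and a successful TCP connect says nothing about whether the API server is healthy or whether RBAC is in place yet.

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, interval=5.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper: it only checks TCP reachability, not API health.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers "connection refused" as well as timeouts and DNS errors.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

It could be called as `wait_for_port("<control-plane-address>", 6443)`, where the address is a placeholder for the node running the API server.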
Environment information
- Rancher version (rancher/rancher / rancher/server image tag or shown bottom left in the UI): master-head - commit id: 4911f8b116eb
- Installation option (single install/HA): HA
Cluster information
- Cluster type (Hosted/Infrastructure Provider/Custom/Imported): custom
- Kubernetes version (use kubectl version): 1.16
gz#15890 gz#16321
gz#16913
gz#17175
About this issue
- State: open
- Created 4 years ago
- Reactions: 10
- Comments: 26 (9 by maintainers)
I thought Rancher would be an easier way to install k8s…
Hello, downstream cluster remains stuck in Provisioning state. Please assist. Thanks.
Rancher v2.7.4 Downstream : RKE1, K8s v1.25.9
Error while applying agent YAML, it will be retried automatically: exit status 1, Error from server (Forbidden): error when retrieving current configuration of: Resource: "rbac.authorization.k8s.io/v1, Resource=clusterroles", GroupVersionKind: "rbac.authorization.k8s.io/v1, Kind=ClusterRole" Name: "proxy-clusterrole-kubeapiserver", Namespace: "" from server for: "./management-statefile_path_redacted": clusterroles.rbac.authorization.k8s.io "proxy-clusterrole-kubeapiserver" is forbidden: User "u-pl3h4p7xtj" cannot get resource "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope
Error from server (Forbidden): error when retrieving current configuration of: Resource: "rbac.authorization.k8s.io/v1, Resource=clusterrolebindings", GroupVersionKind: "rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding" Name: "proxy-role-binding-kubernetes-master", Namespace: "" from server for: "./management-statefile_path_redacted": clusterrolebindings.rbac.authorization.k8s.io "proxy-role-binding-kubernetes-master" is forbidden: User "u-pl3h4p7xtj" cannot get resource "clusterrolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope
Error from server (Forbidden): error when retrieving current configuration of: Resource: "/v1, Resource=namespaces", GroupVersionKind: "/v1, Kind=Namespace" Name: "cattle-system", Namespace: "" from server for: "./management-statefile_path_redacted": namespaces "cattle-system" is forbidden: User "u-pl3h4p7xtj" cannot get resource "namespaces" in API group "" in the namespace "cattle-system"
Error from server (Forbidden): error when retrieving current configuration of: Resource: "/v1, Resource=serviceaccounts", GroupVersionKind: "/v1, Kind=ServiceAccount" Name: "cattle", Namespace: "cattle-system" from server for: "./management-statefile_path_redacted": serviceaccounts "cattle" is forbidden: User "u-pl3h4p7xtj" cannot get resource "serviceaccounts" in API group "" in the namespace "cattle-system"
Error from server (Forbidden): error when retrieving current configuration of: Resource: "rbac.authorization.k8s.io/v1, Resource=clusterrolebindings", GroupVersionKind: "rbac.authorization.k8s.io/v1, Kind=ClusterRoleBinding" Name: "cattle-admin-binding", Namespace: "" from server for: "./management-statefile_path_redacted": clusterrolebindings.rbac.authorization.k8s.io "cattle-admin-binding" is forbidden: User "u-pl3h4p7xtj" cannot get resource "clusterrolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope
Error from server (Forbidden): error when retrieving current configuration of: Resource: "/v1, Resource=secrets", GroupVersionKind: "/v1, Kind=Secret" Name: "cattle-credentials-df6de44", Namespace: "cattle-system" from server for: "./management-statefile_path_redacted": secrets "cattle-credentials-df6de44" is forbidden: User "u-pl3h4p7xtj" cannot get resource "secrets" in API group "" in the namespace "cattle-system"
Error from server (Forbidden): error when retrieving current configuration of: Resource: "rbac.authorization.k8s.io/v1, Resource=clusterroles", GroupVersionKind: "rbac.authorization.k8s.io/v1, Kind=ClusterRole" Name: "cattle-admin", Namespace: "" from server for: "./management-statefile_path_redacted": clusterroles.rbac.authorization.k8s.io "cattle-admin" is forbidden: User "u-pl3h4p7xtj" cannot get resource "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope
Error from server (Forbidden): error when retrieving current configuration of: Resource: "apps/v1, Resource=deployments", GroupVersionKind: "apps/v1, Kind=Deployment" Name: "cattle-cluster-agent", Namespace: "cattle-system" from server for: "./management-statefile_path_redacted": deployments.apps "cattle-cluster-agent" is forbidden: User "u-pl3h4p7xtj" cannot get resource "deployments" in API group "apps" in the namespace "cattle-system"
Error from server (Forbidden): error when retrieving current configuration of: Resource: "apps/v1, Resource=daemonsets", GroupVersionKind: "apps/v1, Kind=DaemonSet" Name: "cattle-node-agent", Namespace: "cattle-system" from server for: "./management-statefile_path_redacted": daemonsets.apps "cattle-node-agent" is forbidden: User "u-pl3h4p7xtj" cannot get resource "daemonsets" in API group "apps" in the namespace "cattle-system"
Error from server (Forbidden): error when retrieving current configuration of: Resource: "apps/v1, Resource=daemonsets", GroupVersionKind: "apps/v1, Kind=DaemonSet" Name: "kube-api-auth", Namespace: "cattle-system" from server for: "./management-statefile_path_redacted": daemonsets.apps "kube-api-auth" is forbidden: User "u-pl3h4p7xtj" cannot get resource "daemonsets" in API group "apps" in the namespace "cattle-system"
Error from server (Forbidden): error when retrieving current configuration of: Resource: "/v1, Resource=services", GroupVersionKind: "/v1, Kind=Service" Name: "cattle-cluster-agent", Namespace: "cattle-system" from server for: "./management-statefile_path_redacted": services "cattle-cluster-agent" is forbidden: User "u-pl3h4p7xtj" cannot get resource "services" in API group "" in the namespace "cattle-system"
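The output above is hard to read, but every denial appears to be for the same user (`u-pl3h4p7xtj`) and the same verb (`get`), just on different resources. A rough Python sketch for condensing such output into unique denials might look like this. The function name `summarize_forbidden` and the regular expression are assumptions written against the message format pasted above, not a Rancher or kubectl tool, and the pattern accepts both straight and curly quotes because pasted logs often get their quotes converted.

```python
import re

# Matches the "User ... cannot <verb> resource ... in API group ..." tail of
# each Forbidden message, accepting straight or curly double quotes.
_FORBIDDEN = re.compile(
    r'User ["“](?P<user>[^"”]+)["”] cannot (?P<verb>\w+) resource '
    r'["“](?P<resource>[^"”]+)["”] in API group ["“](?P<group>[^"”]*)["”]'
)


def summarize_forbidden(log_text):
    """Return a sorted list of unique (user, verb, resource, api_group) tuples."""
    found = {
        (m["user"], m["verb"], m["resource"], m["group"])
        for m in _FORBIDDEN.finditer(log_text)
    }
    return sorted(found)
```

An empty API group in the result corresponds to the core group (namespaces, secrets, services, and so on).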
Seeing the same issue with Rancher on Exoscale. It can take anywhere from 10 to 60 minutes to recover, and sometimes it doesn’t recover at all. The deployment is fully automated (using Terraform’s Rancher driver), so there are no differences or manual changes between attempts.
rancher | v2.4.5
User Interface | v2.4.28
Helm | v2.16.8-rancher1
Machine | v0.15.0-rancher43
Update: seeing the same now on Azure & AWS. It does not appear to be a vendor issue but a Rancher (etcd/controld?) startup issue.
This error happens basically on every deployment (9 out of 10 are faulty).
Provider: Amazon EC2
K8s Version: v1.18.15
Rancher Version: v2.4.8
1 cp/etcd node + 1 worker node (t2.large each)
Is there any workaround possible yet?
I am consistently running into this issue when deploying a custom cluster. I can confirm that it takes quite a few minutes, but it eventually recovers and the new cluster becomes active. The Rancher version is 2.4.8 and the K8s version of the cluster is 1.17.4. Rancher runs on 3 nodes on AWS spread across 3 AZs, and the downstream masters use the same configuration.