rancher: Local cluster stuck on waiting for cluster agent to connect

Rancher Version: 2.1.1
Installation: Single Install

Deployed Rancher into an existing cluster with k8s-mode=embedded and add-local=true. When Rancher starts up, the local cluster shows “Waiting for API to be available” and the logs repeatedly show [ERROR] ClusterController c-xyz [user-controllers-controller] failed with : failed to start user controllers for cluster c-xyz: failed to contact server: Get https://x.x.x.x:443/version?timeout=30s: waiting for cluster agent to connect
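
For reference, the error above comes from the Rancher server pod log; a quick way to pull it (assuming the rancher namespace and deployment name used in the manifest below) is:

# tail the Rancher server log and filter for the cluster controller error
kubectl -n rancher logs deploy/rancher --tail=200 | grep ClusterController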

Deployment:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: rancher
  namespace: rancher
  labels:
    k8s-app: rancher

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: rancher
  namespace: rancher
  labels:
    k8s-app: rancher
subjects:
- kind: ServiceAccount
  name: rancher
  namespace: rancher
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin

---

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: rancher
  namespace: rancher
  labels:
    k8s-app: rancher
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: rancher
  template:
    metadata:
      name: rancher
      labels:
        k8s-app: rancher
    spec:
      serviceAccountName: rancher
      containers:
      - name: rancher
        image: rancher/rancher:v2.1.1
        imagePullPolicy: IfNotPresent
        args:
        - "--k8s-mode"
        - "embedded"
        - "--add-local"
        - "true"
        - "--http-listen-port=80"
        - "--https-listen-port=443"
        - "--no-cacerts"
        volumeMounts:
        - mountPath: "/var/lib/rancher"
          name: data
        ports:
        - containerPort: 80
          protocol: TCP
        - containerPort: 443
          protocol: TCP
        livenessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 60
          periodSeconds: 30
        readinessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 30
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: rancher-persistentvolumeclaim
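
For completeness, a minimal way to apply and verify the manifest above (assuming it is saved as rancher.yaml and that the rancher namespace and the rancher-persistentvolumeclaim PVC already exist):

# apply the ServiceAccount, ClusterRoleBinding, and Deployment
kubectl apply -f rancher.yaml
# wait for the Rancher pod to become ready
kubectl -n rancher rollout status deploy/rancher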

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 17 (2 by maintainers)

Most upvoted comments

The error waiting for cluster agent to connect indicates that the cluster agent is not able to connect; the logs of the cattle-cluster-agent pod should reveal why it cannot connect.
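
A minimal sketch for pulling those logs (assuming the agent runs in the usual cattle-system namespace with the app=cattle-cluster-agent label):

# find the cluster agent pod(s)
kubectl -n cattle-system get pods -l app=cattle-cluster-agent
# show recent agent log output, which should explain the failed connection
kubectl -n cattle-system logs -l app=cattle-cluster-agent --tail=100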