rancher: New Issues of Cattle-Cluster-Agent: (Could not resolve host: rancher..com)

What kind of request is this (question/bug/enhancement/feature request): question

Steps to reproduce (least amount of steps as possible):

kubectl logs pod/cattle-cluster-agent-6cc6666fc5-hjvpd -n cattle-system

Result:

INFO: Using resolv.conf: nameserver 10.43.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local options ndots:5 ERROR: https://rancher.nik.com/ping is not accessible (Could not resolve host: rancher.nik.com)

Other details that may be helpful:

Environment information

  • Rancher version : rancher/rancher/v2.1.7
  • Installation option (single install/HA): HA-AirGap

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): Infrastructure Provider
  • Machine type (cloud/VM/metal) and specifications (CPU/memory):VM
  • Kubernetes version (use 1.12.5):
  • Docker version (use docker version): 17.3.2
  • Kubectl Config:
kubectl create clusterrolebinding tiller   --clusterrole=cluster-admin   --serviceaccount=kube-system:tiller

kubectl -n cattle-system create secret tls tls-rancher-ingress   --cert=/root/ca/certs/tls.crt   --key=/root/ca/private/tls.key

kubectl -n cattle-system create secret generic tls-ca   --from-file=/root/ca/cacerts.pem

  • Helm Config:
helm init --service-account tiller --tiller-image 172.18.3.9:5000/registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.1

helm template ./rancher-2019.3.1.tgz --output-dir .   --name rancher   --namespace cattle-system   --set hostname=rancher.nikafarinegan.ir   --set rancherImage=172.18.3.9:5000/rancher/rancher:v2.1.7  --set privateCA=true   --set ingress.tls.source=secret  --set auditLog.hostPath=/var/log/rancher/audit   --set auditLog.level=3  --set auditLog.destination=hostPath   --set auditLog.maxSize=500   --set noProxy="localhost\,127.0.0.1\,10.0.0.0/8\,172.16.0.0/12\,192.168.0.0/16\,172.17.0.0/16\,192.168.0.0/16\,172.18.3.0/26" --set addLocal=true     

More output from the container:

root@nik17:/var/log/containers# cat ./cattle-node-agent-wjscf_cattle-system_agent-0cc20ec16da8935f1a53116228372c72c79db4dc3ba3a852852e3941ed860637.log

{"log":"INFO: Environment: CATTLE_ADDRESS=172.18.3.17 CATTLE_AGENT_CONNECT=true CATTLE_CA_CHECKSUM=b630c15354fdddfade789f7763700b098cf859b2f6922927808bfea1c521b992 CATTLE_CLUSTER=false CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=nik17.nik.com CATTLE_SERVER=https://rancher.nik.com\n","stream":"stderr","time":"2019-03-12T13:47:09.145999899Z"}
{"log":"INFO: Using resolv.conf: nameserver 127.0.1.1\n","stream":"stderr","time":"2019-03-12T13:47:09.149868451Z"}
{"log":"WARN: Loopback address found in /etc/resolv.conf, please refer to the documentation how to configure your cluster to resolve DNS properly\n","stream":"stderr","time":"2019-03-12T13:47:09.162969Z"}
{"log":"INFO: https://rancher.nik.com/ping is accessible\n","stream":"stderr","time":"2019-03-12T13:47:09.21055151Z"}
{"log":"INFO: Value from https://rancher.nik.com/v3/settings/cacerts is an x509 certificate\n","stream":"stderr","time":"2019-03-12T13:47:09.302245239Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Rancher agent version v2.1.7 is starting\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648181212Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Option customConfig=map[address:172.18.3.17 internalAddress: roles:[] label:map[]]\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648301145Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Option etcd=false\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648315898Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Option controlPlane=false\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648325692Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Option worker=false\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648334744Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Option requestedHostname=nik17.nik.com\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648343368Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Listening on /tmp/log.sock\"\n","stream":"stderr","time":"2019-03-12T13:47:09.648352512Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Connecting to wss://rancher.nik.com/v3/connect with token cgdhrcll9bmljsl6zqnnqcrzhq6tfdtlf5bvhtjfkf7tz56w5nxhsm\"\n","stream":"stderr","time":"2019-03-12T13:47:09.732511756Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Connecting to proxy\" url=\"wss://rancher.nik.com/v3/connect\"\n","stream":"stderr","time":"2019-03-12T13:47:09.732578639Z"}
{"log":"time=\"2019-03-12T13:47:09Z\" level=info msg=\"Starting plan monitor\"\n","stream":"stderr","time":"2019-03-12T13:47:09.815178156Z"}

To generate the self-signed certificate with a private CA, I followed this guide: https://networklessons.com/uncategorized/openssl-certification-authority-ca-ubuntu-server

I checked network connectivity from an alpine container and found no problem.

More output from /var/log/rancher/audit is in the attached file; search it for “cattle-cluster-agent”. Sorry for the length of the file: it is large because I used auditLog.destination=hostPath.

log.txt

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 25 (1 by maintainers)

Most upvoted comments

@v15170r

Adding host alias to the cattle agent can be done in this way:

kubectl -n cattle-system patch  deployments cattle-cluster-agent --patch '{
    "spec": {
        "template": {
            "spec": {
                "hostAliases": [
                    {
                      "hostnames":
                      [
                        "{{ rancher_server_hostname }}"
                      ],
                      "ip": "{{ rancher_server_ip }}"
                    }
                ]
            }
        }
    }
}'

kubectl -n cattle-system patch  daemonsets cattle-node-agent --patch '{
 "spec": {
     "template": {
         "spec": {
             "hostAliases": [
                 {
                    "hostnames":
                      [
                        "{{ rancher_server_hostname }}"
                      ],
                    "ip": "{{ rancher_server_ip }}"
                 }
             ]
         }
     }
 }
}'
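
The two patches above use template placeholders for the hostname and IP. As a minimal sketch, assuming a plain shell rather than a templating tool, the same patch body could be rendered from two arguments. The function name render_host_alias_patch and the example values are hypothetical, not from this thread.

```shell
#!/bin/sh
# Hypothetical helper: prints a hostAliases patch for one hostname/IP pair.
# It only builds the JSON text; it does not talk to the cluster.
render_host_alias_patch() {
    hostname="$1"
    ip="$2"
    printf '{"spec":{"template":{"spec":{"hostAliases":[{"hostnames":["%s"],"ip":"%s"}]}}}}\n' \
        "$hostname" "$ip"
}

# Placeholder values for illustration only.
render_host_alias_patch "rancher.example.com" "192.0.2.10"

# To apply it (requires cluster access):
# kubectl -n cattle-system patch deployments cattle-cluster-agent \
#     --patch "$(render_host_alias_patch rancher.example.com 192.0.2.10)"
# kubectl -n cattle-system patch daemonsets cattle-node-agent \
#     --patch "$(render_host_alias_patch rancher.example.com 192.0.2.10)"
```

Keeping the JSON in one function means the Deployment and the DaemonSet receive an identical alias entry.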

@armanriazi Hi! I solved the problem: edit the cattle-cluster-agent workload and, under its networking settings, add Host Aliases (/etc/hosts entries).

@superseb I fixed the value of server-url/v3/settings/cacerts and the intermediate CA, but I am still getting this error. (I have seen similar issues, but none of them had a complete solution, and I think this part of the system is not mature yet.) Thanks, friends.

INFO: Environment: CATTLE_ADDRESS=10.42.0.14 CATTLE_CA_CHECKSUM=f4400caeebc0e481b1edfa15e5ba19b9756bc132d0a9448dc82479b718f8cdc9 CATTLE_CLUSTER=true CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-88dbc897d-x9hp7 CATTLE_SERVER=https://rancher.nik.com CATTLE_SERVICE_PORT=tcp://10.43.131.37:80 CATTLE_SERVICE_PORT_443_TCP=tcp://10.43.131.37:443 CATTLE_SERVICE_PORT_443_TCP_ADDR=10.43.131.37 CATTLE_SERVICE_PORT_443_TCP_PORT=443 CATTLE_SERVICE_PORT_443_TCP_PROTO=tcp CATTLE_SERVICE_PORT_80_TCP=tcp://10.43.131.37:80 CATTLE_SERVICE_PORT_80_TCP_ADDR=10.43.131.37 CATTLE_SERVICE_PORT_80_TCP_PORT=80 CATTLE_SERVICE_PORT_80_TCP_PROTO=tcp CATTLE_SERVICE_SERVICE_HOST=10.43.131.37 CATTLE_SERVICE_SERVICE_PORT=80 CATTLE_SERVICE_SERVICE_PORT_HTTP=80 CATTLE_SERVICE_SERVICE_PORT_HTTPS=443

INFO: Using resolv.conf: nameserver 10.43.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local options ndots:5

ERROR: https://rancher.nik.com/ping is not accessible (Could not resolve host:)

The cluster-agent resolves names through cluster DNS, not through an /etc/hosts workaround. Make sure cluster DNS can properly resolve the configured server-url.

Regarding the certificates: the configured CA cert is retrieved from server-url/v3/settings/cacerts and used to verify the server certificate. Make sure it is configured correctly and matches the certificate in use. If it does, the server certificate may also need an intermediate certificate appended, so that the complete chain is presented to the client (the agent).
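
For the certificate part, a rough way to compare what the server hands out with what the agent expects might look like the sketch below. It assumes that CATTLE_CA_CHECKSUM (shown in the agent logs in this thread) is the SHA-256 of the PEM text in the "value" field of server-url/v3/settings/cacerts; that is an assumption, not something confirmed here. The function names and the example hostname are hypothetical.

```shell
#!/bin/sh
# Assumption: CATTLE_CA_CHECKSUM is the SHA-256 of the CA PEM served at
# <server-url>/v3/settings/cacerts (its "value" field).

# Print the SHA-256 hex digest of a file.
cacerts_checksum() {
    sha256sum "$1" | awk '{print $1}'
}

# Succeed when the file's digest equals the expected checksum.
# $1: file holding the CA PEM, $2: expected checksum from the agent log.
checksum_matches() {
    [ "$(cacerts_checksum "$1")" = "$2" ]
}

# Fetching the value needs network access and jq (hostname is a placeholder):
# curl -k -s https://rancher.example.com/v3/settings/cacerts | jq -r .value > cacerts.pem
# checksum_matches cacerts.pem "$CATTLE_CA_CHECKSUM" && echo match || echo mismatch
```

A mismatch would suggest the agent was registered against a different CA than the one the server currently serves; a match says nothing about DNS, which has to be checked separately.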

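@v15170r My English is not good. In the network settings of the cattle-cluster-agent container that failed to start, I added the unreachable domain name under Host Aliases, and after that the container started.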

@armanriazi
HI! I have the same problem as you. Have you solved it?

INFO: Environment: CATTLE_ADDRESS=10.42.2.12 CATTLE_CA_CHECKSUM=4721728379bf3c59b576e7392e743c220136aa73d0297b5a0e46487cdec7f01d CATTLE_CLUSTER=true CATTLE_INTERNAL_ADDRESS= CATTLE_K8S_MANAGED=true CATTLE_NODE_NAME=cattle-cluster-agent-5fcf59f8b7-4pbhx CATTLE_SERVER=https://rancher.xxx.com
INFO: Using resolv.conf: nameserver 10.43.0.10 search cattle-system.svc.cluster.local svc.cluster.local cluster.local options ndots:5
ERROR: https://rancher.xxx.com/ping is not accessible (Could not resolve host: rancher.xxx.com)

I get the same problem. Looks like this:

kubectl get deployments -n cattle-system
NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
cattle-cluster-agent   0/1     1            0           23h
rancher                3/3     3            3           23h
kubectl -n cattle-system logs -f cattle-cluster-agent-9c64ff5d6-7q5fw
INFO: Environment: CATTLE_ADDRESS=10.42.1.9 
                                  CATTLE_CA_CHECKSUM=baa47<xxx>bd47459f92 
                                  CATTLE_CLUSTER=true 
                                  CATTLE_FEATURES=dashboard=true 
                                  CATTLE_INTERNAL_ADDRESS= 
                                  CATTLE_K8S_MANAGED=true 
                                  CATTLE_NODE_NAME=cattle-cluster-agent-9c64ff5d6-7q5fw 
                                  CATTLE_SERVER=https://stage-k8senv1.xxx.se

INFO: Using resolv.conf: nameserver 10.43.0.10 search cattle-system.svc.stage-k8senv1.xxx.se 
                                  svc.stage-k8senv1.xxx.se 
                                  stage-k8senv1.xxx.se options ndots:5

ERROR: https://stage-k8senv1.xxx.se/ping is not accessible (Could not resolve host: stage-k8senv1.xxx.se)

I’ve applied the patch against deployment rancher

This fully solved the problem.

kubectl -n cattle-system patch  deployments cattle-cluster-agent --patch '{
    "spec": {
        "template": {
            "spec": {
                "hostAliases": [
                    {
                      "hostnames":
                      [
                        "stage-k8senv1.xxx.se"
                      ],
                      "ip": "172.20.10.100"
                    },
                    {
                      "hostnames":
                      [
                        "stage-k8senv1.xxx.se"
                      ],
                      "ip": "172.20.10.101"
                    }

                ]
            }
        }
    }
}'

kubectl -n cattle-system patch  daemonsets cattle-node-agent --patch '{
    "spec": {
        "template": {
            "spec": {
                "hostAliases": [
                    {
                      "hostnames":
                      [
                        "stage-k8senv1.xxx.se"
                      ],
                      "ip": "172.20.10.100"
                    },
                    {
                      "hostnames":
                      [
                        "stage-k8senv1.xxx.se"
                      ],
                      "ip": "172.20.10.101"
                    }

                ]
            }
        }
    }
}'
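
After patching, it can be useful to confirm that the aliases landed in the pod template and that the Deployment rolled out. The following is a sketch only; the function name verify_host_aliases and the 120s timeout are illustrative choices, not from this thread.

```shell
#!/bin/sh
# Hypothetical check: show the hostAliases on the agent Deployment, then wait
# for the rollout triggered by the patch to complete.
verify_host_aliases() {
    # Print the hostAliases currently set on the pod template.
    kubectl -n cattle-system get deployment cattle-cluster-agent \
        -o jsonpath='{.spec.template.spec.hostAliases}'
    echo
    # Block until the patched Deployment is rolled out, or the timeout expires.
    kubectl -n cattle-system rollout status deployment/cattle-cluster-agent --timeout=120s
}

# Usage (requires cluster access):
# verify_host_aliases
```

If the rollout does not complete, the logs of the new cattle-cluster-agent pod should show whether the "Could not resolve host" error is still present.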

@superseb Would you be so kind as to explain how to “make sure it can properly resolve the configured server-url”, please? I’ve run into the same issue on Minikube over and over again. Adding an entry to /etc/hosts inside Minikube that points to the address reported by minikube ip gets the cattle-node-agent pod up and running, but the cattle-cluster-agent keeps failing. FYI, Rancher seems to work even without the cattle-cluster-agent running.
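
One way to check resolution, sketched under assumptions: start a throwaway pod, which by default uses cluster DNS, and look up the server-url host from it. The function name check_cluster_dns, the pod name dns-check, and the busybox:1.28 image are illustrative choices, not from this thread; in an air-gapped setup the image would have to come from the private registry.

```shell
#!/bin/sh
# Hypothetical check: resolve a hostname from a short-lived pod that uses
# cluster DNS, so the lookup follows the same path as the cluster-agent's.
# $1: hostname from the configured server-url
# $2: optional image that provides nslookup (default is an assumption)
check_cluster_dns() {
    host="$1"
    image="${2:-busybox:1.28}"
    kubectl run dns-check --rm -i --restart=Never --image="$image" -- nslookup "$host"
}

# Usage (requires cluster access; hostname is a placeholder):
# check_cluster_dns rancher.example.com
```

If the lookup fails here while it succeeds on the node, the name is probably known only to the node's /etc/hosts or a local resolver, which would match the node-agent working while the cluster-agent fails.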