rancher: Health check stuck in Initializing when adding a host that can only reach the Rancher server via VPN

Rancher versions: rancher/server: v1.6.10 rancher/agent: v1.2.6

Infrastructure Stack versions: healthcheck: v0.3.3 ipsec: v0.1.4 network-services: v0.2.6 scheduler: v0.6.3

Docker version: Docker version 17.06.2-ce, build cec0b72

```
Containers: 11
 Running: 11
 Paused: 0
 Stopped: 0
Images: 11
Server Version: 17.06.2-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 60
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-1022-aws
Operating System: Ubuntu 16.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.952GiB
Name: ip-172-31-36-116
ID: 6FWI:DJAB:IRFB:2JXT:EUJO:TDBW:JZHD:IV5A:2ONN:62XH:LHXU:AJNK
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
```

This docker info is from one of the hosts.

Operating system and kernel:

```
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.3 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
```

Kernel: 4.4.0-1022-aws

Type/provider of hosts: AWS

Setup details: Single node Rancher, internal DB

Environment Template: Cattle

Steps to Reproduce:

  1. I have an OpenVPN server hosted on AWS (I followed this guide: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-openvpn-server-on-ubuntu-16-04?comment=64972). This VPN configuration forces all traffic through the VPN connection.

  2. I have a local server that doesn't have a public IP and is connected to the OpenVPN server. This local server exposes the Rancher server on a port so it is accessible as VPN_CLIENT_IP:PORT from other VPN clients. (When a client connects to the VPN, it gets a tun0 interface with the corresponding VPN IP.)

  3. I created two instances in AWS; both have their OpenVPN client configuration and can ping the local server. I also installed Docker using the command curl https://releases.rancher.com/install-docker/17.06.sh | sh (taken from http://rancher.com/docs/rancher/latest/en/hosts/). Note: my VPN server is in another region, while both of my hosts are in the same AWS region.

  4. Add hosts. I use Elastic IPs from AWS, so I have to add the AWS hosts via Custom. This is the command I execute on every host:

     ```
     sudo docker run -e CATTLE_AGENT_IP="HOST_PUBLIC_IP" -e CATTLE_HOST_LABELS='type=cache' \
       --rm --privileged \
       -v /var/run/docker.sock:/var/run/docker.sock \
       -v /var/lib/rancher:/var/lib/rancher \
       rancher/agent:v1.2.6 http://RANCHER_LOCAL_SERVER_VPN_IP:PORT/v1/scripts/TOKEN
     ```
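For context on step 1, the "force all traffic through the VPN" behavior normally comes from the server pushing a default route to its clients. A minimal server-side sketch, matching the kind of setup the linked guide produces (the exact values here are assumptions, not taken from this issue):

```
# /etc/openvpn/server.conf (fragment) — illustrative values
# Push a default route so all client traffic goes through the tunnel:
push "redirect-gateway def1 bypass-dhcp"
# Push DNS so name resolution also goes through the VPN:
push "dhcp-option DNS 208.67.222.222"
```

With `redirect-gateway` in place, every packet a host sends — including traffic to the Rancher server and to other hosts — leaves via tun0, which is what makes the MTU of the tunnel matter for all inter-host communication.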

Results on the terminal when adding hosts:

```
INFO: Running Agent Registration Process, CATTLE_URL=http://RANCHER_LOCAL_SERVER_VPN_IP:PORT/v1
INFO: Attempting to connect to: http://RANCHER_LOCAL_SERVER_VPN_IP:PORT/v1
INFO: http://RANCHER_LOCAL_SERVER_VPN_IP:PORT/v1 is accessible
INFO: Inspecting host capabilities
INFO: Boot2Docker: false
INFO: Host writable: true
INFO: Token: xxxxxxxx
INFO: Running registration
INFO: Printing Environment
INFO: ENV: CATTLE_ACCESS_KEY=ACCESS_KEY
INFO: ENV: CATTLE_AGENT_IP=HOST_PUBLIC_IP
INFO: ENV: CATTLE_HOME=/var/lib/cattle
INFO: ENV: CATTLE_HOST_LABELS=type=cache
INFO: ENV: CATTLE_REGISTRATION_ACCESS_KEY=registrationToken
INFO: ENV: CATTLE_REGISTRATION_SECRET_KEY=xxxxxxx
INFO: ENV: CATTLE_SECRET_KEY=xxxxxxx
INFO: ENV: CATTLE_URL=http://RANCHER_LOCAL_SERVER_VPN_IP:PORT/v1
INFO: ENV: DETECTED_CATTLE_AGENT_IP=HOST_VPN_IP
INFO: ENV: RANCHER_AGENT_IMAGE=rancher/agent:v1.2.6
INFO: Launched Rancher Agent: SOME_ID
```
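Note the mismatch in the log above: `CATTLE_AGENT_IP=HOST_PUBLIC_IP` but `DETECTED_CATTLE_AGENT_IP=HOST_VPN_IP`. With all traffic redirected through the VPN, the kernel selects the tun0 address as the source for packets to the Rancher server, so that is the address the agent auto-detects. One way to see which source address the kernel would pick is `ip route get <server-ip>`; a small sketch that parses its output (the `route_src` helper and the sample line are illustrative, not from this issue):

```shell
# Extract the "src" address from `ip route get` output.
route_src() {
  echo "$1" | awk '{for (i = 1; i < NF; i++) if ($i == "src") print $(i + 1)}'
}

# Hypothetical output for a host whose default route goes via tun0:
sample='RANCHER_IP via 10.8.0.1 dev tun0 src 10.8.0.6 uid 1000'
route_src "$sample"   # prints 10.8.0.6 — the VPN address, not the elastic IP
```

This is why overriding `CATTLE_AGENT_IP` explicitly matters in this setup.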

[screenshot 2017-10-17 3:46:32 pm]

[screenshot 2017-10-17 3:51:35 pm]

Log of the healthcheck container stuck in Initializing: [screenshot 2017-10-17 3:52:22 pm]

Because of this, containers can't ping each other and there is no communication between the services that use those hosts.
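A quick way to check whether the tunnel MTU is what breaks host-to-host traffic is to probe the path MTU with don't-fragment pings. A sketch, where the target address and MTU values are placeholders:

```shell
# ICMP payload size for a given on-the-wire MTU: a 20-byte IP header plus
# an 8-byte ICMP header leave MTU - 28 bytes of payload.
ping_payload() {
  echo $(( $1 - 28 ))
}

ping_payload 1500   # prints 1472

# Probe the path to another host over the VPN with the DF bit set
# (replace OTHER_HOST_VPN_IP with a real address). If pings at 1500
# fail but succeed at a lower value, the tunnel MTU is the problem:
# ping -M do -s "$(ping_payload 1500)" -c 3 OTHER_HOST_VPN_IP
# ping -M do -s "$(ping_payload 1400)" -c 3 OTHER_HOST_VPN_IP
```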

Any help?

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 2
  • Comments: 16 (3 by maintainers)

Most upvoted comments

Just to note: the correct way to get this working was

  • Modifying the MTU in the client-config.ovpn of every host that needs to be added and can only reach Rancher through the VPN.
  • Adding each host with its public IP, specifying it in CATTLE_AGENT_IP.
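For reference, the MTU change in client-config.ovpn usually looks something like the fragment below. The directives `tun-mtu` and `mssfix` are standard OpenVPN options, but the values here are assumptions and need to be tuned for the actual network:

```
# client-config.ovpn (fragment) — hypothetical values
# Lower the tunnel MTU so encapsulated packets fit the physical path:
tun-mtu 1400
# Clamp TCP MSS so TCP streams also stay under the tunnel MTU:
mssfix 1360
```

Lowering these makes room for OpenVPN's UDP encapsulation and encryption overhead, which is what was causing the healthcheck traffic to be dropped here.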

Thanks to everyone (@superseb, @leodotcloud, @vincent99) for the help. This issue can now be closed.

@superseb Yes, I'm forcing all traffic through the VPN, and yes, the connection works fine if I use the hosts' VPN IPs. However, I use one host for a load balancer and assign a certificate to it. That way, my app would only be reachable from inside the VPN (I think the service edition uses the host IP, which would be the private VPN IP, making it reachable only from inside the VPN), but I need it reachable from everywhere through a domain pointing to the public IP of the load balancer host. I'm not sure I've explained well what I'm trying to achieve.