rancher: Internal Address Setting is not alterable, etcd from private network can't join the cluster

I recently connected a second site to an existing cluster. The plan was to also run etcd on one node of the newly connected site, since this infrastructure should gradually migrate there. The problem: everything is fine, the connection works, routing works, but RKE won't configure etcd correctly. According to the docs, this should be fixable by deploying with the internalAddress specified in the RKE config. However, this setting is not editable in the Rancher UI, and I can't find a way to make it work.
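For context, in a standalone RKE cluster.yml this setting is spelled internal_address on the node entry. A minimal sketch (all addresses here are placeholders, not taken from this issue):

```yaml
# Sketch of a standalone RKE cluster.yml node entry (placeholder IPs).
# internal_address tells RKE which IP to use for intra-cluster traffic
# such as etcd peering, and gets that IP into the generated certificates.
nodes:
  - address: 203.0.113.10          # public IP, used by RKE over SSH
    internal_address: 192.168.1.4  # private IP for etcd/cluster traffic
    user: ubuntu
    role: [controlplane, etcd, worker]
```

In a Rancher-managed (custom) cluster there is no such editable field in the UI, which is exactly the complaint here.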

So I'm finally reaching out here; see the log below for some insight.

2020-07-24 21:32:58.977617 I | embed: rejected connection from "192.168.1.4:44064" (error "tls: \"192.168.1.4\" does not match any of DNSNames [\"node-1\" \"node-2\" \"localhost\" \"kubernetes\" \"kubernetes.default\" \"kubernetes.default.svc\" \"kubernetes.default.svc.cluster.local\"] (lookup kubernetes.default.svc.cluster.local on 1.1.1.1:53: no such host)", ServerName "", IPAddresses ["pub-ip" "pub-ip2" "127.0.0.1" "10.43.0.1"], DNSNames ["node-1" "node-2" "localhost" "kubernetes" "kubernetes.default" "kubernetes.default.svc" "kubernetes.default.svc.cluster.local"])

The Rancher version is currently v2.4.3.

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 23 (2 by maintainers)

Most upvoted comments

still an issue

Okay, I had the exact same issue: my original cluster was created on a cloud provider and only had public IPs. I wanted to move all the nodes behind a firewall, onto a private network. However, I could not get the new master to join the existing etcd cluster. Rancher would mix up the external IP and the internal IP (there are numerous issues about that), and the same error kept popping up:

embed: rejected connection from "10.1.0.106:51728" (error "tls: \"10.1.0.106\" does not match any of DNSNames

I could not get 10.1.0.106 (the internal IP of the machine I'd like to take off the public network) into the SSL certificate, and setting rke.cattle.io/internal-ip and then rotating the certificates (so they would be regenerated) did not work either.

My solution to this (after trying so many options) was to use iptables to rewrite the IP:

# Rewrite locally generated traffic destined for the internal IP to the external IP
iptables -t nat -A OUTPUT -p tcp -d 10.1.0.61 -j DNAT --to-destination EXTERNALIP
# Same rewrite for traffic arriving from other hosts
iptables -t nat -A PREROUTING -p tcp -d 10.1.0.61 -j DNAT --to-destination EXTERNALIP
# Rewrite the source address so replies come back through this host
iptables -t nat -A POSTROUTING -j MASQUERADE

10.1.0.61 is the IP of the new internal master node; EXTERNALIP is an external IP (on the firewall) which I used for the NAT (just for the time being).

This makes the request to etcd appear to come from the original external IP (which is inside the etcd certificate, /etc/kubernetes/ssl/kube-etcd-*.pem), which allows the certificate to be used and etcd to be migrated! 🎉
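A quick way to see which IPs and DNS names a certificate like that one will actually accept is to inspect its Subject Alternative Names with openssl. This is only a sketch: the first command generates a throwaway self-signed certificate with an IP SAN so the second command has something to inspect; on a real node you would point openssl at the kube-etcd pem instead.

```shell
# Generate a throwaway self-signed cert with an IP SAN (a stand-in for
# the real /etc/kubernetes/ssl/kube-etcd-*.pem on an etcd node).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/etcd.key -out /tmp/etcd.crt -subj "/CN=kube-etcd" \
  -addext "subjectAltName=IP:10.1.0.106,DNS:node-1,DNS:localhost"

# Print the SANs; any peer IP missing from this list is rejected
# with exactly the kind of tls error shown in the log above.
openssl x509 -in /tmp/etcd.crt -noout -ext subjectAltName
```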