hcloud-cloud-controller-manager: Cloud-Controller w/network (native routing) does not create correct routes

Hello,

I’ve been playing around with Kubernetes 1.19 on hcloud for a bit now. Since the documentation about this is pretty old, I’ve mostly been trying to figure it out on my own.

So my current setup:

  • 1x Network: 10.0.0.0/8
  • 1x LB (for a later HA setup of the control planes, 10.0.0.5 here)
  • 1x CPX11 (control plane)
  • 2x CPX11 (worker nodes)

Using kubeadm to set up the Kubernetes cluster:

kubeadm init --ignore-preflight-errors=NumCPU --apiserver-cert-extra-sans $API_SERVER_CERT_EXTRA_SANS --control-plane-endpoint "$CONTROL_PLANE_LB" \
  --upload-certs --kubernetes-version=$KUBE_VERSION --pod-network-cidr=$POD_NETWORK_CIDR

with the following variables:

API_SERVER_CERT_EXTRA_SANS=10.0.0.1
CONTROL_PLANE_LB=10.0.0.5
KUBE_VERSION=v1.19.0
POD_NETWORK_CIDR=10.224.0.0/16

After that I copy the kubeconfig and create the secrets for the Hetzner CCM and CSI driver like this:

apiVersion: v1
kind: Secret
metadata:
  name: hcloud
  namespace: kube-system
stringData:
  token: "<hetzner_api_token>"
  network: "<hetzner_network_id>"
---
apiVersion: v1
kind: Secret
metadata:
  name: hcloud-csi
  namespace: kube-system
stringData:
  token: "<hetzner_api_token>"
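
The same secrets could also be created imperatively instead of via a manifest; an equivalent sketch, assuming the token and network ID are exported as HCLOUD_TOKEN and HCLOUD_NETWORK_ID:

kubectl -n kube-system create secret generic hcloud \
  --from-literal=token=$HCLOUD_TOKEN --from-literal=network=$HCLOUD_NETWORK_ID
kubectl -n kube-system create secret generic hcloud-csi \
  --from-literal=token=$HCLOUD_TOKEN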

After that I deploy the networks version of the CCM:

kubectl apply -f https://raw.githubusercontent.com/hetznercloud/hcloud-cloud-controller-manager/master/deploy/ccm-networks.yaml

The cloud controller becomes ready, and the nodes get the hcloud://<server-id> provider ID in their describe output.
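
A quick way to double-check this (plain kubectl, just a convenience):

kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER-ID:.spec.providerID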

Now I deploy the latest cilium with a few tweaked parameters:

wget https://raw.githubusercontent.com/cilium/cilium/1.9.0/install/kubernetes/quick-install.yaml

Edit the quick-install.yaml and ensure the following parameters are set:

tunnel: disabled
masquerade: "true"
enable-endpoint-routes: "true"
native-routing-cidr: "10.0.0.0/8"
cluster-pool-ipv4-cidr: "10.224.0.0/16"
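
For orientation, in quick-install.yaml these keys live in the cilium-config ConfigMap in kube-system; a trimmed sketch of the relevant part (not the complete ConfigMap, and ipam: cluster-pool is assumed here as the Cilium 1.9 default):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  ipam: cluster-pool
  cluster-pool-ipv4-cidr: "10.224.0.0/16"
  tunnel: disabled
  masquerade: "true"
  enable-endpoint-routes: "true"
  native-routing-cidr: "10.0.0.0/8"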

Apply the deployment file.

Now the CNI is installed, CoreDNS starts getting scheduled, and the CCM creates routes for the nodes. So far so good, yet the created routes seem wrong to me.

(screenshot: routes as shown in the Hetzner Cloud Console)

The routes shown there are:

10.224.0.0/24 routes to 10.0.0.2 (master-01)
10.224.1.0/24 routes to 10.0.0.3 (worker-01)
10.224.2.0/24 routes to 10.0.0.4 (worker-02)

Yet kubectl get pods -A -owide shows a different IP distribution:

root@test-cluster-master-01:~# k get pods -A -owide
NAMESPACE     NAME                                              READY   STATUS    RESTARTS   AGE   IP               NODE                     NOMINATED NODE   READINESS GATES
kube-system   cilium-dkwbl                                      1/1     Running   0          46m   10.0.0.3         test-cluster-worker-01   <none>           <none>
kube-system   cilium-g7whv                                      1/1     Running   0          46m   10.0.0.2         test-cluster-master-01   <none>           <none>
kube-system   cilium-k4tww                                      1/1     Running   0          46m   10.0.0.4         test-cluster-worker-02   <none>           <none>
kube-system   coredns-f9fd979d6-6l8nx                           0/1     Running   0          48m   10.224.0.101     test-cluster-worker-01   <none>           <none>
kube-system   coredns-f9fd979d6-q7dz4                           0/1     Running   0          48m   10.224.1.157     test-cluster-worker-02   <none>           <none>

Where you can see:

10.224.0.101 is scheduled on test-cluster-worker-01 which, according to the routes in cloud console, should have 10.224.1.0/24.
10.224.1.157 is scheduled on test-cluster-worker-02, which should have 10.224.2.0/24.
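
For what it’s worth, a way to compare the two sides is to look at the PodCIDR that Kubernetes assigned to each node (which, as far as I understand, is what the CCM’s route controller uses for the Hetzner routes) versus the ranges Cilium’s cluster-pool IPAM actually handed out (tracked in the CiliumNode objects). A rough sketch; the CiliumNode field layout may differ between Cilium versions:

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR
kubectl get ciliumnodes -o yaml | grep -B4 -A3 podCIDRs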

Can someone please point me in the right direction for resolving this issue?

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 40 (12 by maintainers)

Most upvoted comments

@philipp1992 @kiwinesian Native routing saves one layer of tunneling/vxlan. You should probably know why that is an advantage.

@kiwinesian I created a project for this: https://github.com/mysticaltech/kube-hetzner. All works well, including full kube-proxy replacement. However, even though everything was set up with Cilium for native routing, I had to use tunnel: geneve (see https://github.com/mysticaltech/kube-hetzner/blob/master/manifests/helm/cilium/values.yaml) to make everything really stable; somehow pure native routing did not make the Hetzner CSI happy (maybe more debugging is needed in the future). The Geneve tunnel overhead is really low.

So thanks to cilium in combination with Fedora, we now have full BPF support, and full kube-proxy replacement with the improvement that it brings.
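
For anyone wanting to reproduce that kind of setup, a minimal Helm values sketch in the spirit of the Cilium 1.9 chart (field names from that chart, not the exact values.yaml of the repo above; the API endpoint is a placeholder):

tunnel: geneve
kubeProxyReplacement: strict
k8sServiceHost: <api-server-address>   # e.g. the private IP of your control-plane LB
k8sServicePort: 6443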

You can basically do these different flavors:

  • No Hetzner Networks, just a networking plugin which does routing and tunneling using the public interface
  • Hetzner Networks with a network plugin, which does use the internal network for communication, but does its own routing (of pod subnets) in its own tunnel
  • Hetzner Networks with a network plugin, which uses the internal network for communication and relies on native routing of the pod subnets (i.e. the routes the CCM creates) instead of its own tunnel

I’m doing the third variant. You can do that with different plugins, e.g. cilium (native-routing-cidr), flannel (backend type “alloc”), cilium (no IPIP, no vxlan). An illustration for the flannel case follows below.
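
To illustrate the flannel variant: the “alloc” backend only allocates the per-node subnets and sets up no forwarding or tunnel of its own, so the actual pod-to-pod routing is left to the Hetzner network routes created by the CCM. A sketch of the relevant net-conf.json in the stock kube-flannel-cfg ConfigMap (CIDR taken from the setup above):

net-conf.json: |
  {
    "Network": "10.224.0.0/16",
    "Backend": {
      "Type": "alloc"
    }
  }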

ohh thanks for sharing @mysticaltech ! I might find a weekend to spin the cluster up using your configuration 😉

I just managed to get the k8s cluster going, but there are a few things that I would like to validate with you, to see if they make sense / are ideal:

  1. Can’t use Cilium ipam Native-Routing. Instead, I have to set ipam=kubernetes
  2. I also had the same problem with tunnel=disabled and nativeRoutingCIDR=x.x.x.x/8. I ended up leaving the tunnel enabled and that seems to keep it happy, even though that is what the reference says to do: https://github.com/hetznercloud/hcloud-cloud-controller-manager/blob/master/docs/deploy_with_networks.md.
  3. Ubuntu 20.04 is compatible with kernel 5.11, but you can’t go any higher than 5.11.0 or it will break due to a dependency.

I’m wondering if you are aware of the impact of leaving #1 and #2 as they are? Should I try tunnel=geneve?
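
For context on point 1: with ipam: kubernetes, Cilium takes each node’s pod range from node.spec.podCIDR, which (as far as I understand) is the same field the CCM’s route controller reads when it creates the Hetzner network routes, so the routes and the actually used pod subnets stay in sync. A trimmed cilium-config sketch (assuming the quick-install ConfigMap layout):

data:
  ipam: kubernetes   # per-node pod ranges come from node.spec.podCIDR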

Hi @mysticaltech ,

Yes, for sure! Out of curiosity, have you looked into Calico?

Looks like they also have eBPF in the latest release.

hi @mysticaltech

thanks so much for looking into this! The interesting part is that Ubuntu 18.04 works completely fine with iptables, but the csi-provisioner crashes on Ubuntu 20.04 (and Debian 10) using the exact same config for cilium.yaml.

I can try the Fedora 34 one and see if I can get it going. Will have to rewrite the Ansible script to deploy all of this - will report back maybe after the weekend. 😃

@AlexMe99

thanks for the info. Could you share the exact settings? How did you create the hcloud network, which arguments did you pass to the k3s master and worker, and what Cilium deployment did you use?

kind regards Philipp

@ByteAlex I went through this topic when I wanted to init a k8s cluster (v1.20.0) on Hetzner Cloud with Cilium (1.9.4). I faced similar issues. What solved them for me was (1) taking care to set the appropriate --node-ip (internal network IP) on each node (master + worker) as a kubelet start argument (via a kubelet.service.d drop-in with an extra arg) and (2) creating a subnet of the general Hetzner network for the nodes, plus separate ranges for the pod and service networks. Something like this: Network: 10.0.0.0/8; Subnet: 10.1.0.0/16; Pod net: 10.2.0.0/16; Service net: 10.3.0.0/16. Especially the appropriate setting of the networks was important. I’m not a networking expert, but the masquerading seems to kill every attempt at separating the subnet and the pod/service net into entirely different address ranges (like 192.x or similar).
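
For reference, a minimal sketch of how that kubelet --node-ip argument could be set via a systemd drop-in (assuming a kubeadm-provisioned node; the path and the 10.1.0.2 address are examples, adjust to your node):

# /etc/systemd/system/kubelet.service.d/20-node-ip.conf
[Service]
Environment="KUBELET_EXTRA_ARGS=--node-ip=10.1.0.2"

Then reload and restart: systemctl daemon-reload && systemctl restart kubelet.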