calico: in eBPF + VXLAN mode (and WireGuard), services cannot be accessed from outside the cluster

Expected Behavior

Services of type LoadBalancer or NodePort can be accessed normally, wherever the backend pods are located, and the source IP is preserved.

Current Behavior

Services of type LoadBalancer or NodePort cannot be reliably accessed from outside the cluster:

  • From inside the cluster, the service works on every node 100% of the time (whether or not the node hosts a backend pod)
  • From outside the cluster, many connections time out; the service can (sometimes) be reached when a backend pod is located on the node being accessed

Steps to Reproduce (for bugs)

  • Install Kubernetes
  • Install keepalived to move a VIP between nodes
  • Install Calico (see manifest and config below)
  • Install MetalLB (controller only, see config below)
  • Create a Service of type LoadBalancer with backend pods (in my case the nginx ingress controller); see the sketch after this list
  • curl http://<public_VIP> multiple times. Many requests fail; they only succeed when the node that holds the VIP also has a backend pod.
  • curl http://<node_IP>:<node_port> shows the same behaviour.
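
A minimal sketch of the kind of Service used in the step above (the name, labels and ports here are illustrative placeholders, not the exact nginx ingress controller manifest):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx                        # illustrative name
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx    # illustrative label
  ports:
  - name: http
    port: 80
    targetPort: 80
    # the <node_port> used in the last step is allocated automatically,
    # or can be pinned explicitly with nodePort: 30080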

Context

Each node has two interfaces (besides the ones created by Calico):

  • eth0:
    • Public IP from the provider
    • may receive the VIP
  • wg1 (set up manually via Ansible, not by Calico):
    • Private IP that meshes the whole cluster (used as the kubelet node IP)

Felix Config

apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  bpfEnabled: true
  bpfExternalServiceMode: Tunnel
  bpfLogLevel: Debug
  ipipEnabled: false
  logSeverityScreen: Info
  reportingInterval: 0s
  vxlanEnabled: true
  vxlanMTU: 1370
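
For reference, the other documented value for bpfExternalServiceMode is DSR; a minimal sketch of that variant (not the configuration in use here, and it assumes reply traffic from every node can reach external clients directly):

apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  bpfEnabled: true
  # DSR sends reply packets straight from the backend node to the client
  # instead of tunnelling them back through the node that received the request.
  bpfExternalServiceMode: DSR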

Default and single IPPool:

apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  blockSize: 26
  cidr: 10.1.128.0/17
  ipipMode: Never
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Always

Manifest (Diff from calico-vxlan.yaml)

--- arch-cloud/roles/kube-calico/calico-vxlan.yaml	2021-08-02 22:25:06.359634474 +0200
+++ calico-vxlan.yaml	2021-08-06 23:39:51.393404973 +0200
@@ -15,7 +15,7 @@
   # Configure the MTU to use for workload interfaces and tunnels.
   # By default, MTU is auto-detected, and explicitly setting this field should not be required.
   # You can override auto-detection by providing a non-zero value.
-  veth_mtu: "0"
+  veth_mtu: "1370"
 
   # The CNI network configuration to install on each node. The special
   # values in this config will be automatically populated.
@@ -3879,14 +3879,11 @@
                 configMapKeyRef:
                   name: calico-config
                   key: veth_mtu
-            # Disable AWS source-destination check on nodes.
-            - name: FELIX_AWSSRCDSTCHECK
-              value: Disable
             # The default IPv4 pool to create on startup if none exists. Pod IPs will be
             # chosen from this range. Changing this value after installation will have
             # no effect. This should fall within `--cluster-cidr`.
-            # - name: CALICO_IPV4POOL_CIDR
-            #   value: "192.168.0.0/16"
+            - name: CALICO_IPV4POOL_CIDR
+              value: "10.1.128.0/17"
             # Disable file logging so `kubectl logs` works.
             - name: CALICO_DISABLE_FILE_LOGGING
               value: "true"
@@ -3970,7 +3967,7 @@
         # Used to install CNI.
         - name: cni-bin-dir
           hostPath:
-            path: /opt/cni/bin
+            path: /usr/lib/cni/
         - name: cni-net-dir
           hostPath:
             path: /etc/cni/net.d

MetalLB chart values:

  configInline:
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - <Public_VIP>/32
  speaker:
    enabled: false
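
Since the speaker is disabled, MetalLB only allocates the address here; keepalived is what actually moves the VIP between nodes. A minimal, illustrative sketch of pinning the Service to the pool's single address (spec.loadBalancerIP is the standard Kubernetes field; the rest mirrors the example further up):

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx                        # illustrative name, as above
spec:
  type: LoadBalancer
  loadBalancerIP: <Public_VIP>               # the single address from the MetalLB pool
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
  - port: 80
    targetPort: 80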

Your Environment

  • Calico version: 3.19.3 and 3.20.0 (at least)
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.21
  • Operating System and version: Archlinux (latest)
  • Cloud provider: oneprovider.com (Region: Paris, which in this case looks like old online.net/scaleway servers rebranded)

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 18 (7 by maintainers)

Most upvoted comments

What @apsega describes also perfectly matches the problem I see. When I set up the wg1 interface (just set it up, via systemd-networkd configuring the peers and an IP), it stops working 4 or 5 seconds later. I can get the traffic flowing normally again by doing:

# remove the systemd-networkd units that define wg1
rm /etc/systemd/network/wg1.netdev
rm /etc/systemd/network/wg1.network
systemctl restart systemd-networkd
# bring the interface down and delete it
ip link set down dev wg1
ip link del dev wg1
# restart calico-node so it picks up the interface change
kubectl -n kube-system rollout restart ds/calico-node

Note: I’ve added a 50-interfaces-exception.network which contains:

[Match]
Name=cali* vxlan.calico

[Link]
Unmanaged=yes

I was about to investigate why WireGuard would cause such behaviour, but thanks to @apsega (what timing!), I assume that WireGuard itself is not at fault here.

wg1------I: Drop packets with IP options

cc @mazdakn