calico: Why can't a NodePort service be accessed from the local node in eBPF mode on arm64?

I have a two-node k8s cluster. After enabling eBPF mode in Felix by following the guide (https://projectcalico.docs.tigera.io/maintenance/ebpf/enabling-bpf), I deployed a simple k8s NodePort service backed by two Nginx pod replicas. I can't access the NodePort service from the local node itself, although it can be accessed from other nodes. With kube-proxy, the service can be accessed locally without problems, so the two behaviors differ. Is local access to NodePort services intentionally disabled in eBPF mode, or have I missed something in the eBPF mode setup?

Steps to Reproduce

The deployment YAML file:

    trevor@vm3-arm-tx2-02:~/projects/k8s-cilium-examples$ cat nginx-app-deployment.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      type: NodePort
      ports:
      - port: 80
        protocol: TCP
        name: http
      selector:
        app: nginx
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
            role: backend
        spec:
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80

The 2 nodes here: 10.169.210.108 (node1), 10.169.210.109 (node2)

The generated k8s service:

    nginx   NodePort   172.16.1.3   <none>   80:31064/TCP   48m

From node1:

    $ curl 10.169.210.108:31064
    curl: (7) Failed to connect to 10.169.210.108 port 31064: Connection refused
    $ curl 10.169.210.109:31064
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    ...

I checked with tcpdump (`sudo tcpdump -i any port 31064`). The captured packets show that a TCP RST, ACK had been sent.
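The curl and tcpdump observations above can be repeated from a script. The following is a minimal sketch, not part of the original report: `probe_tcp` is a hypothetical helper that classifies a TCP connect attempt, so that a refused connection (the peer answers the SYN with RST) is distinguished from a timeout or a successful connect.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error: ...'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The connect was rejected with a RST, matching "Connection refused" from curl.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout (for example, packets silently dropped).
        return "timeout"
    except OSError as exc:
        # Any other failure, such as an unreachable network.
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        sock.close()
```

Run on node1 against the addresses from this report, `probe_tcp("10.169.210.108", 31064)` would be expected to return `"refused"` while the problem is present, and `probe_tcp("10.169.210.109", 31064)` would be expected to return `"open"`.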

Calico version: 3.22.1

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

I think I have found the root cause of this issue:

  1. It is mainly due to the error "libbpf: prog 'calico_connect_v4': failed to attach to cgroup: Invalid argument", which shows up in the calico-node log output.
  2. The error mainly occurs on Linux kernel v5.4 and earlier: bpf_program_attach_cgroup relies on the BPF_LINK_CREATE syscall command, which is available in kernel v5.8 and later.
  3. There is a bug in Calico v3.22.1: the return value of bpf_program_attach_cgroup() is always true, so bpf_program_attach_cgroup_legacy(), which is needed on kernel v5.4 and earlier, is never called.
  4. We tested the NodePort service after merging commit bd9ec6581f0a294d16dbc6b997645cd805684195, which is available in Calico v3.23.1; it can now be accessed correctly from the local node.
  5. Both x86_64 and arm64 show the same result (NodePort service not available from the local node) on kernel v5.4 before commit bd9ec6581f0a294d16dbc6b997645cd805684195.
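The fallback described in points 2 and 3 can be sketched as follows. This is an illustration of the pattern only, not Calico's actual code: `attach_with_fallback` and the two callables passed to it are hypothetical stand-ins for the link-based and legacy attach paths, and treating only EINVAL as the signal to fall back is an assumption based on the "Invalid argument" message in the log.

```python
import errno
from typing import Callable


def attach_with_fallback(
    attach_link: Callable[[], None],
    attach_legacy: Callable[[], None],
) -> str:
    """Try the link-based attach first and fall back to the legacy attach.

    Both callables are hypothetical stand-ins that raise OSError on failure.
    Returns which path succeeded: 'link' or 'legacy'.
    """
    try:
        attach_link()
        return "link"
    except OSError as exc:
        # Assumption: an older kernel rejects the link-based attach with EINVAL,
        # as in the "Invalid argument" log line. Other errors are not masked.
        if exc.errno != errno.EINVAL:
            raise
        attach_legacy()
        return "legacy"
```

The important property is that a failure of the first attempt is actually propagated to the caller. If the result of the first attempt is always reported as success, as point 3 describes, the legacy path is never reached and the program stays unattached on older kernels.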