cilium: Cilium is dropping ICMP packets (Invalid packet)

Bug report

After I deploy cilium, ICMP stops working on every cluster node. Running cilium monitor -t drop shows that it is dropping EchoReply packets:

xx drop (Invalid packet) flow 0x318a7847 to endpoint 0, identity 4294967162->0: 10.28.0.1 -> 10.28.36.174 EchoReply
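
When many such lines scroll past, it can help to tally them by reason and address pair. The following is a minimal sketch, assuming the drop lines always follow the layout of the sample above; the regular expression and the helper names parse_drop and count_drops are illustrative and are not part of Cilium.

```python
import re
from collections import Counter

# Layout inferred from the drop lines quoted in this report only;
# it is not taken from a documented cilium monitor output format.
DROP_RE = re.compile(
    r"^xx drop \((?P<reason>[^)]+)\) flow (?P<flow>0x[0-9a-f]+) "
    r"to endpoint (?P<endpoint>\d+), "
    r"identity (?P<src_id>\d+)->(?P<dst_id>\d+): "
    r"(?P<src>\S+) -> (?P<dst>\S+) (?P<summary>.+)$"
)


def parse_drop(line):
    """Return the fields of one drop line as a dict, or None if it does not match."""
    m = DROP_RE.match(line.strip())
    return m.groupdict() if m else None


def count_drops(lines):
    """Count drops per (reason, source, destination, packet summary)."""
    counts = Counter()
    for line in lines:
        d = parse_drop(line)
        if d:
            counts[(d["reason"], d["src"], d["dst"], d["summary"])] += 1
    return counts
```

Feeding it the saved output of cilium monitor -t drop, one line per list element, gives a per-flow count of the dropped replies; lines that are not drop notifications are ignored.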

I’ve tried deploying with the default settings from the quick start guide and with kubeProxyReplacement=strict.

ICMP works only if I set the global.kubeProxyReplacement=disabled option.

General Information

  • Cilium version (run cilium version):
    Client: 1.8.2 aa42034f0 2020-07-23T15:02:39-07:00 go version go1.14.6 linux/amd64
    Daemon: 1.8.2 aa42034f0 2020-07-23T15:02:39-07:00 go version go1.14.6 linux/amd64
    
  • Kernel version (run uname -a)
    Linux m1c31 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    
  • Orchestration system version in use (e.g. kubectl version, Mesos, …):
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:58:59Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:51:04Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
    
  • Link to relevant artifacts (policies, deployments scripts, …)
  • Generate and upload a system zip:

How to reproduce the issue

  1. I’m trying cilium on a PXE-bootable server farm (kubefarm); the image is based on Ubuntu 20.04 and its standard kernel.
  2. install cilium:
    helm upgrade cilium cilium/cilium --version 1.8.2 \
      --namespace kube-system \
      --set global.kubeProxyReplacement=strict \
      --set global.k8sServiceHost=cluster1-kubernetes-apiserver --set global.k8sServicePort=6443 
    
  3. go to any node and try to ping anything

I also tried to collect some debug output with --debug-verbose=datapath and cilium monitor -v:

debug log
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 02: MARK 0x6592b0b7 FROM 1489 DEBUG: Conntrack lookup 1/2: src=10.0.0.115:4240 dst=10.0.4.32:44320
CPU 02: MARK 0x6592b0b7 FROM 1489 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 02: MARK 0x6592b0b7 FROM 1489 DEBUG: CT entry found lifetime=22460, revnat=0
CPU 02: MARK 0x6592b0b7 FROM 1489 DEBUG: CT verdict: Reply, revnat=0
CPU 02: MARK 0x6592b0b7 FROM 1489 DEBUG: Successfully mapped addr=10.0.4.32 to identity=6
CPU 02: MARK 0x6592b0b7 FROM 1489 DEBUG: Encapsulating to node 169616560 (0xa1c24b0) from seclabel 4
-> overlay flow 0x6592b0b7 identity 4->0 state new ifindex cilium_vxlan orig-ip 0.0.0.0: 10.0.0.115:4240 -> 10.0.4.32:44320 tcp ACK
CPU 02: MARK 0x6592b0b7 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:56273 dst=10.28.36.176:8472
CPU 02: MARK 0x6592b0b7 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=17 flags=1
CPU 02: MARK 0x6592b0b7 FROM 2373 DEBUG: CT entry found lifetime=920, revnat=0
CPU 02: MARK 0x6592b0b7 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 02: MARK 0xc24a7312 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.176 to identity=0
CPU 05: MARK 0x6c1c79d FROM 2373 DEBUG: Successfully mapped addr=10.28.36.176 to identity=0
CPU 05: MARK 0x6c1c79d FROM 0 DEBUG: Tunnel decap: id=6 flowlabel=0
CPU 05: MARK 0x6c1c79d FROM 0 DEBUG: Attempting local delivery for container id 1489 from seclabel 6
CPU 05: MARK 0x6c1c79d FROM 1489 DEBUG: Conntrack lookup 1/2: src=10.0.4.32:44320 dst=10.0.0.115:4240
CPU 05: MARK 0x6c1c79d FROM 1489 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 05: MARK 0x6c1c79d FROM 1489 DEBUG: CT entry found lifetime=22475, revnat=0
CPU 05: MARK 0x6c1c79d FROM 1489 DEBUG: CT verdict: Established, revnat=0
-> endpoint 1489 flow 0x6c1c79d identity 6->4 state established ifindex lxc_health orig-ip 10.0.4.32: 10.0.4.32:44320 -> 10.0.0.115:4240 tcp ACK
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
xx drop (Invalid packet) flow 0xd2e687fb to endpoint 0, identity 4294967162->0: 10.28.0.1 -> 10.28.36.174 EchoReply
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22471, revnat=0
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 03: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Successfully mapped addr=10.28.36.70 to identity=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.70:6443 dst=10.28.36.174:40554
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x59425406 FROM 2373 DEBUG: CT verdict: Reply, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 1/2: src=10.28.36.174:40554 dst=10.28.36.70:6443
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT entry found lifetime=22477, revnat=0
CPU 06: MARK 0x55e677e4 FROM 2373 DEBUG: CT verdict: Established, revnat=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
xx drop (Invalid packet) flow 0xd2e687fb to endpoint 0, identity 4294967162->0: 10.28.0.1 -> 10.28.36.174 EchoReply
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Inheriting identity=1 from stack
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Attempting local delivery for container id 2470 from seclabel 1
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.207:49708 dst=10.0.0.58:8181
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT verdict: New, revnat=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack create: proxy-port=0 revnat=0 src-identity=1 lb=0.0.0.0
-> endpoint 2470 flow 0x17ece03b identity 1->5843 state new ifindex lxcb20428011962 orig-ip 10.0.0.207: 10.0.0.207:49708 -> 10.0.0.58:8181 tcp SYN
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.58:8181 dst=10.0.0.207:49708
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: CT entry found lifetime=939, revnat=0
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: CT verdict: Reply, revnat=0
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
-> stack flow 0x8179cafa identity 5843->1 state reply ifindex 0 orig-ip 0.0.0.0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp SYN, ACK
-> 0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp SYN, ACK
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Inheriting identity=1 from stack
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Attempting local delivery for container id 2470 from seclabel 1
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.207:49708 dst=10.0.0.58:8181
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT entry found lifetime=939, revnat=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT verdict: Established, revnat=0
-> endpoint 2470 flow 0x17ece03b identity 1->5843 state established ifindex lxcb20428011962 orig-ip 10.0.0.207: 10.0.0.207:49708 -> 10.0.0.58:8181 tcp ACK
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Inheriting identity=1 from stack
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Attempting local delivery for container id 2470 from seclabel 1
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.207:49708 dst=10.0.0.58:8181
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT entry found lifetime=22479, revnat=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT verdict: Established, revnat=0
-> endpoint 2470 flow 0x17ece03b identity 1->5843 state established ifindex lxcb20428011962 orig-ip 10.0.0.207: 10.0.0.207:49708 -> 10.0.0.58:8181 tcp ACK
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.58:8181 dst=10.0.0.207:49708
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: CT entry found lifetime=22479, revnat=0
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: CT verdict: Reply, revnat=0
CPU 01: MARK 0x8179cafa FROM 2470 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
-> 0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp ACK
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.58:8181 dst=10.0.0.207:49708
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: CT entry found lifetime=22479, revnat=0
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: CT verdict: Reply, revnat=0
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
-> stack flow 0x8179cafa identity 5843->1 state reply ifindex 0 orig-ip 0.0.0.0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp ACK
-> 0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp ACK
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Inheriting identity=1 from stack
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
CPU 01: MARK 0x17ece03b FROM 2373 DEBUG: Attempting local delivery for container id 2470 from seclabel 1
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.58:8181 dst=10.0.0.207:49708
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: CT entry found lifetime=22479, revnat=0
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: CT verdict: Reply, revnat=0
CPU 04: MARK 0x8179cafa FROM 2470 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
-> stack flow 0x8179cafa identity 5843->1 state reply ifindex 0 orig-ip 0.0.0.0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp ACK, FIN
-> 0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp ACK, FIN
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.207:49708 dst=10.0.0.58:8181
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT entry found lifetime=22479, revnat=0
CPU 01: MARK 0x17ece03b FROM 2470 DEBUG: CT verdict: Established, revnat=0
CPU 02: MARK 0x17ece03b FROM 2373 DEBUG: Inheriting identity=1 from stack
CPU 02: MARK 0x17ece03b FROM 2373 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
CPU 02: MARK 0x17ece03b FROM 2373 DEBUG: Attempting local delivery for container id 2470 from seclabel 1
CPU 02: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.207:49708 dst=10.0.0.58:8181
CPU 02: MARK 0x17ece03b FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=0
CPU 02: MARK 0x17ece03b FROM 2470 DEBUG: CT entry found lifetime=22479, revnat=0
CPU 02: MARK 0x17ece03b FROM 2470 DEBUG: CT verdict: Established, revnat=0
-> endpoint 2470 flow 0x17ece03b identity 1->5843 state established ifindex lxcb20428011962 orig-ip 10.0.0.207: 10.0.0.207:49708 -> 10.0.0.58:8181 tcp ACK, FIN
CPU 02: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 1/2: src=10.0.0.58:8181 dst=10.0.0.207:49708
CPU 02: MARK 0x8179cafa FROM 2470 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 02: MARK 0x8179cafa FROM 2470 DEBUG: CT entry found lifetime=889, revnat=0
CPU 02: MARK 0x8179cafa FROM 2470 DEBUG: CT verdict: Reply, revnat=0
CPU 02: MARK 0x8179cafa FROM 2470 DEBUG: Successfully mapped addr=10.0.0.207 to identity=1
-> 0: 10.0.0.58:8181 -> 10.0.0.207:49708 tcp ACK
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 07: MARK 0xde444e0f FROM 2373 DEBUG: Successfully mapped addr=10.28.36.20 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
CPU 04: MARK 0xd829242c FROM 2373 DEBUG: Successfully mapped addr=10.28.20.107 to identity=0
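
A capture like this is easier to read once the debug lines are grouped per flow. The sketch below assumes every debug line has the "CPU nn: MARK 0x... FROM nnn DEBUG: message" shape seen above; the function name group_by_mark is hypothetical, and treating the MARK value as a flow id is an assumption based on the notification lines in this capture, which appear to reuse the same values.

```python
import re
from collections import defaultdict

# Shape inferred from the debug lines in this capture only.
DEBUG_RE = re.compile(
    r"^CPU (?P<cpu>\d+): MARK (?P<mark>0x[0-9a-f]+) "
    r"FROM (?P<source>\d+) DEBUG: (?P<msg>.*)$"
)


def group_by_mark(lines):
    """Map each MARK value to its (FROM id, message) pairs, in capture order."""
    flows = defaultdict(list)
    for line in lines:
        m = DEBUG_RE.match(line.strip())
        if m:
            flows[m["mark"]].append((int(m["source"]), m["msg"]))
    return dict(flows)
```

Looking up 0xd2e687fb, the flow value printed on the drop lines, in the grouped result for this excerpt returns nothing: no debug line in the capture carries that MARK.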

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 34 (33 by maintainers)

Most upvoted comments

replace the linked revalidate_data(…) with revalidate_data_first(…)?

working, no packet drops anymore

So, the brb0/cilium:skb-pull-v0 image has resolved all cases

It is, thank you!

Cool, thanks for debugging this! Let us discuss tomorrow a possible fix.

OK, let’s start from the beginning. I have cleanly bootstrapped nodes on Ubuntu 20.04 with the 5.4.0-42-generic kernel.

Then I deploy cilium using helm:

helm upgrade --install cilium cilium/cilium --version 1.8.2 \
  --namespace kube-system \
  --set global.kubeProxyReplacement=strict \
  --set global.k8sServiceHost=cluster1-kubernetes-apiserver --set global.k8sServicePort=6443 

result: ping is not working

Then I deploy cilium with --set global.kubeProxyReplacement=disabled

helm upgrade --install cilium cilium/cilium --version 1.8.2 \
  --namespace kube-system \
  --set global.kubeProxyReplacement=disabled \
  --set global.k8sServiceHost=cluster1-kubernetes-apiserver --set global.k8sServicePort=6443 

and do:

kubectl delete pod -n kube-system -l k8s-app=cilium

result: ping is working

Then I deploy cilium with

helm upgrade --install cilium cilium/cilium --version 1.8.2 \
  --namespace kube-system \
  --set global.kubeProxyReplacement=strict \
  --set global.k8sServiceHost=cluster1-kubernetes-apiserver --set global.k8sServicePort=6443 \
  --set global.tunnel=disabled \
  --set global.autoDirectNodeRoutes=true \
  --set global.nativeRoutingCIDR=10.112.0.0/12 \
  --set global.ipam.operator.clusterPoolIPv4PodCIDR=10.112.0.0/12 \
  --set global.ipam.operator.clusterPoolIPv4MaskSize=24

and do:

kubectl delete pod -n kube-system -l k8s-app=cilium

result: ping is not working

Then I deploy cilium with the image from https://github.com/cilium/cilium/issues/12854#issuecomment-675525385

kubectl set image -n kube-system ds/cilium cilium-agent=brb0/cilium:skb-pull-v0
kubectl delete pod -n kube-system -l k8s-app=cilium

result: ping is working

Then I deploy cilium with the original v1.8.2 image, but with IPv6 enabled:

kubectl delete ds -n kube-system cilium
helm upgrade --install cilium cilium/cilium --version 1.8.2 \
  --namespace kube-system \
  --set global.kubeProxyReplacement=strict \
  --set global.k8sServiceHost=cluster1-kubernetes-apiserver --set global.k8sServicePort=6443 \
  --set global.ipv6.enabled=true \
  --set global.k8s.requireIPv6PodCIDR=false \
  --set global.tunnel=disabled \
  --set global.autoDirectNodeRoutes=true \
  --set global.nativeRoutingCIDR=10.112.0.0/12 \
  --set global.ipam.operator.clusterPoolIPv4PodCIDR=10.112.0.0/12 \
  --set global.ipam.operator.clusterPoolIPv4MaskSize=24 \
  --set global.ipam.operator.clusterPoolIPv6PodCIDR=fd00::/104 \
  --set global.ipam.operator.clusterPoolIPv6MaskSize=112

result: ping is not working

I also have another machine with the same hardware but a different OS: Debian 10 with the 5.4.44-2-pve kernel. I installed cilium there and did not hit this issue, maybe because the main interface is added to a bond and the bond interface is added to a Linux bridge, so cilium uses the Linux bridge instead of the raw interface.

Hmm, interesting. Could you repeat the experiment with cilium_dbg by adding it to to-netdev in bpf_host.c and to-overlay in bpf_overlay.c, but this time using the https://github.com/cilium/cilium/tree/pr/brb/pull_skb branch?

This branch is not accessible to me

@brb this image is working fine 🎉

brb0/cilium:skb-pull-v0

(digest sha256:8415801ca4eaa53078b63b86ef5c7cb53e29c3a225b78d28161476c36e1f8f7c)

@kvaps Can you please check whether the issue is resolved with the brb0/cilium:skb-pull-v0 Docker image?

Thanks for the confirmation @kvaps, this really helps! This means that mlx4 only has the Ethernet header in the linear skb section, and we need to pull in data up to the L4 header for this to work.
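
To make the reasoning above concrete, here is a toy model in plain Python. It is not Cilium's datapath code: ToyPacket, validate and validate_with_pull are hypothetical names, the header sizes assume IPv4 without options and an ICMP echo header, and any correspondence to revalidate_data or to a variant that pulls first is only an assumption drawn from this thread. It shows a bounds check failing when only the Ethernet header is directly readable and succeeding once the headers are pulled in.

```python
ETH_HLEN = 14    # Ethernet header
IPV4_HLEN = 20   # IPv4 header without options (assumption)
ICMP_HLEN = 8    # ICMP echo header


class ToyPacket:
    """Toy packet buffer: only the first linear_len bytes are directly readable."""

    def __init__(self, total_len, linear_len):
        self.total_len = total_len
        self.linear_len = min(linear_len, total_len)

    def pull(self, n):
        """Make the first n bytes directly readable; fail if the packet is shorter."""
        if n > self.total_len:
            return False
        self.linear_len = max(self.linear_len, n)
        return True


def validate(pkt, need):
    """Bounds check only: pass if `need` bytes are already in the linear area."""
    return pkt.linear_len >= need


def validate_with_pull(pkt, need):
    """Pull the required bytes first, then run the same bounds check."""
    return pkt.pull(need) and validate(pkt, need)
```

With a 98-byte echo reply whose linear area holds only the 14-byte Ethernet header, validate(pkt, ETH_HLEN + IPV4_HLEN + ICMP_HLEN) fails, which in this model is what surfaces as an "Invalid packet" drop, while validate_with_pull succeeds on the same packet.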

For now, can you replace the linked revalidate_data(..) with revalidate_data_first(..)?

Any ideas how I can debug this?

I’d suggest adding a cilium_dbg(..) call before each relevant DROP_INVALID and checking with cilium monitor -v which path leads to the drops.