cilium: Pod-to-pod communication not working with IPVLAN
Bug report
General Information
- Cilium version (run `cilium version`):
  Client: 1.9.0 dcda3fd98 2020-11-10T13:24:55+01:00 go version go1.15.4 linux/amd64
  Daemon: 1.9.0 dcda3fd98 2020-11-10T13:24:55+01:00 go version go1.15.4 linux/amd64
- Kernel version (run `uname -a`):
  Linux bert 5.10.0 #1 SMP Mon Feb 15 13:49:14 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
- Orchestration system version in use (e.g. `kubectl version`, Mesos, ...):
  Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:28:09Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
  Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0", GitCommit:"af46c47ce925f4c4ad5cc8d1fca46c7b77d13b38", GitTreeState:"clean", BuildDate:"2020-12-08T17:51:19Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
- Link to relevant artifacts (policies, deployment scripts, ...)
- Generate and upload a system zip:
  `curl -sLO https://git.io/cilium-sysdump-latest.zip && python cilium-sysdump-latest.zip`
  Uploaded: cilium-sysdump-20210218-130424.zip
How to reproduce the issue
- Create a cluster with 2 nodes.
- Create 2 pods, one on each node.
- Run NGINX in one pod and fortio curl in the other.
- TCP connection setup fails.
When I run `cilium monitor` on both nodes, I only see TCP SYN packets on the node where fortio is running, with no response from the other node. On the other node, the TCP SYN packets are not visible at all (neither with `cilium monitor` nor with tcpdump).
The problem also occurs when both pods are on the same node.
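A minimal repro sketch of the two pods described above. The node names, image tags, and pod names are assumptions; adjust them to your cluster. After both pods are Running, the failing request can be reproduced with `kubectl exec fortio -- fortio curl http://<nginx-pod-ip>/`.

```yaml
# Server pod pinned to the first node (node name is an assumption)
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: node-1
  containers:
  - name: nginx
    image: nginx:1.19
---
# Client pod pinned to the second node (node name is an assumption);
# the fortio container runs its server mode so the pod stays up for `kubectl exec`.
apiVersion: v1
kind: Pod
metadata:
  name: fortio
spec:
  nodeName: node-2
  containers:
  - name: fortio
    image: fortio/fortio
    args: ["server"]
```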
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 20 (7 by maintainers)
I run in L3 mode.
Correct. The higher CPU utilization is solved. For some reason I thought that as long as no `cilium monitor` command was running, the performance impact would be low even with monitor aggregation set to none. Now I see about 25% higher performance with ipvlan vs. veth. The only unexpected behavior is lower performance (for both ipvlan and veth) when setting CONFIG_PREEMPT_NONE=y.
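For reference, the aggregation level discussed here is set via the `monitor-aggregation` key in the `cilium-config` ConfigMap; a sketch of the relevant fragment (the documented values are none, low, medium, and maximum):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  # "none" emits a trace event for every packet, which costs CPU
  # even when no `cilium monitor` client is attached.
  monitor-aggregation: "medium"
```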
Also, if you can use ENABLE_REDIRECT_FAST, then there is not much gain in using ipvlan. See https://cilium.io/blog/2020/11/10/cilium-19#virtual-ethernet-device-optimization-with-ebpf.
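For context, the ipvlan datapath being compared against veth here is selected through the agent configuration in Cilium 1.9; a sketch of the relevant ConfigMap keys (the master device name is an assumption for your host NIC):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cilium-config
  namespace: kube-system
data:
  # Use the ipvlan datapath instead of the default veth mode
  datapath-mode: "ipvlan"
  # assumption: the host interface used as the ipvlan master device
  ipvlan-master-device: "eth0"
```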