cilium: kube-proxy replacement: LoadBalancer traffic fails from host back to same host
Bug report
General Information
- Cilium version (run `cilium version`)
Client: 1.7.4 c7ee6d62b 2020-05-15T16:07:35+02:00 go version go1.13.10 linux/amd64
Daemon: 1.7.4 c7ee6d62b 2020-05-15T16:07:35+02:00 go version go1.13.10 linux/amd64
- Kernel version (run `uname -a`)
Linux test03.lan 5.6.2-1.el7.elrepo.x86_64 #1 SMP Thu Apr 2 10:55:54 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
- Orchestration system version in use (e.g. `kubectl version`, Mesos, …)
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.3", GitCommit:"2e7996e3e2712684bc73f0dec0200d64eec7fe40", GitTreeState:"clean", BuildDate:"2020-05-20T12:52:00Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T20:55:23Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
- Link to relevant artifacts (policies, deployments scripts, …)
```yaml
# automatically restarts pods to ensure they are controlled by the new CNI driver
operator:
  enabled: true
global:
  k8sServiceHost: "127.0.0.1"
  k8sServicePort: "6443"
  enableXTSocketFallback: false
  prometheus:
    enabled: true
  bpf:
    preallocateMaps: true
  # disabling not ready for primetime yet
  # https://github.com/cilium/cilium/projects/93#column-7748410
  installIptablesRules: true
  # https://docs.cilium.io/en/latest/architecture/#arch-guide
  # https://cilium.io/blog/2019/02/12/cilium-14/#sockmap-bpf-based-sidecar-acceleration-alpha
  # https://www.youtube.com/watch?v=ER9eIXL2_14
  sockops:
    enabled: true
  k8s:
    # cilium pods will not start on a node until the pod CIDR has been assigned
    requireIPv4PodCIDR: true
  # eliminates the need for any kind of BGP stuff;
  # automatically adds routes to each node
  autoDirectNodeRoutes: true
  tunnel: disabled
  kubeProxyReplacement: strict
  hostServices:
    enabled: true
  nodePort:
    enabled: true
    # dsr or snat
    #mode: dsr
    mode: snat
  externalIPs:
    enabled: true
  # dev purposes only
  cleanState: false
  cleanBpfState: true
```
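For completeness, a minimal sketch of how a values file like the one above gets applied, assuming Helm 3, the standard Cilium chart repo, and that the block above is saved as `values.yaml` (the file name is a placeholder):

```bash
# Add the official Cilium chart repo and install 1.7.4
# with the values shown above.
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.7.4 \
  --namespace kube-system \
  -f values.yaml
```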
- Upload a system dump (run `curl -sLO https://github.com/cilium/cilium-sysdump/releases/latest/download/cilium-sysdump.zip && python cilium-sysdump.zip` and then attach the generated zip file)
How to reproduce the issue
I’m using Cilium with metallb and the kube-proxy replacement. I’ve got a pretty big matrix of scenarios I’m testing and most of them work, but we’ve found a situation where certain traffic fails to be handled. I believe I can distill the issue down to: when traffic leaves N1 (Node 1) and comes back to N1P (a pod running on Node 1) without SNAT involved, it fails.
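A minimal sketch of the failing case, assuming a LoadBalancer service with a single backend pod scheduled on N1; the IP and port below are placeholders for the metallb-assigned values:

```bash
# Run on N1, the node hosting the backing pod: this request fails.
# 192.0.2.10 stands in for the LoadBalancer IP assigned by metallb.
LB_IP=192.0.2.10
curl --max-time 5 "http://${LB_IP}:80/"

# The same request issued from N2 (pod still on N1) succeeds,
# which points at the host -> LB IP -> pod-on-same-host return path.
```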
I’ve tried this with both dsr mode (my intended target) and snat mode (less interested in it, but I wanted to try it out). Both fail under the above circumstances. Here’s a pretty crude representation of what I think the traffic flows are, and what works and what fails:
GW = gateway, R = router, NX = node X, NXP = pod running on node X
```
dsr mode

# service with Cluster externalTrafficPolicy
N1 -> GW -> R -> N2 -> N1P: fail
N1 -> GW -> R -> N1 -> N1P: fail
N2 -> GW -> R -> N1 -> N1P: success

# service with Local externalTrafficPolicy
N2 -> GW -> R -> N1 -> N1P: success
N1 -> GW -> R -> N1 -> N1P: fail

snat mode

# service with Cluster externalTrafficPolicy
N1 -> GW -> R -> N2 -> N1P: success
N1 -> GW -> R -> N1 -> N1P: fail

# service with Local externalTrafficPolicy
N2 -> GW -> R -> N1 -> N1P: success
N1 -> GW -> R -> N1 -> N1P: fail
```
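When comparing the scenarios above, it can help to check how the agent has translated the LoadBalancer service in the BPF datapath; a sketch, with the agent pod name as a placeholder:

```bash
# Inspect the service translation on the node under test.
# <cilium-pod> is a placeholder for the cilium agent pod running on N1.
kubectl -n kube-system exec <cilium-pod> -- cilium service list
kubectl -n kube-system exec <cilium-pod> -- cilium bpf lb list
```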
About this issue
- State: closed
- Created 4 years ago
- Comments: 38 (37 by maintainers)
Commits related to this issue
- bpf: split-off LoadBalancer from externalIPs svc type Relying on externalIPs services type for LoadBalancer is not correct, since the expectation is that E-W traffic on a LoadBalancer service must al... — committed to cilium/cilium by borkmann 4 years ago
- bpf: split-off LoadBalancer from externalIPs svc type [ upstream commit f8c245889eb36ab2b4693932f1b28e0d6d276efa ] Relying on externalIPs services type for LoadBalancer is not correct, since the exp... — committed to cilium/cilium by borkmann 4 years ago
Sorry, above comments about source IP are complete noise… I thought I was running in dsr but was not 😦. I’ll go through a deeper round of testing with snat + Local now and report observations.
Yeah, that’s correct. It’s deployed as a DaemonSet, so unless something is off it should be running with dsr, yes.
I’ll kick the agent pods in a bit just to confirm it’s still in a bad state.
Yeah, I have a bigger matrix to go through still (unrelated pods running on N1 and N2, etc. hitting the LB IP, Cluster IP, etc.) once Local is fully up… but here’s what I’ve tested so far (sorry for the crude syntax, let me know if something doesn’t make sense). Everything behaves as expected (at least as I expect) with the exception of: dsr or snat + Local + N1 -> LB IP -> N1P.
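For reference, a sketch of the kind of service under test here; the app name and port are placeholders, and metallb is assumed to hand out the LoadBalancer IP:

```bash
# Create a LoadBalancer service with Local externalTrafficPolicy.
# 'echo' is a hypothetical app; the failing hairpin case also
# reproduces with externalTrafficPolicy: Cluster.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: echo
  ports:
  - port: 80
    targetPort: 80
EOF
```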
Nice work! Looking forward to the final Local piece!

This makes me so happy! I’m tied up most of the day today but I’ll definitely have some feedback in the next day or so. This is super high priority for me, so I really appreciate the help.