cilium: Service CT entries leaking
Is there an existing issue for this?
- I have searched the existing issues
What happened?
Accessing a clusterIP from a pod creates a service CT entry in the CT map with a long timeout, because the service CT entry never sees the close in the INGRESS direction. I think this is a critical bug.
$ cilium bpf ct list global |grep 192.168.15.94
TCP OUT 192.168.15.94:80 -> 172.16.0.16:60462 service expires=113956 RxPackets=0 RxBytes=4209 RxFlagsSeen=0x00 LastRxReport=0 TxPackets=0 TxBytes=0 TxFlagsSeen=0x1b LastTxReport=92356 Flags=0x0012 [ TxClosing SeenNonSyn ] RevNAT=116 SourceSecurityID=0 IfIndex=0
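To make the state of that entry easier to read, here is a minimal C sketch that decodes the Flags field. The TxClosing and SeenNonSyn bit positions are inferred from the dump above (0x0012 printed as TxClosing SeenNonSyn). The RxClosing position and the helper name ct_half_closed are assumptions made for illustration, not Cilium's actual definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions inferred from the dump (0x0012 -> TxClosing, SeenNonSyn).
 * CT_FLAG_RX_CLOSING at bit 0 is an assumption for this sketch. */
enum {
    CT_FLAG_RX_CLOSING   = 1u << 0, /* assumed position */
    CT_FLAG_TX_CLOSING   = 1u << 1,
    CT_FLAG_SEEN_NON_SYN = 1u << 4,
};

/* Hypothetical helper: true when exactly one direction has seen a close,
 * which is the state the leaked service entries are stuck in. */
static bool ct_half_closed(uint16_t flags)
{
    bool rx = (flags & CT_FLAG_RX_CLOSING) != 0;
    bool tx = (flags & CT_FLAG_TX_CLOSING) != 0;
    return rx != tx;
}
```

With the value from the dump, ct_half_closed(0x0012) is true: the TX side is closing but the RX side never was marked.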
If such connections are made very frequently, for example by a high-load application, the number of service CT entries grows significantly.
$ cilium bpf ct list global |grep 192.168.15.94 |grep service |wc -l
14116
Over time the CT map fills up, and new connections are then reset, as in https://github.com/cilium/cilium/issues/17457
I plan to fix it with the following logic:
- After deNAT we have the clusterIP; look up the CT map again with the new tuple, and if an entry exists, set entry->rx_closing to 1.
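The logic above could be sketched roughly as follows. This is a user-space model in plain C, not the actual BPF datapath code: the array-backed map, the struct layouts, and the names ct_lookup and rev_nat_and_mark_closing are all hypothetical. It also assumes the flag is only set when the reply carries FIN or RST, which the description above does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_RST 0x04

/* Simplified 4-tuple; service entries are keyed as clusterIP -> client,
 * mirroring the orientation shown in the dump. */
struct tuple { uint32_t saddr, daddr; uint16_t sport, dport; };

struct ct_entry {
    struct tuple key;
    bool in_use, is_service, rx_closing, tx_closing;
};

static bool tuple_eq(const struct tuple *a, const struct tuple *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Linear scan standing in for a BPF hash map lookup. */
static struct ct_entry *ct_lookup(struct ct_entry *map, size_t n,
                                  const struct tuple *t)
{
    for (size_t i = 0; i < n; i++)
        if (map[i].in_use && tuple_eq(&map[i].key, t))
            return &map[i];
    return NULL;
}

/* Reply path: rewrite the backend source to the service address (deNAT),
 * then look up the service entry with the new tuple and mark it closing.
 * Returns true if a service entry was updated. */
static bool rev_nat_and_mark_closing(struct ct_entry *map, size_t n,
                                     struct tuple *reply,
                                     uint32_t svc_addr, uint16_t svc_port,
                                     uint8_t tcp_flags)
{
    reply->saddr = svc_addr;
    reply->sport = svc_port;

    if (!(tcp_flags & (TCP_FLAG_FIN | TCP_FLAG_RST)))
        return false; /* not a closing packet (assumed condition) */

    struct ct_entry *e = ct_lookup(map, n, reply);
    if (!e || !e->is_service)
        return false;

    e->rx_closing = true;
    return true;
}
```

Once both tx_closing and rx_closing are set, the entry can be given the short close timeout instead of lingering with the long one.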
Cilium Version
Client: 1.9.0 go version go1.15.4 linux/amd64
Daemon: 1.9.0 go version go1.15.4 linux/amd64
Kernel Version
Linux 4.14.105-19-0019 SMP Fri Jan 15 11:39:34 CST 2021 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.18", GitCommit:"6b913dbde30aa95b247be30a5318fb912f8fe29e", GitTreeState:"clean", BuildDate:"2021-08-11T10:20:21Z", GoVersion:"go1.15.11", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.18-57+776098ae2e7bf3-dirty", GitCommit:"776098ae2e7bf358cce0af0b0faf139fe66c6c48", GitTreeState:"dirty", BuildDate:"2021-09-01T07:38:52Z", GoVersion:"go1.15.11", Compiler:"gc", Platform:"linux/amd64"}
Sysdump
No response
Relevant log output
No response
Anything else?
No response
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- State: closed
- Created 2 years ago
- Comments: 17 (11 by maintainers)
@BSWANG https://github.com/cilium/cilium/pull/19451 and https://github.com/cilium/cilium/pull/19800 should help you with this problem.
@pchaigno https://github.com/cilium/cilium/blob/master/bpf/lib/conntrack.h#L266