cilium: Graceful shutdown not working when externalTrafficPolicy=Local or when replica count is 1
Is there an existing issue for this?
- I have searched the existing issues
What happened?
- Run Cilium in kube-proxy replacement strict mode with graceful termination enabled.
- Create a Deployment and a LoadBalancer Service as per the examples below.
- Either the Service has externalTrafficPolicy=Local and only one pod is allowed to run on a node at any given time,
- or, if it is set to Cluster, the Deployment must have only one replica running.
- To simplify testing and debugging, connect directly to the nodePort of the Service, bypassing the LB.
- Send a message to verify connectivity.
- Delete the pod gracefully.
- Send another message over the existing connection while the pod is in the Terminating state.
- The client gets a TCP RST. A tcpdump shows the RST originating from nodeIP:nodePort.
I’ve looked at Hubble and the monitor logs; neither shows the packet that triggers the RST, and no drops are reported.
----
apiVersion: v1
kind: Service
metadata:
  name: tcp-echo-service
  namespace: default
  labels:
    app: test-echo-server
spec:
  externalTrafficPolicy: Local
  type: LoadBalancer
  ports:
    - port: 443
      targetPort: 5001
      protocol: TCP
      name: tcp
  selector:
    app: test-echo-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-echo-server
  namespace: default
  labels:
    app: test-echo-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-echo-server
  template:
    metadata:
      labels:
        app: test-echo-server
    spec:
      terminationGracePeriodSeconds: 3600
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: test-echo-server
      containers:
        - name: echo-server
          image: "vhiribarren/echo-server:latest"
          ports:
            - name: tcp
              containerPort: 5001
              protocol: TCP
          lifecycle:
            preStop:
              exec:
                command:
                  ['/bin/sh', '-c', 'echo preStop executing && sleep 3600']
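The reproduction steps above can be sketched as a small client. This is only an illustration, not the tool used in the report: the function name `probe_existing_connection`, the message contents, and the pause length are hypothetical, and it assumes the server echoes back whatever it receives.

```python
import socket
import time


def probe_existing_connection(host, port, pause_s=0.0, timeout_s=5.0):
    """Send one message, pause, then send a second on the same TCP connection.

    Returns "ok" if both messages get a reply, "reset" if the peer answers
    with a TCP RST, and "closed" if the peer closes the connection cleanly.
    """
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        try:
            # Step 1: verify connectivity while the pod is still running.
            sock.sendall(b"before-delete\n")
            if not sock.recv(4096):
                return "closed"
            # Step 2: window in which to delete the pod by hand
            # (kubectl delete pod ...), so it enters the Terminating state.
            time.sleep(pause_s)
            # Step 3: reuse the existing connection.
            sock.sendall(b"during-termination\n")
            if not sock.recv(4096):
                return "closed"
        except (ConnectionResetError, BrokenPipeError):
            return "reset"
    return "ok"
```

Calling, for example, `probe_existing_connection("10.241.93.59", 4300, pause_s=30)` and deleting the pod during the pause would be expected to return "reset" if the behaviour described in this issue occurs, and "ok" if graceful termination keeps the existing connection alive.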
Cilium Version
Client: 1.12.0 9447cd1 2022-07-19T12:22:00+02:00 go version go1.18.4 linux/amd64
Daemon: 1.12.0 9447cd1 2022-07-19T12:22:00+02:00 go version go1.18.4 linux/amd64
Kernel Version
Linux 5.15.0-46-generic #49~20.04.1-Ubuntu SMP Thu Aug 4 19:15:44 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
1.23.5
Sysdump
I can provide selective output from sysdump on request.
Relevant log output
The following output and the log files were gathered after starting termination of the test pod, and before sending a packet which triggers the RST.
// NodePort=4300
>> kubectl exec -it -n kube-system cilium-xg5nt -- cilium service list | grep 4300
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init)
431 10.241.93.59:4300 NodePort 1 => 172.18.2.158:5001 (terminating)
433 172.31.248.52:4300 NodePort 1 => 172.18.2.158:5001 (terminating)
435 0.0.0.0:4300 NodePort 1 => 172.18.2.158:5001 (terminating)
--------
// PodIP=172.18.2.158
>> kubectl exec -it -n kube-system cilium-xg5nt -- cilium bpf ct list global | grep 172.18.2.158
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init)
TCP IN 10.148.82.115:62451 -> 172.18.2.158:5001 expires=7827090 RxPackets=8 RxBytes=555 RxFlagsSeen=0x1a LastRxReport=7805489 TxPackets=7 TxBytes=629 TxFlagsSeen=0x1a LastTxReport=7805489 Flags=0x0030 [ SeenNonSyn NodePort ] RevNAT=0 SourceSecurityID=2 IfIndex=0
TCP OUT 10.148.82.115:62451 -> 172.18.2.158:5001 expires=7827090 RxPackets=7 RxBytes=629 RxFlagsSeen=0x1a LastRxReport=7805489 TxPackets=8 TxBytes=555 TxFlagsSeen=0x1a LastTxReport=7805489 Flags=0x0030 [ SeenNonSyn NodePort ] RevNAT=431 SourceSecurityID=2 IfIndex=3
--------
// NodePort=4300
>> kubectl exec -it -n kube-system cilium-xg5nt -- cilium bpf ct list global | grep 4300
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init)
TCP OUT 10.241.93.59:4300 -> 10.148.82.115:62451 service expires=7827090 RxPackets=0 RxBytes=1444 RxFlagsSeen=0x00 LastRxReport=0 TxPackets=0 TxBytes=0 TxFlagsSeen=0x1a LastTxReport=7805489 Flags=0x0010 [ SeenNonSyn ] RevNAT=431 SourceSecurityID=0 IfIndex=0
TCP OUT 172.31.248.52:34300 -> 172.31.248.74:4240 expires=7827599 RxPackets=236902 RxBytes=19188898 RxFlagsSeen=0x1a LastRxReport=7805999 TxPackets=142137 TxBytes=11844706 TxFlagsSeen=0x1a LastTxReport=7805999 Flags=0x0010 [ SeenNonSyn ] RevNAT=0 SourceSecurityID=0 IfIndex=0
UDP OUT 172.20.0.3:53 -> 172.18.2.130:43007 service expires=7805992 RxPackets=0 RxBytes=162 RxFlagsSeen=0x00 LastRxReport=0 TxPackets=0 TxBytes=0 TxFlagsSeen=0x00 LastTxReport=7805932 Flags=0x0000 [ ] RevNAT=98 SourceSecurityID=0 IfIndex=0
UDP OUT 172.18.2.130:43007 -> 172.18.0.47:53 expires=7805992 RxPackets=1 RxBytes=172 RxFlagsSeen=0x00 LastRxReport=7805932 TxPackets=1 TxBytes=79 TxFlagsSeen=0x00 LastTxReport=7805932 Flags=0x0000 [ ] RevNAT=98 SourceSecurityID=59251 IfIndex=0
--------
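To make the connection-tracking entries above easier to read, the `RxFlagsSeen` and `TxFlagsSeen` values can be decoded into flag names. The sketch below assumes these fields use the standard TCP header flag bits (FIN=0x01, SYN=0x02, RST=0x04, PSH=0x08, ACK=0x10, URG=0x20); that mapping is an assumption about the output format, and the helper names are hypothetical.

```python
import re

# Standard TCP header flag bits; assumed to be what the *FlagsSeen fields hold.
TCP_FLAG_BITS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def decode_tcp_flags(value):
    """Return the names of the TCP flags set in an accumulated flags value."""
    return [name for bit, name in TCP_FLAG_BITS if value & bit]


def flags_seen(ct_line):
    """Extract and decode RxFlagsSeen/TxFlagsSeen from one `cilium bpf ct list` line."""
    fields = dict(re.findall(r"(\w+)=(0x[0-9a-fA-F]+)", ct_line))
    return {
        key: decode_tcp_flags(int(fields[key], 16))
        for key in ("RxFlagsSeen", "TxFlagsSeen")
        if key in fields
    }
```

Under that assumption, `0x1a` decodes to SYN, PSH and ACK, which would mean the entries for 172.18.2.158:5001 had recorded neither a FIN nor an RST at the time the output was captured.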
Anything else?
endpoint.log agent-log.log filtered on endpointID=27
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- State: closed
- Created a year ago
- Comments: 26 (12 by maintainers)
Thanks @aojea, I’m planning to test this soon, hopefully before the end of the week.
For visibility: I think this may be solved by https://github.com/cilium/cilium/pull/24174, but I don’t have time to test it. I can confirm that graceful shutdown is working fine with my patch, but I don’t know if there are edge cases.
I’m parking it in my queue. @jonahmurphy FYI: I don’t have cycles at the moment, but I’ll try to get to reproducing the issue when I have time.