cilium: Conntrack entry mismatch leads to policy enforcement on reply packet for service loopback case
Is there an existing issue for this?
- I have searched the existing issues
What happened?
Steps to reproduce:
minikube start --network-plugin=cni --cni=false --memory=4096 --kubernetes-version=v1.22.8
cilium install --version=v1.10.5
kubectl apply -f - <<EOF
apiVersion: v1
kind: ReplicationController
metadata:
  name: guestbook
  labels:
    k8s-app.guestbook: web
spec:
  replicas: 1
  selector:
    k8s-app.guestbook: web
  template:
    metadata:
      labels:
        k8s-app.guestbook: web
    spec:
      containers:
      - image: gcr.io/google-samples/gb-frontend:v6
        name: guestbook
        ports:
        - containerPort: 80
          name: http-server
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: guestbook
  labels:
    k8s-app.guestbook: web
spec:
  ports:
  - port: 81
    protocol: TCP
    targetPort: http-server
  selector:
    k8s-app.guestbook: web
  type: ClusterIP
EOF
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-ingress-on-tcp-80
specs:
  - endpointSelector:
      matchLabels:
        k8s-app.guestbook: web
    ingress:
    - toPorts:
      - ports:
        - port: "80"
          protocol: TCP
EOF
It is important that the service port (81) differs from the container port (80); the issue does not occur if they are the same.
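For reference, the port mapping can be double-checked with a jsonpath query (assuming the manifests above were applied unchanged; note that the targetPort resolves to the named container port http-server, i.e. 80):
$ kubectl get svc guestbook -o jsonpath='port={.spec.ports[0].port} targetPort={.spec.ports[0].targetPort}{"\n"}'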
$ kubectl get svc guestbook
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
guestbook ClusterIP 10.98.138.60 <none> 81/TCP 42m
$ kubectl get pods -o wide --selector=k8s-app.guestbook=web
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
guestbook-8b46v 1/1 Running 0 43m 10.0.0.208 minikube <none> <none>
Send traffic from the pod to itself via the cluster IP:
$ kubectl exec guestbook-8b46v -- curl 10.98.138.60:81
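The curl does not complete, because the SYN, ACK reply is dropped (see the monitor output below). The endpoint ID used below (1221) belongs to the guestbook pod; as an example (assuming the cilium CLI is invoked inside the agent pod on the minikube node), it can be looked up by the pod IP:
$ kubectl -n kube-system exec ds/cilium -- cilium endpoint list | grep 10.0.0.208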
Cilium monitor output:
$ cilium monitor --related-to 1221
Press Ctrl-C to quit
level=info msg="Initializing dissection cache..." subsys=monitor
-> endpoint 1221 flow 0x95218252 identity 22887->22887 state established ifindex lxccbaae0c0d570 orig-ip 169.254.42.1: 169.254.42.1:55518 -> 10.0.0.208:80 tcp SYN
Policy verdict log: flow 0xfef8e31c local EP ID 1221, remote ID 22887, proto 6, ingress, action deny, match none, 10.98.138.60:81 -> 10.0.0.208:55518 tcp SYN, ACK
xx drop (Policy denied) flow 0xfef8e31c to endpoint 1221, identity 22887->22887: 10.98.138.60:81 -> 10.0.0.208:55518 tcp SYN, ACK
Policy verdict log: flow 0x8caf00f9 local EP ID 1221, remote ID 22887, proto 6, ingress, action deny, match none, 10.98.138.60:81 -> 10.0.0.208:55518 tcp SYN, ACK
As per my understanding, a pod sending packets to itself via the cluster IP should skip policy enforcement altogether. Although the first packet (pod -> itself via the cluster IP) does skip policy enforcement, the reply packet does not; it is incorrectly subjected to policy enforcement. This happens because the reply packet does not match any conntrack entry.
$ cilium bpf ct list global | grep 10.0.0.208
TCP IN 169.254.42.1:55522 -> 10.0.0.208:80 expires=17529096 RxPackets=3 RxBytes=222 RxFlagsSeen=0x02 LastRxReport=17529035 TxPackets=6 TxBytes=444 TxFlagsSeen=0x12 LastTxReport=17529035 Flags=0x0008 [ LBLoopback ] RevNAT=7 SourceSecurityID=22887 IfIndex=0
TCP OUT 10.0.0.208:55522 -> 10.98.138.60:80 expires=17529096 RxPackets=0 RxBytes=0 RxFlagsSeen=0x00 LastRxReport=0 TxPackets=3 TxBytes=222 TxFlagsSeen=0x02 LastTxReport=17529035 Flags=0x0008 [ LBLoopback ] RevNAT=7 SourceSecurityID=22887 IfIndex=0
TCP OUT 10.98.138.60:81 -> 10.0.0.208:55522 service expires=17529096 RxPackets=0 RxBytes=7 RxFlagsSeen=0x00 LastRxReport=0 TxPackets=0 TxBytes=0 TxFlagsSeen=0x02 LastTxReport=17529035 Flags=0x0000 [ ] RevNAT=7 SourceSecurityID=0 IfIndex=0
ICMP OUT 10.0.0.208:0 -> 10.98.138.60:0 related expires=17529093 RxPackets=0 RxBytes=0 RxFlagsSeen=0x00 LastRxReport=0 TxPackets=1 TxBytes=74 TxFlagsSeen=0x02 LastTxReport=17529035 Flags=0x0018 [ LBLoopback SeenNonSyn ] RevNAT=7 SourceSecurityID=22887 IfIndex=0
As per my understanding, the second entry is causing the problem:
10.0.0.208:55522 -> 10.98.138.60:80 - the destination IP address is the cluster IP, but the port is the container port (80) instead of the service port (81).
The reply packet, on the other hand, is 10.98.138.60:81 -> 10.0.0.208:55518, which does not match the conntrack entry and is therefore incorrectly subjected to network policy enforcement.
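Spelling the mismatch out (a sketch based on the CT dump above; the datapath lookup semantics are simplified):
# Entry created on the forward (loopback) path: TCP OUT 10.0.0.208:55522 -> 10.98.138.60:80
# Reply packet arriving at the pod:             10.98.138.60:81 -> 10.0.0.208:55522
# The entry holds the container port (80) while the reply carries the service
# port (81), so a reverse lookup for the reply has nothing to match. Grepping
# for the reply tuple among the non-service entries accordingly comes back empty:
$ cilium bpf ct list global | grep '10.98.138.60:81' | grep -v service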
Cilium Version
1.10.5
Kernel Version
x86_64
Kubernetes Version
1.22.8
Sysdump
cilium-sysdump-20220411-230655.zip
Relevant log output
No response
Anything else?
No response
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 18 (15 by maintainers)
Links to this issue
- Configure Azure CNI Powered by Cilium in Azure Kubernetes Service (AKS) - Azure Kubernetes Service | Microsoft Learn
The issue is resolved on Cilium 1.12.15, so it was most likely fixed by https://github.com/cilium/cilium/pull/27798.
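One way to verify this on a fixed release is to rerun the reproduction steps above against it, e.g.:
$ cilium install --version=v1.12.15
followed by the same curl; the SYN, ACK drop should no longer show up in cilium monitor.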
If I understand the issue correctly, we already have a test case to verify this scenario: https://github.com/cilium/cilium/blob/master/test/k8s/services.go#L170-L170.
If this is indeed relevant, we should extend the test case. @skmatti, here is context on how we skip policy enforcement for the service loopback + policy enforcement case: https://github.com/cilium/cilium/commit/52cd6da139c1ac5d67de65a821f953c936034f2e. Hopefully this helps in debugging/fixing the issue further.
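As a rough sketch of what an extended test could assert (plain shell, not the actual Ginkgo helpers used in test/k8s/services.go; pod name, service IP, and ports taken from the reproduction above):
# Pod curls itself via its ClusterIP, with service port (81) != container port (80).
# With the bug present, the SYN, ACK reply is dropped and curl times out.
$ kubectl exec guestbook-8b46v -- curl --max-time 10 -sf http://10.98.138.60:81 >/dev/null \
    && echo 'PASS: loopback via ClusterIP with port != targetPort' \
    || echo 'FAIL: reply likely dropped by policy'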