cilium: EKS NodePort to local node with kube-proxy is broken
When running in native ENI mode, NodePort on the local node is broken:
---------------------------------------------------------------------------------------------------------------------
🔌 [pod-to-service] Testing cilium-test/client-6c46bff775-9hw7v -> 192.168.58.215:32350 (NodePort)...
---------------------------------------------------------------------------------------------------------------------
❌ curl connectivity check command failed: error in stream: command terminated with exit code 28
❌ [pod-to-service] cilium-test/client-6c46bff775-9hw7v (192.168.132.242) -> 192.168.58.215:32350 (NodePort) (192.168.58.215:32350)
About this issue
- State: closed
- Created 3 years ago
- Comments: 22 (17 by maintainers)
Hmm, I was able to run the DSR on AWS (e.g., https://github.com/brb/cilium-without-netfilter).
@dctrwatson Can you provide a sysdump?
Clients get responses like “curl: (56) Recv failure: Operation timed out”, and running tcpdump port <NodePort> on the nodes shows some traffic from the client IP. On one node it looks like legitimate traffic, with Flags [S] [S.] [.] [P.] [F.], but on another node there is only Flags [S] on the NodePort. (It seems to depend on where the pod is running and on whether the node was launched before or after cilium was installed.)
After recreating the nodes, access from outside the cluster behaves as follows: if the Service has externalTrafficPolicy: Local, the NodePort still returns “Connection timed out” on all nodes; if it has externalTrafficPolicy: Cluster, “Connection timed out” occurs only on the node where the pod is running.
NodePort access from other nodes in the cluster works.
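For anyone trying to reproduce the per-node pattern described above, here is a minimal sketch of a probe that attempts a TCP connection to the NodePort on each node and classifies the result. The helper names (probe_nodeport, probe_all) are hypothetical, and the node IP and port in the usage section are simply the ones from the connectivity test output in this issue; substitute your own. It only checks whether the TCP handshake completes, so it will not catch cases where the connection opens but the response never arrives.

```python
import socket


def probe_nodeport(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a TCP connect to host:port and classify the outcome."""
    try:
        # create_connection performs the full TCP handshake or raises.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No SYN-ACK within the timeout, e.g. the SYN was silently dropped.
        return "timed out"
    except ConnectionRefusedError:
        # The host answered with RST: nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # Anything else (no route to host, network unreachable, ...).
        return f"error: {exc}"


def probe_all(node_ips, port: int, timeout: float = 3.0) -> dict:
    """Probe the same NodePort on every node and return {ip: result}."""
    return {ip: probe_nodeport(ip, port, timeout) for ip in node_ips}


if __name__ == "__main__":
    # Assumption: replace with the node IPs and NodePort of your cluster.
    nodes = ["192.168.58.215"]
    for ip, result in probe_all(nodes, 32350).items():
        print(f"{ip}:32350 -> {result}")
```

Running this once from outside the cluster and once from a pod on each node should make it easier to see which node and traffic policy combinations time out.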
Looks like @AnishShah already confirmed the issue on EKS - https://github.com/cilium/cilium/issues/14765#issuecomment-768843452. I’ll close this issue as we are already tracking the fix via #14765.