antrea: ServiceLoadBalancer + externalTrafficPolicy: Local = Connection Refused most of the time
Describe the bug
I have a 5-node cluster running a deployment with 2 replicas. The deployment uses the ServiceExternalIP feature of Antrea. I can see that the Service got an IP from the same network as the Nodes:
whoami-headless LoadBalancer 10.239.37.140 10.1.2.220 80:31591/TCP 5s
But if I curl 10.1.2.220, it works for some remote endpoints and fails for others. It works fine if I set externalTrafficPolicy=Cluster, but that way I lose the client IP.
To Reproduce
Enable the ServiceExternalIP feature for any Service backed by a deployment with a single replica, using the YAML manifests at the end of this bug report.
Expected
I would expect the LoadBalancer IP to be “acquired” by one of the Nodes running the Pod, so that access works.
Actual behavior
Most of the time I get connection refused; at other times it just works. From the masters it works using the ClusterIP, but it fails using the LoadBalancer IP.
# curl 10.239.37.140 -I
HTTP/1.1 200 OK
Date: Fri, 13 May 2022 17:35:05 GMT
Content-Length: 203
Content-Type: text/plain; charset=utf-8
# curl 10.1.2.220 -I
curl: (7) Failed to connect to 10.199.0.220 port 80: Connection refused
Versions:
- Antrea version (Docker image tag): 1.6.1
- Kubernetes version (use kubectl version): 1.22.8
- Container runtime: cri-o
- Linux kernel version on the Kubernetes Nodes (uname -r): 4.18.0-348.23.1.el8_5.x86_64
apiVersion: crd.antrea.io/v1alpha2
kind: ExternalIPPool
metadata:
  name: service-external-ip-pool
spec:
  ipRanges:
  - start: 10.1.1.220
    end: 10.1.2.250
  nodeSelector: {}
---
apiVersion: v1
kind: Service
metadata:
  name: whoami-headless
  annotations:
    service.antrea.io/external-ip-pool: "service-external-ip-pool"
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  selector:
    app: whoami
Which Node is the LoadBalancer IP assigned to? I can see the LoadBalancer IP assigned to the kube-ipvs0 interface on all servers, but I suppose only one Node is really using it; otherwise it would be an IP conflict, wouldn’t it? I will check MetalLB to see whether I get the same behaviour.
About this issue
- State: closed
- Created 2 years ago
- Comments: 30 (16 by maintainers)
Good to know you plan to use the feature in production! We can definitely prioritize the Egress fix. I created an issue to track that: #3804
I’ll reopen this issue since there is a documentation change required. And I will assign @jianjuns since he volunteered 😃
@jianjuns Antrea’s implementation does respect externalTrafficPolicy: https://github.com/antrea-io/antrea/blob/2526b1f8c9f3b70933ef96dab89e034db660574d/pkg/agent/controller/serviceexternalip/controller.go#L363-L371 I think @antoninbas is correct. Access is supposed to fail if traffic towards a Service with the Local externalTrafficPolicy reaches a Node that doesn’t have any backends of the Service. I haven’t tried it, but I suppose MetalLB behaves the same.
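To make the rule in the comment above concrete, here is a minimal Python sketch of it as a toy model. It is not Antrea's code: the function names, the Node names, and the `endpoints_by_node` mapping are all hypothetical, and "refused" simply stands in for the symptom seen in this report.

```python
from typing import Dict, List


def eligible_nodes(policy: str, nodes: List[str],
                   endpoints_by_node: Dict[str, int]) -> List[str]:
    """Return the Nodes that can successfully serve external traffic.

    Hypothetical helper for illustration only; not part of Antrea.
    """
    if policy == "Local":
        # With Local, traffic is never forwarded to another Node, so only
        # Nodes that run at least one backend Pod can answer.
        return [n for n in nodes if endpoints_by_node.get(n, 0) > 0]
    # With Cluster, any Node may accept the traffic and forward it,
    # at the cost of losing the original client IP.
    return list(nodes)


def connection_outcome(policy: str, receiving_node: str,
                       endpoints_by_node: Dict[str, int]) -> str:
    """Model what a client sees when its traffic lands on receiving_node."""
    if policy == "Local" and endpoints_by_node.get(receiving_node, 0) == 0:
        return "refused"
    return "ok"


# Assumed layout matching the report: 5 Nodes, 2 replicas on two of them.
nodes = ["node1", "node2", "node3", "node4", "node5"]
endpoints_by_node = {"node2": 1, "node4": 1}

for node in nodes:
    print(node, connection_outcome("Local", node, endpoints_by_node))
```

Under these assumptions, a request succeeds only when it reaches node2 or node4, which would be consistent with the intermittent failures described above if traffic for the LoadBalancer IP can arrive at Nodes without backends.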