calico: BGP advertising doesn't honor externalTrafficPolicy Local for LoadBalancer services

For service type LoadBalancer and externalTrafficPolicy: Local, Calico will advertise routes from all nodes, regardless of whether valid endpoints exist.

A calico-node running on a host that has endpoints for that service will learn the endpoint IP address:

ubuntu@deployer:~$ klo -n kube-system calico-node-nsjfq  | grep 'default/my-service'
2022-05-03 22:16:24.089 [DEBUG][78] confd/routes.go 171: getEndpointsForService: service for endpoint not found, passing key="default/my-service"
2022-05-03 22:16:24.135 [DEBUG][78] confd/routes.go 171: getEndpointsForService: service for endpoint not found, passing key="default/my-service"
2022-05-03 22:16:24.200 [DEBUG][78] confd/routes.go 417: Advertising local service svc="default/my-service"
2022-05-03 22:16:24.200 [DEBUG][78] confd/routes.go 206: Checking routes for service advertise=true svc="default/my-service"
2022-05-03 22:16:24.200 [DEBUG][78] confd/routes.go 294: Setting routes for key key="default/my-service" routes=[]string{"10.100.19.72/32"}

On the other hand, a calico-node that has no endpoints for that service will recognize it correctly, as seen in the logs:

ubuntu@deployer:~$ kubectl logs -n kube-system calico-node-zhl2r  | grep 'default/my-service'
2022-05-03 22:16:24.088 [DEBUG][69] confd/routes.go 171: getEndpointsForService: service for endpoint not found, passing key="default/my-service"
2022-05-03 22:16:24.122 [DEBUG][69] confd/routes.go 171: getEndpointsForService: service for endpoint not found, passing key="default/my-service"
2022-05-03 22:16:24.190 [DEBUG][69] confd/routes.go 422: Skipping service with no local endpoints svc="default/my-service"
2022-05-03 22:16:24.190 [DEBUG][69] confd/routes.go 206: Checking routes for service advertise=false svc="default/my-service"

However, both calico-nodes have the same bird static configuration so both of them advertise all the prefixes:

[root@node4 /]# cat /etc/calico/confd/config/bird_aggr.cfg 
# Generated by confd

protocol static {
   # IP blocks for this host.
   route 10.233.111.0/24 blackhole;
   # Static routes.
   route 10.100.18.237/32 blackhole;
   route 10.100.18.40/32 blackhole;
   route 10.100.19.72/32 blackhole;
}

The BGP peer receives the route from all nodes:

10.100.19.72/32      unicast [Node_10_0_1_74 18:28:59.532 from 10.0.1.74] * (100/?) [i]
	via 10.100.16.1 on ens3
                     unicast [Node_10_0_1_130 18:36:22.534 from 10.0.1.130] (100/?) [i]
	via 10.100.16.1 on ens3
                     unicast [Node_10_0_1_254 18:28:59.883 from 10.0.1.254] (100/?) [i]
	via 10.100.16.1 on ens3
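
To check this kind of output mechanically, a small parser can count how many peers a prefix was learned from. The following is a minimal sketch, not part of Calico or BIRD: the function name `peers_per_prefix` is hypothetical, and it assumes the route listing has the same shape as the output above (IPv4 prefixes, one bracketed source per route line).

```python
import re
from collections import defaultdict

# Matches the bracketed source of a route line, e.g.
# "[Node_10_0_1_74 18:28:59.532 from 10.0.1.74]".
SOURCE_RE = re.compile(r"\[(\S+) \S+ from (\S+)\]")
# Matches a leading IPv4 prefix such as "10.100.19.72/32".
PREFIX_RE = re.compile(r"^(\d+\.\d+\.\d+\.\d+/\d+)\s")


def peers_per_prefix(route_output: str) -> dict[str, list[str]]:
    """Map each prefix to the peer addresses it was learned from."""
    peers = defaultdict(list)
    current = None
    for line in route_output.splitlines():
        prefix = PREFIX_RE.match(line)
        if prefix:
            current = prefix.group(1)
        # Continuation lines carry no prefix and belong to the last one seen.
        source = SOURCE_RE.search(line)
        if source and current:
            peers[current].append(source.group(2))
    return dict(peers)


SAMPLE = """\
10.100.19.72/32      unicast [Node_10_0_1_74 18:28:59.532 from 10.0.1.74] * (100/?) [i]
\tvia 10.100.16.1 on ens3
                     unicast [Node_10_0_1_130 18:36:22.534 from 10.0.1.130] (100/?) [i]
\tvia 10.100.16.1 on ens3
                     unicast [Node_10_0_1_254 18:28:59.883 from 10.0.1.254] (100/?) [i]
\tvia 10.100.16.1 on ens3
"""

for prefix, sources in peers_per_prefix(SAMPLE).items():
    print(prefix, len(sources), sources)
```

With a single-pod deployment and externalTrafficPolicy: Local, a count above one for the service address indicates the behavior reported here.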

Expected Behavior

Only the calico-node on the host with the service endpoint should advertise the route, so that externalTrafficPolicy: Local is honored; the documentation does not state otherwise.

Current Behavior

All calico-nodes advertise the routes, regardless of whether they have a valid endpoint for the service, similar to externalTrafficPolicy: Cluster.

Steps to Reproduce (for bugs)

  1. Set up BGP service advertisement according to the documentation.
  2. Create a simple 1-pod deployment.
  3. Create a service of type LoadBalancer and externalTrafficPolicy Local:
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  externalTrafficPolicy: Local
  selector:
    app: nginx
  ports:
    - protocol: UDP
      port: 80
      targetPort: 80
  type: LoadBalancer
  loadBalancerIP: 10.100.19.72
  4. Check the received routes on the BGP peer.

Your Environment

  • Calico version: 3.21.2
  • Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes 1.21.1
  • Operating System and version: Ubuntu 20.04

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 15 (14 by maintainers)

Most upvoted comments

  serviceLoadBalancerIPs:
   - cidr: 10.100.18.40/32
   - cidr: 10.100.19.72/32
   - cidr: 10.100.18.237/32

This will be the problem. The way this was imagined to work, the values passed here should be ranges, not /32 addresses. Calico advertises any value explicitly placed in that array from all nodes.

On further thinking, we should probably have Calico be a bit smarter for when those ranges are just a single address. Specifically:

  1. If a serviceLoadBalancerIPs entry contains more than one address, advertise the full range from every node.
  2. If a serviceLoadBalancerIPs entry is just a single address (/32, /128) then:
     • If the service exists and has externalTrafficPolicy: Local, advertise only from nodes that have the service
     • If the service exists and has externalTrafficPolicy: Cluster, advertise from all nodes
     • If the service does not exist, don’t advertise the IP at all
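
The proposal above can be sketched as a per-node decision function. This is an illustration only, not Calico's code: `is_single_address`, `should_advertise`, and their parameters are hypothetical names chosen for the example.

```python
import ipaddress
from typing import Optional


def is_single_address(cidr: str) -> bool:
    """True for entries that cover exactly one address (/32 or /128)."""
    return ipaddress.ip_network(cidr, strict=False).num_addresses == 1


def should_advertise(cidr: str,
                     traffic_policy: Optional[str],
                     node_has_endpoint: bool) -> bool:
    """Decide whether one node advertises one serviceLoadBalancerIPs entry.

    traffic_policy is "Local" or "Cluster", or None when no service
    currently uses the address (hypothetical encoding for this sketch).
    """
    if not is_single_address(cidr):
        return True                # rule 1: ranges go out from every node
    if traffic_policy is None:
        return False               # no service: do not advertise at all
    if traffic_policy == "Local":
        return node_has_endpoint   # only nodes that have the service
    return True                    # Cluster: every node


# The address from this issue, on a node without a local endpoint.
print(should_advertise("10.100.19.72/32", "Local", node_has_endpoint=False))
```

Under this rule the node without endpoints would stay silent for 10.100.19.72/32, while a wider range would still be advertised everywhere.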

For me it almost looks like these are being “hard added” when config is applied somehow.

This is exactly right. The semantics right now are:

  • Advertise the prefixes provided in serviceLoadBalancerIPs unconditionally from all nodes (this handles Cluster type svcs)
  • If a service has externalTrafficPolicy=Local, advertise its address as a /32 route from nodes that have the service running.

This is correct assuming the prefixes in serviceLoadBalancerIPs are not fully qualified (i.e., not /32 or /128), but incorrect when using a precisely specified service IP, as you are in this case.
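
The current semantics can be modeled in a few lines to show why a /32 entry defeats the Local handling. This is a rough sketch that assumes the two rules above are the whole story; `advertised_prefixes` and its parameters are hypothetical names, not Calico's implementation.

```python
import ipaddress


def advertised_prefixes(configured_cidrs, local_service_ips):
    """Model the prefixes one node advertises under the current semantics.

    configured_cidrs: entries of serviceLoadBalancerIPs (assumed input).
    local_service_ips: addresses of externalTrafficPolicy=Local services
    that have an endpoint on this node (assumed input).
    """
    # Rule 1: every configured prefix is advertised unconditionally.
    routes = {str(ipaddress.ip_network(c, strict=False)) for c in configured_cidrs}
    # Rule 2: Local services add a host route (/32 or /128) on nodes with endpoints.
    routes |= {str(ipaddress.ip_network(ip)) for ip in local_service_ips}
    return routes


# The serviceLoadBalancerIPs entries quoted earlier in this thread.
CONFIGURED = ["10.100.18.40/32", "10.100.19.72/32", "10.100.18.237/32"]

with_endpoint = advertised_prefixes(CONFIGURED, ["10.100.19.72"])
without_endpoint = advertised_prefixes(CONFIGURED, [])

# Both nodes end up with the same set, so the peer sees the /32 from every node.
print(with_endpoint == without_endpoint)  # prints True
```

Because the host route from rule 2 is identical to an entry already covered by rule 1, the node without endpoints advertises it anyway.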

To handle /32 or /128 routes, I think the logic needs to be changed to this:

  • Advertise non-fully-qualified serviceLoadBalancerIPs entries (i.e., anything other than /32 or /128) from every node.
  • For fully-qualified serviceLoadBalancerIPs entries, advertise them from every node if the service is Cluster-type.
  • Otherwise, for Local-type services, advertise a /32 only from the nodes that host the address.

^ This is just a slightly different expression of https://github.com/projectcalico/calico/issues/6074#issuecomment-1137477343