envoy: Slow start mode "expires" when more members added to the cluster.

Description:

When a new member joins a cluster that has slow start configured, existing pods that should still be in slow start receive as many requests as they would if their slow start window had already finished. A picture is worth a thousand words, so please see the attached screenshot:

Screenshot 2023-08-14 at 16 57 57

You can see that the red line rises gradually, which is what I’d expect from slow start. However, once the new pod (member) is added (the green line), the number of requests directed to the red line suddenly jumps. Slow start is configured for 300s, but the “jolt” happens a little more than one minute after the start.
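For comparison, the expected ramp can be sketched from the scaling formula in the Envoy slow start documentation, as far as I understand it: an endpoint's weight is multiplied by `max(min_weight_percent, time_factor ^ (1 / aggression))`, where `time_factor = max(time_since_start, 1) / slow_start_window`. The function name below is hypothetical, and the defaults (`aggression` of 1.0, `min_weight_percent` of 10%) are assumptions about what the proxy ends up with. This is only an illustration of the arithmetic, not Envoy's code.

```python
def slow_start_factor(elapsed_s, window_s, aggression=1.0, min_weight_percent=0.10):
    """Fraction of its normal weight an endpoint should get during slow start.

    Hypothetical helper; the defaults are assumptions, not values read from Istio.
    """
    if elapsed_s >= window_s:
        # Window is over, the endpoint gets its full weight.
        return 1.0
    # The elapsed time is clamped to at least one second.
    time_factor = max(elapsed_s, 1.0) / window_s
    return max(min_weight_percent, time_factor ** (1.0 / aggression))


# 75 seconds into a 300 second window, as in the repro steps.
factor_at_jolt = slow_start_factor(75, 300)
print(factor_at_jolt)
```

Under these assumptions the red pod should still be at a quarter of its normal weight when the third replica arrives, so a sudden jump to a full share is not what the formula predicts.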

Repro steps:

kubectl scale --replicas=2 deploy/aiohttp-service; sleep 75; kubectl scale --replicas=3 deploy/aiohttp-service

   "ISTIO_VERSION": "1.17.3",
   "version": "43cd81989c499d2382fd80ac077b14d728fe41eb/1.25.7-dev/Clean/RELEASE/BoringSSL",

edit: fixed my english a bit

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 5
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@palmobar definitely the docs need to be updated. Will do it next week.

@nezdolik so am I right in thinking (see edit below) that Istio turns on locality load balancing by default, which causes slow start to be calculated separately for each locality?

After adjusting my DestinationRule as follows, the slow start feature works exactly as described:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: aiohttp-service
  namespace: default
spec:
  host: aiohttp-service
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: false
      simple: ROUND_ROBIN
      warmupDurationSecs: 180s

Edit: Looks like I’m right

I think you might be onto something here, look at this graph:

Screenshot 2023-09-27 at 17 52 31

The first two lines are:

  • red: zone eu-west-1c
  • green: zone eu-west-1a

The third line (a new pod introduced to the cluster):

  • teal: zone eu-west-1b

This line jumps up to receive 33% of the traffic immediately (no warmup period applies), and the other pods decrease by equal amounts, so it’s as if warmup was never in effect.

The fourth line (purple) is the same zone as the red line:

  • purple: zone eu-west-1c

You can see that the red line “jumps up”, but the purple line will also get more traffic than the other pods. Here it looks like warmup is in effect (to a degree at least).

The gold line is the same zone as purple and red:

  • gold: zone eu-west-1c

Here it looks like warmup is in effect: the red and purple pods “jump up” (same zone), while the other pods gradually lose their share of the traffic.
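The pattern in the zones above is consistent with a two-level pick: first a locality, then an endpoint inside it. Below is a toy model of that idea, not Envoy's implementation. It assumes each locality's weight equals its number of endpoints (an assumption about how Istio configures the cluster), and that slow start only scales weights within a locality. All names and the 0.1 warmup factor are hypothetical.

```python
from collections import defaultdict


def traffic_shares(endpoints, per_locality):
    """Return {name: share of traffic} for (name, zone, slow_start_factor) tuples."""
    if not per_locality:
        # Flat cluster: slow start factors compete across all endpoints.
        total = sum(factor for _, _, factor in endpoints)
        return {name: factor / total for name, _, factor in endpoints}

    zones = defaultdict(list)
    for name, zone, factor in endpoints:
        zones[zone].append((name, factor))

    shares = {}
    for members in zones.values():
        # Assumption: a locality's share is proportional to its endpoint count.
        zone_share = len(members) / len(endpoints)
        zone_total = sum(factor for _, factor in members)
        for name, factor in members:
            # Slow start factors only compete inside the locality.
            shares[name] = zone_share * factor / zone_total
    return shares


# Teal joins alone in eu-west-1b with a warmup factor of 0.1.
three_pods = [("red", "1c", 1.0), ("green", "1a", 1.0), ("teal", "1b", 0.1)]
with_locality = traffic_shares(three_pods, per_locality=True)
without_locality = traffic_shares(three_pods, per_locality=False)

# Purple then joins red's zone; teal is treated as warmed up for simplicity.
four_pods = [("red", "1c", 1.0), ("green", "1a", 1.0),
             ("teal", "1b", 1.0), ("purple", "1c", 0.1)]
after_purple = traffic_shares(four_pods, per_locality=True)
```

In this model teal gets a third of the traffic at once because it has nobody to compete with in its zone, while in the flat case it would get under 5%. When purple joins, the eu-west-1c zone grows to half of the traffic and red takes most of it (about 45%), which resembles the red line “jumping up”.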

/assign

Thanks for the additional information @palmobar, I will look closely this week.