istio: Istio ingress still forwarding request to a pod that got marked terminated 9mins ago
Bug Description
Hi Team, we are running Istio (1.17.1-distroless) in our EKS cluster. During an event where a ReplicaSet scaled down rapidly, from 350 pods to 180 in around 10 minutes, we saw a large number of 503s. We believe this is because the Istio ingress gateway kept sending requests to a terminated pod's proxy, even 9 minutes after the Endpoints update. The EKS cluster runs about 3K pods across ~250 nodes, with about 200 services.
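For a sense of the churn Istiod had to propagate, the figures above imply roughly this removal rate (simple arithmetic on the numbers from the description, nothing measured):

```shell
# 350 -> 180 pods over ~10 minutes: endpoint removals per minute
# that must be pushed out to every connected proxy.
echo $(( (350 - 180) / 10 ))
```

At ~17 terminations per minute, each removal triggers an EDS update that has to reach the ingress gateways quickly for them to stop routing to the dead pod.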
Here is a chronology of the events:
- The pod (*-rbzcp) was marked for deletion by the ReplicaSet controller at 2023-05-17T02:39:02.754863Z
- The pod's IP was removed from the Endpoints object at 2023-05-17T02:39:02.891734Z
- The pod was deleted by the node's kubelet at 2023-05-17T02:42:08.759850Z
The istio-proxy logs from the same pod:
2023-05-17T02:42:07.910757Z warn Aborted proxy instance
2023-05-17T02:42:07.910766Z warn Aborting proxy
2023-05-17T02:42:07.910953Z info Envoy aborted normally
2023-05-17T02:42:07.910966Z info Agent has successfully terminated
2023-05-17T02:42:07.911016Z error error serving tap http server: http: Server closed
2023-05-17T02:42:07.911374Z info ads ADS: "@"<REDACTED>-1 terminated
2023-05-17T02:42:07.911388Z info ads ADS: "@" <REDACTED>-2 terminated
2023-05-17T02:42:07.911402Z info sds SDS server for workload certificates started, listening on "./var/run/secrets/workload-spiffe-uds/socket"
There is no log from the pod in question (*-rbzcp) after this. The attached graph shows that, after the pod was marked terminated, the ingress gateway continued to forward requests to it and received 503s.
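One way to confirm this staleness directly is to check whether the gateway's Envoy still lists the terminated pod's IP in its endpoint table, via `istioctl proxy-config endpoints` against the ingress gateway. A sketch of the check (the service name and the second IP below are hypothetical; 10.64.5.57 is the pod IP from the logs above):

```shell
# Hypothetical capture of:
#   istioctl proxy-config endpoints deploy/istio-ingressgateway -n istio-system
# If the terminated pod's IP still appears here long after the Endpoints
# update, the gateway's EDS view is stale.
cat > endpoints.txt <<'EOF'
10.64.5.57:8080    HEALTHY    OK    outbound|80||myservice.default.svc.cluster.local
10.64.7.12:8080    HEALTHY    OK    outbound|80||myservice.default.svc.cluster.local
EOF
grep -c '^10\.64\.5\.57:' endpoints.txt   # non-zero: stale endpoint still present
```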
At this time, the pod count of this deployment was reduced by HPA scale-in from 330 to 180 over 15 minutes, as shown below. (Times are in EST rather than UTC.)
We are running 50 replicas of Istiod, but they appear to have been throttled at that time (note that throttling also occurs on other occasions, but we do not see 503s/outages then).
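If Istiod push throttling is the bottleneck, the concurrent XDS push limit can be raised via the PILOT_PUSH_THROTTLE environment variable on the istiod Deployment (default 100). A minimal sketch; the value 200 is an illustrative assumption, not a recommendation:

```yaml
# Fragment of the istiod Deployment spec (istio-system namespace):
# raise the number of concurrent XDS pushes Istiod will perform.
env:
  - name: PILOT_PUSH_THROTTLE
    value: "200"
```

Whether raising this helps depends on whether Istiod is CPU-bound or deliberately rate-limited; on large clusters the default has been reported as too small (see the related commits below in this issue).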
Version
istioctl version:
client version: 1.15.3
control plane version: 1.17.1
data plane version: 1.17.1 (59 proxies)
kubectl version --short
Client Version: v1.23.6
Server Version: v1.23.16-eks-48e63af
helm version --short
v3.6.3+gd506314
Additional Information
Istiod logs for that pod:
2023-05-17T02:42:07.911631Z info ads ADS: "10.64.5.57:36606" <REDACTED>-rbzcp.<REDACTED>-27184 terminated
2023-05-17T02:32:53.743852Z info ads RDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:21 size:1.6MB cached:10/21
2023-05-17T02:32:52.883771Z info ads NDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:27.2kB
2023-05-17T02:32:51.939697Z info ads LDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:63 size:291.3kB
2023-05-17T02:32:51.929958Z info ads EDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:587 size:1.1MB empty:0 cached:586/587
2023-05-17T02:32:50.755431Z info ads ADS: "10.64.5.57:48574" <REDACTED>-rbzcp.<REDACTED>-27024 terminated
2023-05-17T02:32:51.001296Z info ads CDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:654 size:837.2kB cached:644/650
2023-05-17T02:32:50.995933Z info ads ADS: new connection for node:<REDACTED>-rbzcp.<REDACTED>-27184
2023-05-17T02:23:28.027324Z info ads NDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:27.2kB
2023-05-17T02:23:28.023420Z info ads RDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:21 size:1.6MB cached:12/21
2023-05-17T02:23:28.001766Z info ads LDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:63 size:291.3kB
2023-05-17T02:23:27.996641Z info ads EDS: PUSH INC for node:<REDACTED>-rbzcp.<REDACTED> resources:2 size:155.6kB empty:0 cached:0/2
2023-05-17T02:23:27.167235Z info ads CDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:654 size:837.2kB cached:650/650
2023-05-17T02:18:49.324106Z info ads NDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:27.2kB
2023-05-17T02:18:49.320813Z info ads RDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:21 size:1.6MB cached:12/21
2023-05-17T02:18:49.297607Z info ads LDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:63 size:291.3kB
2023-05-17T02:18:49.291275Z info ads EDS: PUSH INC for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:143.5kB empty:0 cached:0/1
2023-05-17T02:18:48.444070Z info ads CDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:654 size:837.2kB cached:645/650
2023-05-17T02:16:20.119399Z info ads NDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:27.2kB
2023-05-17T02:16:20.115814Z info ads RDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:21 size:1.6MB cached:12/21
2023-05-17T02:16:20.091425Z info ads LDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:63 size:291.3kB
2023-05-17T02:16:20.084363Z info ads EDS: PUSH INC for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:190.7kB empty:0 cached:0/1
2023-05-17T02:16:19.217966Z info ads CDS: PUSH for node:<REDACTED>-rbzcp.<REDACTED> resources:654 size:837.2kB cached:650/650
2023-05-17T02:02:31.076854Z info ads ADS: "10.64.5.57:34598" <REDACTED>-rbzcp.<REDACTED>-27211 terminated
2023-05-17T02:02:31.207645Z info ads RDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:21 size:1.6MB cached:12/21
2023-05-17T02:02:31.181192Z info ads NDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:27.2kB
2023-05-17T02:02:31.180526Z info ads LDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:63 size:291.3kB
2023-05-17T02:02:31.171874Z info ads EDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:587 size:1.1MB empty:0 cached:587/587
2023-05-17T02:02:31.167626Z info ads CDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:654 size:837.2kB cached:629/650
2023-05-17T02:02:31.162313Z info ads ADS: new connection for node:<REDACTED>-rbzcp.<REDACTED>-27024
2023-05-17T01:31:21.527483Z info ads RDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:21 size:1.6MB cached:12/21
2023-05-17T01:31:21.482855Z info ads NDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:1 size:27.2kB
2023-05-17T01:31:21.482188Z info ads LDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:63 size:291.3kB
2023-05-17T01:31:21.473726Z info ads EDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:587 size:1.1MB empty:0 cached:587/587
2023-05-17T01:31:21.470405Z info ads CDS: PUSH request for node:<REDACTED>-rbzcp.<REDACTED> resources:654 size:837.2kB cached:624/650
2023-05-17T01:31:21.465776Z info ads ADS: new connection for node:<REDACTED>-rbzcp.<REDACTED>-27211
2023-05-17T01:31:21.368391Z info ads ADS: "10.64.5.57:33552" <REDACTED>-rbzcp.<REDACTED>-26893 terminated
Affected product area
- Ambient
- Docs
- Installation
- Networking
- Performance and Scalability
- Extensions and Telemetry
- Security
- Test and Release
- User Experience
- Developer Infrastructure
- Upgrade
- Multi Cluster
- Virtual Machine
- Control Plane Revisions
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 61 (47 by maintainers)
Commits related to this issue
- eds: simplify and optimize mTLS checker * Remove all caching. This is expensive and offers little benefit; the issue here is the mtlsChecker is built per endpointBuilder, which is per-proxy-cluster... — committed to howardjohn/istio by howardjohn a year ago
- eds: simplify and optimize mTLS checker (#45230) * eds: simplify and optimize mTLS checker * Remove all caching. This is expensive and offers little benefit; the issue here is the mtlsChecker is b... — committed to istio/istio by howardjohn a year ago
- pilot: autoscale concurrency and rate limit on xds This is an attempt to improve the defaults here. 25 for RL is decent, but for giant clusters we have seen folks report its too small. The bigge... — committed to howardjohn/istio by howardjohn 6 months ago
- pilot: autoscale concurrency and rate limit on xds This is an attempt to improve the defaults here. 25 for RL is decent, but for giant clusters we have seen folks report its too small. The bigge... — committed to howardjohn/istio by howardjohn 6 months ago
- pilot: autoscale concurrency and rate limit on xds This is an attempt to improve the defaults here. 25 for RL is decent, but for giant clusters we have seen folks report its too small. The bigge... — committed to howardjohn/istio by howardjohn 6 months ago
- pilot: autoscale concurrency and rate limit on xds This is an attempt to improve the defaults here. 25 for RL is decent, but for giant clusters we have seen folks report its too small. The bigge... — committed to howardjohn/istio by howardjohn 6 months ago
- pilot: autoscale concurrency and rate limit on xds (#48438) This is an attempt to improve the defaults here. 25 for RL is decent, but for giant clusters we have seen folks report its too small. ... — committed to istio/istio by howardjohn 6 months ago
Some other ideas: a queue of events like
[update endpoints to {a,b}, update endpoints to {a}]
can be collapsed down to a single event.

Sorry, the markdown was messed up. I was just commenting that I tried tuning GOMAXPROCS on my own cluster and saw no benefit.
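The collapsing idea above can be sketched simply: when only the latest endpoint state matters, a queued sequence of updates for the same resource reduces to its final entry before pushing. A minimal illustration:

```shell
# Two queued updates for the same endpoint set; intermediate states are
# obsolete, so the queue collapses to its last entry before the push.
printf '%s\n' 'update endpoints to {a,b}' 'update endpoints to {a}' | tail -n 1
```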