istio: Memory leak in Sidecar, istio 1.19 and 1.20

Is this the right place to submit this?

  • This is not a security vulnerability or a crashing bug
  • This is not a question about how to use Istio

Bug Description

Creating a ticket for the Slack thread at https://istio.slack.com/archives/C05C7FULQAG/p1700759075429959?thread_ts=1700065931.240079&cid=C05C7FULQAG to discuss what looks like a memory leak in the sidecar since we upgraded to 1.19.

It only really appears on our highest throughput services.

I’ve sent @howardjohn a pprof profile.

This appears to happen in 1.20 too, although we only run 1.20 in our preprod environment, where traffic volume is much lower.

Version

1.19.4

Additional Information

No response

About this issue

  • State: closed
  • Created 7 months ago
  • Reactions: 1
  • Comments: 20 (19 by maintainers)

Most upvoted comments

Ah @hanxiaop, thanks. I think it has to do with interval reporting: some expressions keep getting re-evaluated. I can see the memory growing and will try to make a patch.

If you can share the telemetry config (or a config_dump) for the affected service and briefly describe the traffic, I can try to replicate the leak locally. There might be some combination of “tag overrides” that causes a leak, but it’s difficult to pinpoint.

I’m using 1.19.3. The behavior I’ve observed so far: when I apply the same telemetry configuration to both an HTTP service and a TCP service, the HTTP service shows no memory leak. For the TCP service, however, with 100 clients, 1000 tps, and a 10KB response body, the Envoy memory usage slowly increases from 60Mi to 260Mi over several hours.
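
For anyone trying to confirm the same growth pattern, here is a rough Python sketch that samples the sidecar's heap through the Envoy admin `/memory` endpoint and reports an average growth rate. It is not taken from the thread. It assumes the admin interface is reachable on `localhost:15000` (for example through `kubectl port-forward`), and the helper names (`parse_allocated_bytes`, `growth_bytes_per_hour`, `monitor`) are made up for illustration.

```python
import json
import time
import urllib.request

# Assumption: the Envoy admin interface is reachable here, e.g. after
# `kubectl port-forward <pod> 15000` (15000 is Istio's default admin port).
ADMIN_MEMORY_URL = "http://localhost:15000/memory"


def parse_allocated_bytes(payload: str) -> int:
    """Extract the `allocated` field from an Envoy /memory JSON response."""
    data = json.loads(payload)
    # 64-bit counters are serialized as JSON strings, so convert explicitly.
    return int(data["allocated"])


def growth_bytes_per_hour(samples):
    """Average growth between the first and last (timestamp_s, bytes) sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (b1 - b0) * 3600.0 / (t1 - t0)


def sample_once(url: str = ADMIN_MEMORY_URL):
    """Fetch one (timestamp_s, allocated_bytes) sample from the admin endpoint."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return time.time(), parse_allocated_bytes(resp.read().decode("utf-8"))


def monitor(iterations: int = 60, interval_s: float = 60.0):
    """Hypothetical helper: poll the sidecar and print the running growth rate."""
    samples = []
    for _ in range(iterations):
        samples.append(sample_once())
        if len(samples) >= 2:
            rate = growth_bytes_per_hour(samples)
            print(f"allocated={samples[-1][1]} B, avg growth={rate / 2**20:.1f} MiB/h")
        time.sleep(interval_s)
    return samples
```

Calling `monitor()` while the TCP load is running gives a number to attach to a report. As a purely illustrative figure, a rise from 60Mi to 260Mi over two hours would average 100 MiB per hour.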