envoy: Memory leak as an Istio sidecar

Title: *Memory leak as an Istio sidecar*

Description: We are migrating some services to Istio and have found that many of them leak memory at different rates. The graphs below show the memory utilization metrics of the istio-proxy container:

image

I enabled the heap profiler for a few hours during that period and obtained a flame graph of inuse_space:

image

This service handles some inbound and outbound gRPC requests, as well as outbound pass-through HTTP/1.1 requests over TLS. At first I suspected a connection leak, but netstat -anlp | wc -l shows that the number of connections stays at around 600.

Repro steps: It can be reproduced on some of our services.

Admin and Stats Output: Sorry, I forgot to capture the stats. I will post them here when memory grows again.

Config: I can provide some configuration if needed.

Logs: No logs

Call Stack: No crash

Envoy version: 6ccb2c35a7a22f81181d87dc93d93d9ee48749af/1.19.2-dev/Clean/RELEASE/BoringSSL

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 28 (27 by maintainers)

Most upvoted comments

@Patrick0308 Yes, it’s fairly tricky to solve. If original_dst deletes the record for a host and then hands out another Host for the same address, the pools will accumulate multiple connections to the same host (since they are keyed on the pointer value of Host).
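
To make the pointer-keyed behavior described above concrete, here is a minimal Python sketch. It is not Envoy code: `Host`, `pools`, and `pool_for` are hypothetical names, and Python's `id()` stands in for the C++ pointer value of a Host.

```python
class Host:
    """Hypothetical stand-in for an upstream host record."""
    def __init__(self, address):
        self.address = address


# Pools are keyed by object identity (like a pointer), not by address.
pools = {}


def pool_for(host):
    # id(host) mimics keying on the Host pointer value. The pool keeps a
    # reference to the host, so the id stays unique while the pool exists.
    return pools.setdefault(id(host), {"host": host, "connections": []})


first = Host("10.0.0.1:443")
pool_for(first)["connections"].append("conn-1")

# The record is deleted and a new Host is handed out for the same address.
second = Host("10.0.0.1:443")
pool_for(second)["connections"].append("conn-2")

# Two separate pools now hold connections to one and the same address.
same_address_pools = [
    p for p in pools.values() if p["host"].address == "10.0.0.1:443"
]
```

In this toy model, nothing ties the two pools together, because the lookup never compares addresses.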

But that’s not really the issue here. The leak is caused by a double write when the Host is added from two workers: the second write is ignored and does not update priority_set_ with the new synthetic Host, which is what drives deletions from the pools later on. Workers take a snapshot of the map and do not see each other’s concurrent writes.
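
The double write can be sketched as a small single-threaded simulation. This is a simplified model under stated assumptions, not Envoy's implementation: `Cluster`, `Worker`, `host_map`, `add_host`, and `on_hosts_removed` are hypothetical names, and only `priority_set` mirrors the `priority_set_` mentioned above.

```python
class Host:
    """Hypothetical stand-in for a synthetic original_dst host."""
    def __init__(self, address):
        self.address = address


class Cluster:
    """Hypothetical model of the shared cluster state."""
    def __init__(self):
        self.host_map = {}      # address -> Host
        self.priority_set = []  # hosts that later cleanup knows about

    def snapshot(self):
        return dict(self.host_map)

    def add_host(self, host):
        # A second write for the same address is ignored, so that Host
        # never reaches priority_set.
        if host.address in self.host_map:
            return False
        self.host_map[host.address] = host
        self.priority_set.append(host)
        return True


class Worker:
    def __init__(self, cluster):
        self.cluster = cluster
        self.pools = {}  # keyed by Host identity, like a pointer

    def choose_host(self, address, snapshot):
        host = snapshot.get(address)
        if host is None:
            host = Host(address)
            self.cluster.add_host(host)
        self.pools[id(host)] = host
        return host

    def on_hosts_removed(self, removed):
        for host in removed:
            self.pools.pop(id(host), None)


cluster = Cluster()
w1, w2 = Worker(cluster), Worker(cluster)

# Both workers snapshot the map before either write lands.
snap1, snap2 = cluster.snapshot(), cluster.snapshot()
h1 = w1.choose_host("10.0.0.1:443", snap1)
h2 = w2.choose_host("10.0.0.1:443", snap2)

# Cleanup only removes hosts that made it into priority_set.
removed = list(cluster.priority_set)
cluster.priority_set.clear()
cluster.host_map.clear()
for worker in (w1, w2):
    worker.on_hosts_removed(removed)

leaked = len(w2.pools)  # the second worker's pool is never cleaned up
```

In this model the first worker's pool is removed during cleanup, while the second worker keeps a pool for a Host that the cluster never recorded, which is the shape of the leak described in the comment.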