grpc-node: `@grpc/grpc-js` channel and memory leak (with a minimal reproduction)

Problem description

@grpc/grpc-js v0.6.9 seemingly leaks both memory and open network connections.

I did my best to provide the smallest possible reproduction by using gRPC directly, without any other layers on top, such as Google SDKs. The reproduction is ~50 LOC of readable TypeScript and depends only on @grpc/grpc-js, @grpc/proto-loader, ramda and the built-in path module. I hope this makes it easier to pinpoint and resolve the issue.

I’d appreciate it if you could confirm that the issue is reproducible by following the steps below.

Reproduction steps

  1. Download reproduction-grpc.zip.
  2. Unzip the file.
  3. Navigate to the extracted directory.
  4. Run npm install.
  5. Run Firestore Emulator, either through gcloud or using a Docker container.
  6. Adjust the hostname and port in configuration.ts file if necessary.
  7. Run node -r ts-node/register --expose-gc index.ts.
  8. The heap size will be logged; watch it grow steadily. Garbage collection is forced before every log statement so that the reported numbers are accurate.
  9. Run lsof -i tcp:4500 -n -P, replacing 4500 with the port the Firestore Emulator is running on. A list of open sockets will be displayed.
  10. Wait 10 seconds and repeat step 9, then keep repeating at 10-second intervals. The number of open sockets increases steadily.
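
For readers who do not have the archive at hand, the measurement in step 8 can be sketched roughly as follows. This is not the code from reproduction-grpc.zip: the helper names `measureHeapMiB` and `logHeapEvery` are hypothetical, and the only assumption is that node was started with `--expose-gc`, which exposes a global `gc` function.

```typescript
// Hypothetical helper, not taken from the reproduction archive.
// Forces a collection when node was started with --expose-gc,
// then returns the used heap size in MiB.
function measureHeapMiB(): number {
  const maybeGc = (globalThis as unknown as { gc?: () => void }).gc;
  if (typeof maybeGc === "function") {
    maybeGc(); // without --expose-gc this branch is skipped
  }
  return process.memoryUsage().heapUsed / (1024 * 1024);
}

// Hypothetical helper: logs the heap size at a fixed interval.
// The caller keeps the returned timer and clears it when done.
function logHeapEvery(intervalMs: number): ReturnType<typeof setInterval> {
  return setInterval(() => {
    console.log(`heap used: ${measureHeapMiB().toFixed(2)} MiB`);
  }, intervalMs);
}
```

Forcing the collection first means that any growth still visible between log lines comes from objects that are still reachable, rather than from garbage that has not been collected yet.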

Environment

  • OS name, version and architecture: Darwin Kernel Version 19.0.0: Wed Sep 25 21:13:50 PDT 2019; root:xnu-6153.11.26~4/RELEASE_X86_64 x86_64
  • Node version: v10.16.3
  • Node installation method: nvm
  • If applicable, compiler version: -
  • Package name and version: @grpc/grpc-js v0.6.9

Additional context

This might be a root cause of https://github.com/googleapis/nodejs-firestore/issues/768.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

@merlinnot thank you for your awesome work isolating this 🎉

I personally don’t have a full vertical overview of all of the components and their internals, so I’m trying to help the best way I can: by providing reproductions, contributing fixes for issues I’m able to identify myself, and opening issues for discussion, which usually have good reasoning behind them, like the one linked above.

What I understood from the internals of Firestore is that there is a pool of clients, but clients are sometimes dereferenced without properly closing them, see https://github.com/googleapis/nodejs-firestore/blob/a4efa097ef8de9ca4944356ab8767ddaa94c4188/dev/src/pool.ts.
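
To illustrate the failure mode described above, here is a minimal sketch of a pool that closes a client before dropping its reference. It is not the Firestore implementation in pool.ts: the `Closable` interface and the `ClosingPool` class are hypothetical, and the only assumption is that the pooled client exposes a synchronous `close()` method, as @grpc/grpc-js clients do.

```typescript
// Hypothetical interface: stands in for any client that owns a channel.
interface Closable {
  close(): void;
}

// Hypothetical pool: every client it hands out is closed before the
// pool forgets about it, so no open channel is left to the GC.
class ClosingPool<T extends Closable> {
  private clients: T[] = [];

  constructor(private readonly factory: () => T) {}

  acquire(): T {
    const client = this.factory();
    this.clients.push(client);
    return client;
  }

  // Close first, then drop the reference. Dropping the reference alone
  // would leave the underlying sockets open.
  release(client: T): void {
    const index = this.clients.indexOf(client);
    if (index !== -1) {
      this.clients.splice(index, 1);
      client.close();
    }
  }

  size(): number {
    return this.clients.length;
  }
}
```

Calling `release` twice with the same client is a no-op the second time, which keeps `close()` from being invoked on an already closed client.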

Please see this thread (which was not opened by me), where many people report massive memory leaks when using newer versions of Firestore. My experience is the same. I’m using VMs with 28.8 GB of RAM for processes that I’d normally expect to run on ~2-3 GB. This generates much higher costs. Others have the same problem: https://github.com/googleapis/nodejs-firestore/issues/768#issuecomment-554606861.

I tested the same process after mocking the library (so it didn’t even load its files) and the leak was gone, so I’m fairly certain it’s somewhere there.

At this point I can’t tell if that’s an issue with gRPC itself, Firestore or any other of its dependencies. I’m just trying to hammer out all of the leaks I see, step by step. I do understand that you have the knowledge to assess how much of an impact a particular leak might have, but I don’t. So as soon as I find one, I report it, get it fixed and look for the next one.

On it! I’ll catch up with some critical tasks first and will test it straight away, ~45 minutes.