azure-sdk-for-js: [Cosmos DB] Memory leak
- Package Name: @azure/cosmos
- Package Version: 3.14.1
Describe the bug
A memory leak was introduced between versions 3.9.2 and 3.14.1. We know this because we started noticing this pattern on the memory consumption of our web apps, after just bumping the library. We deployed the update on November 20.
To Reproduce
Run the following snippet twice: first with version = '3.9.2'
and then with version = '3.14.1'
. It will run the same query 100 times and take heap snapshots before and after:
const version = '3.9.2';
// const version = '3.14.1';
const { CosmosClient } = require(`./cosmos-${version}/node_modules/@azure/cosmos`);
const heapdump = require('heapdump');
async function doit() {
const client = new CosmosClient({
endpoint: 'ENDPOINT',
key: 'KEY'
});
const container = client.database('people')
.container('people');
const res = await container.items.readAll().fetchAll();
return res.resources.map(p => p.id);
}
async function main() {
for (let i = 0; i < 5; i++) {
await doit();
}
heapdump.writeSnapshot(`dump-${version}-1.heapsnapshot`);
for (let i = 0; i < 100; i++) {
await doit();
}
heapdump.writeSnapshot(`dump-${version}-2.heapsnapshot`);
}
main()
Here are the results: heaps.zip
3.9.2 - Nothing extraordinary
3.14.1 - A lot of x100
allocations, indicating a leak
Expected behavior
No leaks.
Additional context
We did further analysis and found the culprit commit: https://github.com/Azure/azure-sdk-for-js/commit/d3647f86a9f8e29f6376ef1e04015dd5e94d63a7, which introduced the issue in 3.12.0. This commit quietly changes the documentation on connectionPolicy.enableEndpointDiscovery
to say:
Flag to enable/disable automatic redirecting of requests based on read/write operations. Default true. Required to call client.dispose() when this is set to true after destroying the CosmosClient inside another process or in the browser.
This points to a workaround we can use today: to call client.dispose()
on clients we no longer need. I verified that it does indeed avoid the leak. We will do that immediately on our service.
But the fact remains:
- This option should not be the default as it stands since it causes a severe leak;
- Or, alternatively, this dramatic change in behavior would warrant a major version bump and documentation highlights to remember user to call
client.dispose()
.
Note that today there is no reference to the client.dispose()
method on the npm page or the Getting Started guide.
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 8
- Comments: 16 (8 by maintainers)
@sajeetharan what is it related to? Can you forward this issue to the right location? This is a high quality report that identifies a key problem and we should not lose track of it.
I think there should at least be better highlighting of this issue in the samples/README. Eventually this might be a good use case for this proposed ECMAScript feature: https://github.com/tc39/proposal-explicit-resource-management
Here’s the outcome of applying the workaround:
Oh hi @xirzec! I didn’t even click the profile to see who this was! Long time no see!
Relevant comment: https://github.com/Azure/azure-sdk-for-js/pull/15781#discussion_r655670690
cc @xirzec