apollo-client: Memory leak on SSR after upgrading from 2.x to 3.3.13
We recently decided to upgrade apollo client from 2.6 to 3.3.13. After releasing it to production we saw a huge increase in memory usage on the server side. Rolling back to the old 2.6 version solved the problem.
Here is what happened in production while we were on 3.3.13 (we tried to solve the memory leak, gave up after 3 days, and rolled back to 2.6):
Here is what we saw in the heapdumps (comparison view, filtered to objects that were not deleted):
It seems like a lot of “Entry” instances are not garbage collected (deleted: 0). Here is the dependency tree; maybe someone can help spot where the problem is:
Any help will be appreciated 🙏
Heapdumps:
- https://drive.google.com/file/d/10DnUSdyXN5q030bbcVXmgNpWWcUeVfy0/view?usp=sharing
- https://drive.google.com/file/d/1a05z2oty1i5foYvBkHK_T6uWdHwdPmw4/view?usp=sharing
Briefly, about our setup:
- SSR (Node.js, React, apollo)
- Server side code is bundled with webpack
What we have tried:
- `resultCaching: false` config for the InMemoryCache
- Manually calling `client.stop()` and `client.clearStore()` after each request (see the sketch after this list)
- Removed all `readFragment` calls
- Different Node.js versions: 12.8, 14.16
- Increased max old space size (heap size)
- Downgraded to apollo 3.1.3
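For context, a rough sketch of what that cleanup attempt looked like (simplified, not our real server code; the endpoint URI is a placeholder):

```ts
import { ApolloClient, InMemoryCache, HttpLink, NormalizedCacheObject } from "@apollo/client";
import fetch from "cross-fetch";

// One client per request, with result caching disabled (one of the attempts above).
function createRequestClient(): ApolloClient<NormalizedCacheObject> {
  return new ApolloClient({
    ssrMode: true,
    link: new HttpLink({ uri: "http://localhost:4000/graphql", fetch }), // placeholder endpoint
    cache: new InMemoryCache({ resultCaching: false }),
  });
}

// Attempted cleanup after every request finished rendering.
async function cleanUpAfterRequest(client: ApolloClient<NormalizedCacheObject>): Promise<void> {
  await client.clearStore(); // drop all cached data for this request
  client.stop();             // stop the client's internal query manager
}
```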
Versions
- OS - alpine3.11
- @apollo/client - 3.3.13
- node.js - 14.16.0
- react - 16.8.2
- webpack - 4.35.2
- graphql - 15.5.0
- graphql-tag - 2.12.0
About this issue
- State: closed
- Created 3 years ago
- Reactions: 41
- Comments: 49 (12 by maintainers)
Commits related to this issue
- Failing test of RenderPromises QueryInfo garbage collection. This is my best guess at a reproduction of issue #7942, given the heap snapshot traces provided by @AlexMost. — committed to apollographql/apollo-client by benjamn 3 years ago
- Call renderPromises.clear() at end of getDataFromTree work. Should fix #7942 and the tests I added in my previous commit. — committed to apollographql/apollo-client by benjamn 3 years ago
- Avoid collecting more data in stopped RenderPromises objects. Testing a theory related to issue #7942 that QueryData hangs onto the context object and might attempt to call context.renderPromises met... — committed to apollographql/apollo-client by benjamn 3 years ago
- Clear `InMemoryCache` watches when `cache.reset()` called. This should help with memory management during SSR, per my comment https://github.com/apollographql/apollo-client/issues/7942#issuecomment-9... — committed to apollographql/apollo-client by benjamn 3 years ago
Hello, is this issue still present in the latest version? I couldn’t find any release with a fix for this.
Tried `fetchPolicy: 'no-cache'`. This reduces memory and CPU consumption significantly, but the memory leak is still present 😢

Currently we have a test setup with apollo3 and mirrored traffic from production, so I can check any hypothesis very quickly. Any suggestion or help will be much appreciated!
The problem for us was that the `stringifyCanon` and `stringifyCache` globals in `object-canon.js` grow without bound even when caching is disabled. The fix was to call `ApolloClient.clearStore` whenever we create the client:
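(The snippet referenced above wasn’t captured in this thread; roughly, and with placeholder link and cache options, the workaround amounts to something like this:)

```ts
import { ApolloClient, InMemoryCache, HttpLink } from "@apollo/client";
import fetch from "cross-fetch";

function createApolloClient() {
  const client = new ApolloClient({
    ssrMode: true,
    link: new HttpLink({ uri: "http://localhost:4000/graphql", fetch }), // placeholder endpoint
    cache: new InMemoryCache(),
  });

  // Per the report above: clearing the store right after creating the client
  // kept the stringifyCanon / stringifyCache globals from growing without bound.
  // (clearStore returns a promise; fire-and-forget is fine here.)
  void client.clearStore();

  return client;
}
```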
Perhaps this can help: I ran into a possibly related issue using Next.js where I set things up so that the server does not initialize a new client for each request except within the `getServerSideProps` and `getStaticProps` functions (because I wanted to see if I could get things to work that way). The idea was that, in `getServerSideProps` and `getStaticProps`, we clearly need a fresh ApolloClient instance for each request, but within the `App` where `ApolloProvider` is rendered I wanted the `client` prop to be a client instance that is created only once (call this the ApolloClient with the frontend configuration, i.e., `ssrMode: false`). However, SSR requires creating an instance of an ApolloClient with the frontend configuration to render the app server-side, but it doesn’t actually use this particular ApolloClient instance for anything, because queries are executed as an effect, not during initial rendering. This also means that, in my setup, even though the same ApolloClient instance on the server side is reused across requests, it holds no cached data and therefore can’t leak data from one user’s request to another’s, which I was at first worried about. OK, I was pleased.

However, as I inspected the client’s `cache` property across requests using a debugger, I noticed that the `watches` set kept growing indefinitely. For every page load, all the queries that were created during initial rendering were added to this `watches` set. The memory leak came from my original configuration of the ApolloClient meant for the frontend but created by the server for SSR; after I changed that configuration, the `watches` stopped accumulating.

In short, this may be entirely different from what this GH issue is about, but that’s not really clear. I’m finding that an ApolloClient instance may be used across requests so long as the cache is disabled with `fetchPolicy: 'no-cache'`.
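(The exact before/after configurations aren’t shown above; assuming the change was to make `'no-cache'` the default fetch policy, the shared client used for SSR would look roughly like this:)

```ts
import { ApolloClient, InMemoryCache, HttpLink } from "@apollo/client";
import fetch from "cross-fetch";

// Shared across requests on the server; reuse is only safe here because the
// default fetch policies keep query results out of the cache entirely.
export const sharedSsrClient = new ApolloClient({
  ssrMode: false, // the "frontend configuration" described above
  link: new HttpLink({ uri: "http://localhost:4000/graphql", fetch }), // placeholder endpoint
  cache: new InMemoryCache(),
  defaultOptions: {
    watchQuery: { fetchPolicy: "no-cache" },
    query: { fetchPolicy: "no-cache" },
  },
});
```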
Hi everyone!
At this point, this issue is covering quite a few different behaviors and bugs - some of which have been fixed, some might still be open. To be honest: tickets like this become very hard for us maintainers to take action on, and every time we start debugging with a user, everyone else gets spammed with messages that don’t really impact their problem.
So I’m going to close this issue and want to ask everyone still experiencing memory-related problems to open a new, individual ticket.
That will make it much easier for us to track and take action on the individual problems.
Thank you for your understanding.
One last thing I want to add to this ticket, because it might be useful for a bunch of you:
If you see an increase in memory consumption, that might not necessarily be a memory leak!
Apollo Client internally uses `WeakMap` quite a lot to store “data related to objects”. The `canonicalStringify` mentioned up there is one example of that. Once the original object is garbage-collected, the `WeakMap` will also mark the “metadata” we assigned to it for garbage collection - but when exactly it is actually garbage-collected is outside of our control. That means that if Node sees a lot of free memory, it might just decide that garbage collection is not a priority right now and delay it. As soon as memory reaches some internal limit, that memory will be garbage-collected, though.

You can compare this with a Linux server that will over time always reach 100% memory consumption, because it keeps things like the disk read cache around, as that is cheaper than actively removing it from memory - but as soon as more memory is required, those caches are evicted, and there will always be enough free memory.
You can test whether that is the case in your scenario by starting `node` with `--expose-gc` and manually calling `global.gc()`. If memory usage then drops to an expected level, you don’t have a memory leak, just a lazy garbage collector.
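For example, a small check like the following (assuming the process was started as `node --expose-gc server.js`) makes the difference visible:

```ts
// gc() is only available as a global when the process was started with --expose-gc.
function logHeapUsed(label: string): void {
  const mb = process.memoryUsage().heapUsed / 1024 / 1024;
  console.log(`${label}: ${mb.toFixed(1)} MB heap used`);
}

const gc = (globalThis as { gc?: () => void }).gc;

logHeapUsed("before gc");
if (typeof gc === "function") {
  gc(); // force a full collection
} else {
  console.warn("run node with --expose-gc to enable global.gc()");
}
logHeapUsed("after gc");
```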
Great news 🎉 we have finally updated apollo-client to `3.6.0`.

Big thanks to @matthewnourse for the suggestion - https://github.com/apollographql/apollo-client/issues/7942#issuecomment-1057517664! That fixed the memory leak for us: so, somehow, calling `clearStore` right after the client creation fixes the memory leak issue.

Hi, is there any news regarding this topic?
I was looking through the objects inside the heapdump, filtered objects by “Retained size”, and found one strange thing; maybe this will help. A significant amount of space is retained by ObservableQuery instances, and one of them is much bigger than the others: `5 487 912`.

Here is the flagsValuesQuery that was inside this object (maybe this will help).
Hi @benjamn! Thanks for the update, going to test `@apollo/client@3.4.0-beta.22` ASAP. As for the 🗑️ button, we used the `heapdump` package to collect heapdumps from Kubernetes pods. We can’t actually use devtools there.
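(As an aside, for anyone else in the same situation: as an alternative to the `heapdump` package, Node’s built-in `v8` module can write a snapshot from inside the process; the signal handler below is just an illustration.)

```ts
import { writeHeapSnapshot } from "v8";

// Write a .heapsnapshot file when the pod receives SIGUSR2, e.g. via
// `kill -USR2 <pid>` from a shell inside the container. The file can then be
// copied out of the pod and loaded into Chrome DevTools for analysis.
process.on("SIGUSR2", () => {
  const file = writeHeapSnapshot(); // returns the generated file name
  console.log(`heap snapshot written to ${file}`);
});
```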
Hi @benjamn! We are following this recommendation: every request creates its own ApolloClient and InMemoryCache instances. Actually, you can see that there are 57 ApolloClient instances in the heapdump. Also, there are +1k new MissingFieldError instances. I tried to investigate that, found an issue with readFragment, and tried to remove all readFragment usages, but that didn’t help.
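(For reference, a minimal sketch of that per-request pattern; `App` and the endpoint URI are placeholders:)

```tsx
import React from "react";
import { renderToString } from "react-dom/server";
import { ApolloClient, ApolloProvider, HttpLink, InMemoryCache } from "@apollo/client";
import { getDataFromTree } from "@apollo/client/react/ssr";
import fetch from "cross-fetch";
import { App } from "./App"; // placeholder application component

// A fresh client and cache for every incoming request.
export async function renderPage(): Promise<string> {
  const client = new ApolloClient({
    ssrMode: true,
    link: new HttpLink({ uri: "http://localhost:4000/graphql", fetch }), // placeholder endpoint
    cache: new InMemoryCache(),
  });

  const tree = (
    <ApolloProvider client={client}>
      <App />
    </ApolloProvider>
  );

  await getDataFromTree(tree); // execute queries during SSR
  const html = renderToString(tree);

  client.stop(); // release the per-request client once rendering is done
  return html;
}
```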
I use v3.7.14; the problem has not been observed since May.
Could you please share it with everyone?
Hi @hwillson! I haven’t had a chance to check newer versions of the apollo client yet; going to check this or next week and will inform you about the results ASAP.
Tried `3.3.14`, unfortunately it still leaks 😿

Heapdumps:
- https://drive.google.com/file/d/17JIq8fkbfWaSzYKWaebDOjrHCQQRahr6/view?usp=sharing
- https://drive.google.com/file/d/1DUnZRqss2eE1mLg7iyKs14NcM9DGbjW5/view?usp=sharing
@benjamn @hwillson ok, I will double-check that. But I’m sure I used the patched version of getDataFromTree. Will let you know when I have results with 3.3.14.
@AlexMost Adding to @hwillson’s comment, it’s possible you’re still using the v3.3.13 code in Node.js despite changing the `getDataFromTree.js` module, since Node uses the CommonJS bundles rather than separate ESM modules. Worth double-checking after `npm i @apollo/client@3.3.14`, I think!
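One quick way to verify which copy Node actually resolves at runtime (the output paths below are only examples):

```ts
// Run from inside the server process or a Node REPL in the same working directory.
const resolvedEntry = require.resolve("@apollo/client");
const { version } = require("@apollo/client/package.json");

console.log(`@apollo/client entry: ${resolvedEntry}`); // e.g. .../node_modules/@apollo/client/...
console.log(`@apollo/client version: ${version}`);     // should report 3.3.14 after the upgrade
```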
@gbiryukov we are using this approach. Each request creates a new client instance and cache. Sorry if that wasn’t clear from my previous messages.