jest: Memory allocation failure when running lots of tests with --runInBand in limited env (Travis)
This is a specific and concrete problem (I think) related to the performance concerns I voiced in #2171.
There’s a recommendation in #1742 and #2162 to use --runInBand
when running tests on Travis (or, presumably, other slower environments), in order to avoid memory problems that cause slowdowns and failures. When I use --runInBand
on a ton of tests (for stylelint), I do avoid the problems I have without that flag; however, I just end up running into another failure. Apparently Jest consumes all the memory and crashes:
PASS src/rules/time-no-imperceptible/__tests__/index.js
<--- Last few GCs --->
115688 ms: Scavenge 1389.0 (1458.0) -> 1389.0 (1458.0) MB, 0.5 / 0 ms (+ 1.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
117175 ms: Mark-sweep 1389.0 (1458.0) -> 1388.9 (1458.0) MB, 1486.6 / 0 ms (+ 2.4 ms in 2 steps since start of marking, biggest step 1.4 ms) [last resort gc].
118452 ms: Mark-sweep 1388.9 (1458.0) -> 1344.5 (1458.0) MB, 1277.1 / 0 ms [last resort gc].
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x3c92d50b4629 <JS Object>
2: /* anonymous */(aka /* anonymous */) [/home/travis/build/stylelint/stylelint/node_modules/esprima/esprima.js:35] [pc=0x2c59275e252] (this=0x3c92d50041b9 <undefined>,root=0x8ad703dc279 <an Object with map 0x2df09a5073e1>,factory=0x8ad703de6c9 <JS Function (SharedFunctionInfo 0x15effa467661)>)
3: /* anonymous */ [/home/travis/build/stylelint/stylelint/node_modules/esprima/esprima.js:39...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Aborted (core dumped)
About this issue
- State: closed
- Created 8 years ago
- Reactions: 5
- Comments: 38 (19 by maintainers)
As @bertho-zero mentioned, `jest --no-cache` does fix this, but for me it makes my tests run so much slower that it's not a great solution.

I have implemented the solution from @azz and @cpojer with success! https://github.com/facebook/jest/issues/2179#issuecomment-355231418

`jest.config.js`:

`test/helpers/fs.js`:
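The contents of those two files did not survive the copy here. A minimal sketch of the linked workaround — aliasing `graceful-fs` to the built-in `fs` via Jest's `moduleNameMapper` — under the assumption that the helper simply re-exports `fs` (the filenames are from the comment; the bodies are a reconstruction):

```js
// jest.config.js (sketch)
module.exports = {
  moduleNameMapper: {
    // Redirect every require("graceful-fs") to the shim below, so
    // graceful-fs never patches a second copy of itself per test sandbox.
    "graceful-fs": "<rootDir>/test/helpers/fs.js",
  },
};
```

```js
// test/helpers/fs.js (sketch)
module.exports = require("fs");
```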
Hi all, just chiming in. After doing some more investigation, it seems like `Runtime` is a bit out of control, and the issue might not be related to coverage. Even running a limited suite within the Jest repository quickly exposes this issue. Using the following command you can attach to the node process and take heap dumps:
I’m hoping that someone has more context as to how Runtime comes into play during a test run.
Here are my findings (the last column in each screenshot is "Retained Size"):

- It looks like there is a `Runtime` object created for every suite (just an assumption based on the number of `Runtime` objects).
- The `_moduleRegistry` and `_resolver` properties in each `Runtime` grow quickly and are never released.
- These properties also contain information about modules that are not relevant to the current suite (not sure if this is by design).
- The context in the majority of these objects seems to point to graceful-fs and `gracefulify`.

This issue seems like a deal breaker for teams that have a large number of tests. I'm hoping we can find a path to resolution soon.
Ohhhh, this is pretty interesting actually, because when you `require('fs')` in Jest, you actually get graceful-fs (otherwise you'd run out of file handles pretty quickly at the scale of FB). I think the problem here may be that when you are requiring `graceful-fs` inside of a test itself, it tries to require `fs`, which is a copy of graceful-fs. Maybe we should alias `graceful-fs` to the one from the parent context.

Just FYI, we had a very similar issue in the Prettier repo: `graceful-fs` was pulled in via `read-pkg-up`.
This commit works around the problem by aliasing `graceful-fs` to regular `fs`: https://github.com/prettier/prettier/pull/3624/commits/9eb3e0ad1c0d1c115c74ce1a0179777a8073c741

Ref: https://github.com/prettier/prettier/pull/3624#issuecomment-354945006
Hmm … never mind. When running those tests I forgot to remove our `jest-setup.js` script. When I did remove that script, (good news!) the heap size stayed constant at 33 MB every time. So that's good news for Jest; it suggests that our problem is somewhere in stylelint's own files or its dependencies.
I think I have some evidence that the memory problem is likely within Jest, rather than stylelint, because it looks to me like the same problem occurs with very simple tests.
I created 200 test files, each with 200 test cases, and each test case was nothing more than:
Running those tests, I see the same heap growth we saw in stylelint’s tests:
We have the same thing at FB, actually, so I don’t think your usage of describe/it is a problem.
Wow, this is great, thank you so much for the investigation. Here is a bunch more that you should think about:

- Run with `-i`, otherwise all the workers will just split the leaking and take longer to crash.
- Take heap dumps with `-i`, because the leak currently seems to happen in the worker processes and not in the main process; if you run without `-i`, you'll only capture the heap of the main process without any test runs. See https://github.com/node-inspector/v8-profiler

From the looks of it, it seems like one of two things, both equally likely:

- The `vm` module – I've had issues with that years ago on a different service I was building at FB. For some reason an "Error" object in a vm context was retaining every single JS module at FB and crashing our service after 100 requests. This was tough to track down.
- Code coverage – that definitely requires a ton of memory, and at FB we had to bump the memory that node can consume. @dmitriiabramov did a fantastic job rewriting code coverage support, but to make it really efficient we may need support from the community to make it more "streamy": store coverage on the file system and then process it as a stream at the end of a test run. Alternatively, it may also be possible to merge coverage information in the main process after each test has finished running. I don't know how feasible either of them is, though.
At FB we currently have one repo with about 5000 test suites and even with coverage we aren’t running into any big memory leaks which would hint at a leak issue in your code but I’m not entirely convinced and think it may be a Jest bug.