jest: Flaky failure mode when running tests
Do you want to request a feature or report a bug? Bug
What is the current behavior? Very rarely (as in we’ve only seen this happen once so far), we see a failure when running a test, like so:
FAIL __tests/test-file.js
● Test suite failed to run
b.js:94
/* istanbul ignore next */_react2.def
^^^
SyntaxError: missing ) after argument list
at transformAndBuildScript (node_modules/jest-runtime/build/transform.js:284:10)
at Object.<anonymous> (a.js:5:47)
where the reported line in a.js is an ES6 import of b.js. We have no istanbul ignore comments in our code base.
What is the expected behavior? I don’t expect this strange type of failure. Because it’s only happened once so far, I unfortunately have nothing useful for reproduction. I presume there’s a strange race condition between the transform and test execution, but I’m not sure exactly what.
Happy to provide any more info!
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 15
- Comments: 45 (29 by maintainers)
Commits related to this issue
- Test: Disable parallel Jest test execution Because of a bug in Jest: https://github.com/facebook/jest/issues/1874 — committed to styleguidist/react-styleguidist by sapegin 7 years ago
- use --runInBand for coverage ref: https://github.com/facebook/jest/issues/1874 — committed to image-js/image-js by targos 7 years ago
- Merge pull request #58 from fga-gpp-mds/addTestForFuckingSakeIDidNotAskToBeAMasochist Add test for fucking sake i did not ask to be a masochist Believe me the tests passing, watch keep making trav... — committed to fga-eps-mds/2017.1-Cadernos-APP by fabio1079 7 years ago
- Run coverage serially: --runInBand https://facebook.github.io/jest/docs/en/cli.html https://github.com/facebook/jest/issues/1874 — committed to bbarwick-fdg/redux-bug-reporter by bbarwick-fdg 6 years ago
- Run coverage serially: --runInBand https://facebook.github.io/jest/docs/en/cli.html#runinband https://github.com/facebook/jest/issues/1874 — committed to bbarwick-fdg/redux-bug-reporter by bbarwick-fdg 6 years ago
Yeah, I encountered the same thing, but very sporadically (maybe 1 in 30 or 40 test runs). It’s failing at the exact same location (transform.js:284:10) every time.

I’m still getting a lot of these errors at FB after the latest update. Seems like it’s still not fixed 😦
Also seeing another sporadic, random test failure in the transform with something that usually works.
I am also getting intermittent test failures. My team first noticed this at roughly the same time as we upgraded react, react-dom, and react-addons-test-utils from 0.14.3 to 15.4.0 and enzyme from 2.4.1 to 2.9.1 (maybe a few days after). I’m not sure if this is related or not. We have a test suite of about 2800 tests (not all of them test React components, though). At least one test fails very rarely on local workstations, but fairly often on our CI server. It does seem to be related to load on the workstation or server.
Here is an example of an error message:
Method “simulate” is only meant to be run on a single node. 0 found instead.
This is an enzyme error thrown when you try to run .simulate('click') on something that doesn’t exist. In this particular example, we run:
wrapper.find('li').simulate('click');
where wrapper is an enzyme.mount object, and get that error. However, when I console.log(mountedDOM.debug()), the li is clearly there. Most, if not all, of the tests that fail intermittently are trying to interact with an enzyme.mount object. Hopefully this provides some more data that will help diagnose this. We aren’t really sure whether it’s react, jest, enzyme, something in our project config, or something environment-related that is causing this problem.

I think we could release a new revision of jest-runtime@20.0 so that we get the fix everywhere. I’m a little stressed this could cause a regression, but at worst we can always revert.

That doesn’t look like the original issue; it’s something different, related to jsdom. The original issue was due to a filesystem write race condition. I suggest creating another issue and renaming this one to “Corrupted compile cache”.
Any update on this? I am getting intermittent failures on CircleCI (it seems to happen during times of load, and always when max workers are set to 1). I tried updating to 20.1.0-echo.1, but it does not seem to be working. I have to rebuild several times to get the tests to pass, even though they work every time on my laptop. Any help would be appreciated.
Please feel free to upgrade to jest@20.1.0-delta.3 which should contain a fix for this issue thanks to @jeanlauliac.
@Vanuan We have Docker on CI — everything runs in its own container so there are not multiple processes writing to the same file. And we do see this on CI frequently.
@jeysal we’re still running Jest v23 and we continue to use --runInBand on CI. If I get a free minute I can try out v24 and see if we still get failures in CI when running in parallel
I have the issue on my repository; it happens very, very often:
https://travis-ci.org/neoziro/recompact/builds/186529118
Not locally, but on Travis, so it looks like a race condition. I will try to fix it in Jest; hints are welcome.
@mcous thanks for the update 😃 I’ll close this then, hope nobody is running into this kind of problem anymore
@jeysal sorry for the delay here. We (finally) upgraded to v24 and removed --runInBand in CI. After running for a few weeks, we have not seen the flaky failures we used to get.

@jeanlauliac - Thank you so much for fixing this! It’s been broken for a while and I still sporadically hit the same problem, so I really appreciate it 😃
Yes. Or more like: if a process B is reading a file while A is writing it, then B will get a corrupted/partial file in memory. That could easily explain the syntax errors that are observed.
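To illustrate the kind of race being described, here is a small, self-contained sketch (hypothetical code, not taken from Jest; the file name and payload are made up): a forked child rewrites a file in a tight loop with fs.writeFileSync while the parent reads it, and the parent can observe a truncated payload.

const {fork} = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

const FILE = path.join(os.tmpdir(), 'cache-race-demo.js');
const CODE = '/* header */\n' + 'x'.repeat(1024 * 1024) + '\n/* footer */\n';

if (process.env.RACE_WRITER) {
  // Writer child: overwrite the file forever. writeFileSync truncates the file
  // and then writes ~1 MB, which is not atomic from a reader's point of view.
  for (;;) fs.writeFileSync(FILE, CODE);
}

fs.writeFileSync(FILE, CODE);
const writer = fork(__filename, [], {env: {...process.env, RACE_WRITER: '1'}});

// Reader parent: poll the file for a few seconds; a short read means we caught
// the writer mid-write. Whether this triggers is timing-dependent.
const deadline = Date.now() + 3000;
while (Date.now() < deadline) {
  const seen = fs.readFileSync(FILE, 'utf8');
  if (seen.length !== CODE.length) {
    console.log('truncated read: got ' + seen.length + ' of ' + CODE.length + ' bytes');
    break;
  }
}
writer.kill();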
There is no need for a mutex to read/write the cache, because we can leverage rename(3) (and equivalents) being an atomic operation. Writes should be atomic, and reads should verify the validity of the data against a hash.
Using a lock like in #3561 can work too, but, being more complicated, it is susceptible to additional concurrency issues. Besides, reading the data also needs to lock the same mutex, because you want to make sure you read fully-written files. Regardless, a checksum needs to be added to verify the validity of the data.
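As a rough sketch of that approach (not Jest’s actual implementation; the function names here are illustrative), a write can go to a temporary file in the same directory and then be renamed into place, while a read validates a checksum so a corrupted or truncated entry is treated as a cache miss rather than being executed:

const crypto = require('crypto');
const fs = require('fs');

function writeCacheFile(cachePath, code) {
  const checksum = crypto.createHash('md5').update(code).digest('hex');
  // Write to a temp file in the same directory so the final rename stays on the
  // same filesystem; rename is then atomic and readers never see a partial file.
  const tmpPath = cachePath + '.' + process.pid + '.' + crypto.randomBytes(4).toString('hex');
  fs.writeFileSync(tmpPath, checksum + '\n' + code);
  fs.renameSync(tmpPath, cachePath);
}

function readCacheFile(cachePath) {
  let data;
  try {
    data = fs.readFileSync(cachePath, 'utf8');
  } catch (e) {
    return null; // no cache entry yet
  }
  const newline = data.indexOf('\n');
  const expected = data.slice(0, newline);
  const code = data.slice(newline + 1);
  const actual = crypto.createHash('md5').update(code).digest('hex');
  // A checksum mismatch means the entry is corrupted; treat it as a cache miss
  // and let the caller re-run the transform.
  return actual === expected ? code : null;
}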
Probably the best thing we can do now is --no-cache. But there should be a comparison of whether --runInBand is slower.

I found a workaround to fix it: force execution in the same thread using the -i or --runInBand option.
https://facebook.github.io/jest/docs/troubleshooting.html#tests-are-extremely-slow-on-docker-and-or-continuous-integration-server
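For reference, a typical way to apply this workaround (the script names here are illustrative) is to keep parallel runs locally and force serial execution only on CI, for example in package.json:

"scripts": {
  "test": "jest",
  "test:ci": "jest --runInBand --coverage"
}

This trades CI run time for stability; the --no-cache option mentioned above is an alternative with its own performance cost.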