pants: Pants test timeout includes venv creation overhead causing timeouts
Describe the bug
When a python_tests target has heavy dependencies (e.g. pytorch), a test can time out on a “cold” execution, because the first execution of the test incurs additional overhead to create the venv. This can be significant: we observe a ‘test time’ of >30s for a test that takes 3s with a warm cache.
One solution would be to exclude this venv creation time from the test timeout and measure only the actual execution time of the Python process. This would make the timeout argument for python_tests more useful when large, complex dependencies are used.
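As a rough illustration of that proposal, the sketch below runs a setup step outside the timed region and applies the timeout only to the test process itself. It is not how Pants implements timeouts; `run_with_execution_only_timeout`, `setup`, and `test_cmd` are hypothetical names used only for this example.

```python
import subprocess
import time


def run_with_execution_only_timeout(setup, test_cmd, timeout_s):
    """Run `setup` untimed, then run `test_cmd` with a timeout of `timeout_s` seconds.

    `setup` stands in for venv creation (hypothetical callable); only the
    subprocess started from `test_cmd` counts against the timeout.
    """
    setup_start = time.perf_counter()
    setup()  # measured for reporting, but never checked against the timeout
    setup_s = time.perf_counter() - setup_start

    run_start = time.perf_counter()
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        completed = subprocess.run(test_cmd, timeout=timeout_s)
        timed_out = False
        returncode = completed.returncode
    except subprocess.TimeoutExpired:
        timed_out = True
        returncode = None
    run_s = time.perf_counter() - run_start

    return {
        "setup_s": setup_s,
        "run_s": run_s,
        "timed_out": timed_out,
        "returncode": returncode,
    }
```

With this split, a slow cold setup lengthens `setup_s` but cannot trigger the timeout; only a slow test process can.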
I'm not sure whether there is also a performance component that can be solved; if so, that should potentially be an issue in the pex repository.
Pants version: 2.16.0 and 2.17.0rc5
OS: macOS + Linux
Additional info: See gist for example: https://gist.github.com/JoostvDoorn/9c0f63ed5198544a36b477502eeac4fb
To test:
rm -rf ~/.cache/pants/named_caches/
pants test :: --keep-sandboxes=on_failure --test-force
✓ //test_file.py:root succeeded in 23.09s.
A second execution of pants test :: --keep-sandboxes=on_failure --test-force is significantly faster:
✓ //test_file.py:root succeeded in 3.12s.
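The cold/warm comparison above could be scripted roughly as follows. This is a minimal sketch: the helper names are hypothetical, the cache path and Pants flags are copied from the commands above, and the wall-clock times it reports include Pants' own startup overhead, so they will not match the per-test times Pants prints.

```python
import shutil
import subprocess
import time
from pathlib import Path

# Command taken from the reproduction steps above.
PANTS_CMD = ["pants", "test", "::", "--keep-sandboxes=on_failure", "--test-force"]


def clear_pants_named_caches():
    """Python equivalent of `rm -rf ~/.cache/pants/named_caches/`."""
    shutil.rmtree(Path.home() / ".cache" / "pants" / "named_caches", ignore_errors=True)


def time_command(cmd):
    """Run `cmd` and return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd)
    return time.perf_counter() - start, completed.returncode


def compare_cold_and_warm(cmd, clear_cache):
    """Clear the cache, then run `cmd` twice: once cold, once warm."""
    clear_cache()
    cold_s, cold_rc = time_command(cmd)
    warm_s, warm_rc = time_command(cmd)
    return {
        "cold_s": cold_s,
        "cold_returncode": cold_rc,
        "warm_s": warm_s,
        "warm_returncode": warm_rc,
    }


# Example (run from the repository root; deletes the named caches first):
# print(compare_cold_and_warm(PANTS_CMD, clear_pants_named_caches))
```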
The bug here is that the test execution timing should reflect the actual time it takes to execute the test, and not the creation of the virtual environment.
For the gist example, the expectation is that the following target would always pass, regardless of the state of the cache:
python_tests(
    name="root",
    timeout=15,
)
About this issue
- State: open
- Created 10 months ago
- Comments: 28 (24 by maintainers)
Commits related to this issue
- Update to Pex 2.1.155 (#20347) All changes: - https://github.com/pantsbuild/pex/releases/tag/v2.1.153 - https://github.com/pantsbuild/pex/releases/tag/v2.1.154 - https://github.com/pantsbuild/pe... — committed to pantsbuild/pants by huonw 6 months ago
- Upgrade to PEX 2.1.156 (#20391) https://github.com/pantsbuild/pex/releases/tag/v2.1.156 Continuing from #20347, this brings additional performance optimisations, particularly for large wheels lik... — committed to pantsbuild/pants by huonw 6 months ago
Ok, the lion’s share (~90%) of the remaining time is creating the (compressed) --layout packed wheel zips. So the overall time is dominated by the long pole of downloading torch (within a parallel download of all needed wheels) and the zip compression of the packed wheel zips. The latter is serial, so it could be parallelized - but stepping back and trying a native zip of just the torch installed wheel shows:
So there is not much to be gained there in this case.
Ok, makes sense from a clean Pex cache. I’m not sure there is any way to speed up venv creation from a cold start save for populating the venv site-packages in parallel. That step is currently serial. I can give that a quick try in the am and report timing differences for your requirement set case in the gist.
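To make the idea of populating site-packages in parallel concrete, here is a minimal sketch that copies several already-installed wheel directories into one site-packages directory using a thread pool. It is not Pex's implementation: `populate_site_packages`, its parameters, and the use of plain copies are assumptions made for illustration, and it needs Python 3.8+ for `dirs_exist_ok`.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def populate_site_packages(installed_wheel_dirs, site_packages, max_workers=4):
    """Copy the contents of each installed wheel dir into `site_packages` concurrently.

    Hypothetical helper: each entry in `installed_wheel_dirs` is assumed to be a
    directory whose contents belong directly under site-packages.
    """
    site_packages = Path(site_packages)
    site_packages.mkdir(parents=True, exist_ok=True)

    def copy_one(wheel_dir):
        # dirs_exist_ok=True lets several wheels merge into the same target tree.
        shutil.copytree(wheel_dir, site_packages, dirs_exist_ok=True)
        return Path(wheel_dir).name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises any copy error here.
        return list(pool.map(copy_one, installed_wheel_dirs))
```

Whether this helps in practice depends on the workload: if a single very large wheel such as torch dominates, parallelism across wheels can only hide the time spent on the smaller ones.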