pants: Interpreter constraints do not play well with lock files.
Say I have interpreter constraints “CPython>=3.6,<4” in play and I want to generate a lockfile for a resolve. There needs to be a lockfile for each interpreter these constraints select, since requirements can have environment markers and these can cause an interpreter-specific resolve. Since environment markers are so broad in scope, different patch versions of an interpreter can have different resolves, and even the same patch version on the same platform can have a different resolve due to fields like `platform_version` and `platform_release`: https://www.python.org/dev/peps/pep-0508/#environment-markers
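To make the breadth of those markers concrete, here is a minimal stdlib-only sketch, modeled on the reference snippet in PEP 508, that computes the marker values for whichever interpreter runs it. The function names `format_full_version` and `marker_environment` are illustrative choices, not Pants or Pex APIs.

```python
import os
import platform
import sys


def format_full_version(info):
    """Render a sys.implementation.version-style tuple the way PEP 508 describes."""
    version = f"{info.major}.{info.minor}.{info.micro}"
    if info.releaselevel != "final":
        # e.g. "3.10.0b1" for a beta release
        version += info.releaselevel[0] + str(info.serial)
    return version


def marker_environment():
    """Return the PEP 508 environment marker values for the running interpreter."""
    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "python_full_version": platform.python_version(),
        "implementation_name": sys.implementation.name,
        "implementation_version": format_full_version(sys.implementation.version),
    }


if __name__ == "__main__":
    for name, value in sorted(marker_environment().items()):
        print(f"{name} = {value!r}")
```

Running this under two interpreters, or under the same interpreter on two machines, shows which values differ. Any requirement whose marker tests one of the differing values can resolve differently between them.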
Put more simply, if I have a concrete interpreter in hand I can run a resolve for it and then generate a lockfile for it. If I don’t, I can’t*.
With interpreter constraints I may only have a subset of the possible interpreters, though, and so I can only generate a subset of the needed lockfiles. In the leading example, say I try to create the lockfile on a machine with just CPython 3.6.5 on June 13th, 2021. I will generate a lockfile for that interpreter and check it in. Now, say, 2 months later I go back to that commit to re-build things, but on a machine with only CPython 3.9.1. I have no lockfile for that interpreter, and so it will need to be regenerated. The problem here is that the new lockfile may not be close at all to the one generated 2 months ago. Many new versions of distributions may have been published, and this can result in the new CPython 3.9.1 lockfile behaving quite differently than the CPython 3.6.5 lockfile. Towards the worst end of this, the new behavior could be buggy or broken.
* Technically there may be a way to perform a much too large resolve that ignores environment markers and some wheel tags to collect all possible distributions needed for all possible interpreters in an IC range.
About this issue
- State: closed
- Created 3 years ago
- Comments: 19 (19 by maintainers)
Commits related to this issue
- Add experimental tool lockfiles for Black, Isort, Yapf, Coverage.py, Lambdex, and Protobuf MyPy (#12357) These tools are simple to support: they run entirely independently of user code, and they get ... — committed to pantsbuild/pants by Eric-Arellano 3 years ago
- [Internal] Add `InterpreterConstraints.partition_by_major_minor_versions` (#12371) To robustly handle lockfiles and interpreter constraints (https://github.com/pantsbuild/pants/issues/12200), we shou... — committed to pantsbuild/pants by Eric-Arellano 3 years ago
- [internal] Use Poetry (for now) to generate lockfiles rather than pip-compile (#12549) **Disclaimer**: This is not a formal commitment to Poetry, as we still need a more rigorous assessment it can ha... — committed to pantsbuild/pants by Eric-Arellano 3 years ago
This example I have lying around only has 1 lock, but I think you get the idea:
Not quite: `lint`/`test` use https://github.com/pantsbuild/pants/blob/b82a01ed0bffb5df09a877528efff4c44a6206a8/src/python/pants/backend/python/util_rules/pex.py#L125-L128, which has Pants choose a single interpreter to use in order to bypass that fanout.
Agreed on the rest.
You shouldn’t be. Pex already resolves for every discovered interpreter on a machine that fits an IC range today (in parallel). We should be able to hit exactly that perf profile here too.
Maybe, but I think there is no way to avoid the fact that there simply will be use cases that use large ranges. We cannot tell those folks “don’t do that”.
I really think we should do post-resolve processing of dist-info/METADATA and use Requires-Python and Requires-Dist metadata to exactly determine the breadth of validity of a lock. I derailed things a bit with the “cover the range” terminology. It’s actually about covering the environment range as selected by environment markers. It’s just that the most common environment marker to use only picks out Python minor versions (`python_version`). Whether we warn or fail can be debated, but certainly that can be a Pants option.
Yes. A lot of design effort has been focused on UX. There are actual thorny, fundamental “does it even work” issues to sort out, though, before even getting to that. A swing in focus is needed to make sure we ship something that at base works.
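As a rough illustration of the post-resolve METADATA processing suggested above, the following sketch reads a distribution's `METADATA` text with the stdlib `email` parser and reports its `Requires-Python` value plus which environment marker variables its `Requires-Dist` entries test. The function name `lock_sensitivity` and the sample metadata are hypothetical, and the marker scan is a simple regex rather than a full PEP 508 parser, so treat it as a starting point only.

```python
import re
from email import message_from_string

# Environment marker variable names defined by PEP 508 (plus the special "extra").
MARKER_NAMES = (
    "os_name", "sys_platform", "platform_machine",
    "platform_python_implementation", "platform_release", "platform_system",
    "platform_version", "python_version", "python_full_version",
    "implementation_name", "implementation_version", "extra",
)
_MARKER_RE = re.compile(r"\b(" + "|".join(MARKER_NAMES) + r")\b")
_QUOTED_RE = re.compile(r"\"[^\"]*\"|'[^']*'")


def lock_sensitivity(metadata_text):
    """Return (Requires-Python value or None, set of marker names used)."""
    msg = message_from_string(metadata_text)
    requires_python = msg.get("Requires-Python")
    used = set()
    for requirement in msg.get_all("Requires-Dist") or []:
        _, sep, marker = requirement.partition(";")
        if sep:
            # Drop quoted literals so values like "win32" are never mistaken for names.
            used.update(_MARKER_RE.findall(_QUOTED_RE.sub("", marker)))
    return requires_python, used


# Hypothetical METADATA content, for illustration only.
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example
Version: 1.0
Requires-Python: >=3.6
Requires-Dist: dataclasses; python_version < "3.7"
Requires-Dist: pywin32 (>=1.0); sys_platform == "win32"
Requires-Dist: requests
"""

if __name__ == "__main__":
    print(lock_sensitivity(SAMPLE_METADATA))
```

If the union of marker names across every distribution in a lock is only `python_version`, the lock's validity can be reasoned about per minor version. If names such as `platform_release` show up, the lock is sensitive to much finer-grained differences between environments.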
Using #12312 for illustration on Pants itself, I want to highlight what your declaration of equivalency actually means in practice for a (I assume we agree) “reasonable” IC range, that of Pants itself:
Afaict this is no bueno. Not from a locked resolve standpoint (I don’t want dep drift to shoot me in the foot tomorrow) and certainly not from a security (supply chain) standpoint.