pip: Implement preferring closest conflicting causes for backtracking resolution
What’s the problem this feature will solve?
When backtracking, Pip tries to optimize its choice of which package to backtrack on by first choosing packages that are a “cause” of the backtracking, i.e. packages whose requirements conflict with other packages’ requirements.
However, resolvelib doesn’t have a very granular understanding of “causes”: if two packages conflict on their numpy requirement, then all packages which require numpy, even those with a very liberal requirement, will be considered a “cause”.
This can create an exponential number of choices when picking which package to backtrack on, and ultimately a ResolutionTooDeep error, e.g. https://github.com/pypa/pip/issues/12305
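To make the scale of the problem concrete, here is a rough back-of-the-envelope sketch. It is not resolvelib code: the function name and the numbers are hypothetical, and it only counts the worst-case number of version combinations a backtracking search could visit when every flagged package is treated as a candidate to backtrack on.

```python
def worst_case_combinations(versions_per_package):
    """Upper bound on the version combinations a backtracking search may try."""
    total = 1
    for count in versions_per_package:
        total *= count
    return total


# Hypothetical numbers: assume 10 candidate versions for every package.
# Only the two packages whose numpy requirements really conflict:
true_conflict = worst_case_combinations([10, 10])
# Every package that merely requires numpy (assumed to be 8 here):
broad_causes = worst_case_combinations([10] * 8)

print(true_conflict, broad_causes)
```

The bound grows as a power of the number of packages treated as causes, which is why trimming the cause list matters far more than trimming the number of versions per package.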
Describe the solution you’d like
The “Requirement” API that resolvelib provides does not appear to be enough to apply any filtering logic on the resolvelib side.
However, Pip understands its own requirement objects and could therefore implement a “narrow_causes” / “filter unsatisfied names” method in Pip’s resolver provider that resolvelib can call, letting Pip choose the closest causes that actually conflict.
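A minimal sketch of the idea, under loud assumptions: `ToyRequirement` and this `narrow_causes` function are hypothetical illustrations, not the resolvelib or Pip API, and a requirement is modelled simply as the set of versions it allows. The filter keeps only the causes whose allowed versions are disjoint from some other cause on the same package, and falls back to the full list if no pairwise conflict is found.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List


@dataclass(frozen=True)
class ToyRequirement:
    """Hypothetical stand-in for a provider's requirement object."""
    parent: str               # package that declares the requirement
    name: str                 # package being required
    allowed: FrozenSet[str]   # versions that satisfy the requirement


def narrow_causes(causes: List[ToyRequirement]) -> List[ToyRequirement]:
    """Keep only causes that directly conflict with another cause."""
    conflicting = set()
    for a, b in combinations(causes, 2):
        # Two requirements on the same package conflict when no version
        # satisfies both of them.
        if a.name == b.name and not (a.allowed & b.allowed):
            conflicting.update((a, b))
    narrowed = [cause for cause in causes if cause in conflicting]
    # If no pairwise conflict exists (e.g. a three-way conflict),
    # do not discard information: return the original causes.
    return narrowed or list(causes)


causes = [
    ToyRequirement("pkg_a", "numpy", frozenset({"1.21"})),
    ToyRequirement("pkg_b", "numpy", frozenset({"1.22", "1.23"})),
    ToyRequirement("pkg_c", "numpy", frozenset({"1.21", "1.22", "1.23"})),
]
print([cause.parent for cause in narrow_causes(causes)])
```

Here `pkg_c` has a liberal numpy requirement that is compatible with both of the others, so it is dropped from the causes, while `pkg_a` and `pkg_b` remain because no numpy version satisfies both.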
Alternative Solutions
Maybe I am misunderstanding the resolvelib/Pip architecture and there is a much simpler way to implement this. Please provide feedback if you think
Additional context
I have created this branch as a rough test of this approach. There are probably more places in resolvelib where this method could be applied to speed things up or give better quality messages:
- ~~https://github.com/notatallshaw/pip/tree/narrow-causes~~
- ~~https://github.com/pypa/pip/compare/main...notatallshaw:pip:narrow-causes?diff=unified~~
- https://github.com/pypa/pip/pull/12459
On Python 3.8 this command produces a ResolutionTooDeep error on Pip/main but solves quickly on my branch:
python -m pip install --disable-pip-version-check --dry-run --only-binary ":all:" "numpy==1.21.6" "cython==0.29.28" "scipy>=1.4.0" "torch>=1.7" "torchaudio" "soundfile" "librosa==0.10.0.*" "numba==0.55.1" "inflect==5.6.0" "tqdm" "anyascii" "pyyaml" "fsspec>=2021.04.0" "aiohttp" "packaging" "flask" "pysbd" "pandas" "matplotlib" "trainer==0.0.20" "coqpit>=0.0.16" "pypinyin" "mecab-python3==1.0.5" "jamo" "bangla==0.0.2" "k_diffusion" "einops" "transformers"
As this requires an addition to the resolvelib API, I am looking for buy-in for this approach from Pip maintainers before I start creating PRs across both projects.
Code of Conduct
- I agree to follow the PSF Code of Conduct.
About this issue
- State: closed
- Created 9 months ago
- Comments: 15 (15 by maintainers)
Small update, I’ve created the initial PR for resolvelib https://github.com/sarugaku/resolvelib/pull/145 (although it appears I need to do some work with tests).
I’ve created an experimental Pip branch that implements preferring conflict causes here: https://github.com/notatallshaw/pip/tree/prefer-conflicting-causes
I would like to try this pip resolver benchmark tool, but I don’t fully understand how it works yet; I am getting an error with the example benchmark: https://github.com/pradyunsg/pip-resolver-benchmarks/discussions/6
This looks amazing, will definitely give it a try and post some feedback over there.
This is a note to myself more than anything: this narrowing-causes approach can expose bugs in the resolution algorithm in an unexpected way, i.e. the resolution finishes and there are no causes to present. Without exception handling it emits:
When this is eventually added to Pip, this logic should instead emit an error to the user, probably asking them to report the problem on Pip’s GitHub issue tracker, and perhaps pointing them to a premade issue so that new issues aren’t constantly created if it turns out there’s a common case of Pip incorrectly reporting that resolution is impossible.
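The kind of guard described in the note above could look roughly like the following. This is only a sketch: `NoConflictCausesError` and `require_causes` are hypothetical names, not part of Pip or resolvelib, and the message wording is a placeholder.

```python
class NoConflictCausesError(RuntimeError):
    """Hypothetical error: resolution failed but no causes are left to report."""


def require_causes(causes):
    """Return the causes as a list, or fail loudly if there are none."""
    if not causes:
        # An empty cause list after a failed resolution most likely means a
        # resolver bug, so tell the user instead of failing obscurely.
        raise NoConflictCausesError(
            "Resolution was reported as impossible, but no conflicting "
            "requirements were found. This is probably a bug in the "
            "resolver; please report it on the project's issue tracker."
        )
    return list(causes)
```

Raising a dedicated exception type would let the caller distinguish this situation from an ordinary resolution failure and show a tailored message.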
No, and there’s no need to ask about that here. If there’s an update, a comment will be posted here or a cross link will be added by GitHub.