resolvelib: “Learn” to avoid trying routes that do not affect the dependency graph
pip install "apache-airflow[devel_ci]==1.10.14rc1" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1-10/constraints-3.6.txt"
This (run against https://github.com/pypa/pip/commit/fd8ddb6e55086500b1d836b07365ba7404f53e3b) would result in the resolver backtracking on setuptools, but these backtracks are obviously useless since setuptools does not have dependencies.
The resolver should be able to “look ahead” and see that trying other versions of setuptools would not change the conflicting graph, and skip working on them altogether. This applies not only to packages without dependencies (although they are the easiest target), but also versions that have the same dependencies. The idea of “the same dependencies” however require provider support (a new hook are_dependencies_equal or something) so those are more difficult to implement.
About this issue
- Original URL
- State: open
- Created 4 years ago
- Comments: 49 (33 by maintainers)
Haven’t been able to test it’s performance yet as it seems to be getting tripped up easily over non-compliant requirements, or at least things it thinks are non-compliant, of real world packages, I filed some issues:
Yes, I was a heavy users of apache-airflow when pip 20.3 was released which is what got me started trying to understand how pip’s resolver works. Why I’m using
apache-airflow[all]==1.10.13as a test case is I don’t believe anyone had got pip to ever run long enough to get to aResolutionImpossibleerror, so it has been my benchmark to see if this approach is helpful for “real-world” cases.Well in terms of static metadata from wheels I will point your attention to PEP 658 / 714 which is live for new packages on Pypi, and my understanding is will be backfilled: https://discuss.python.org/t/pep-658-714-are-now-live-on-pypi/26693
FYI my understanding is that Poetry’s resolution algorithm is based on PubGrub and while it works well for most cases, at least the way Poetry uses it it does have lots of real world situations that it gets stuck resolving for hours. But I don’t keep a super close eye on Poetry, maybe things have got a lot better. Either way they do have a CDCL SAT solver that works with Pypi, so if you haven’t perhaps worth looking at their implementation.
Hi @notatallshaw we’re definitely not sure yet, how to do it.
Obviously the fact that many Python packages ship static metadata in wheel files is a big plus for us, so the worst case solution would be to build some sort of static index that only works with wheel files.
However, as the name of CDCL implies, clauses can (and are) added on the fly already (learnt clauses). Potentially this can be generalized to support adding dependency clauses on the go as well.
PubGrub seems to have this figured out, and conceptually that solver is not so far away from ours (the key difference is that PubGrub quite strongly suggests using ranges for version constraints, while we don’t really care about the shape of the constraint).
I’ve also asked ChatGPT about this problem, and it came up with a bunch of existing “iterative” SAT solvers (ISAT, CryptoMiniSAT, or Glucose). Wether this is true I can’t judge yet, I looked at CryptoMiniSAT for a second, and it seemed to indeed support adding clauses on the fly.
This makes me hopeful that we could also find some theoretical background for iterative SAT solving 😃
Current status of this project is: working on the parts to retrieve metadata from PyPI’s API. Once we’ve wired things up will let you know about the progress!
Hey all, we’ve ported resolvelib to Rust (https://github.com/prefix-dev/resolvelib-rs) but for resolving conda packages still run into some performance issues that we don’t see with
libsolv. For that reason, we will probably spend some time to implement some version of CDCL inresolvelib-rs. I’ll update you here if it works out and gives performance benefits!Trying this command on my experimental version of Pip, as well as 21.3 and 21.2.4 I get the error for all three:
I’ve been mulling on this idea and I think I have devised a simple(ish) approach to implementing an analogous step to the “Non-chronological back jump” in CDCL and how that might be used to speed up resolution. Once #84 and https://github.com/pypa/pip/pull/10481 land I’m going to start experimenting and see if I can actually implement it and if there are really use cases it can improve.
By definition it will actually change the behavior of the resolver itself, not just give the downstream library the opportunity to choose a different path, so if I can implement it and show it helps in some cases it’s still probably going to take even more agreement that it’s a good approach.