poetry: Poetry is extremely slow when resolving the dependencies
- I am on the latest Poetry version.
- I have searched the issues of this repo and believe that this is not a duplicate.
- If an exception occurs when executing a command, I executed it again in debug mode (`-vvv` option).
- OS version and name: CentOS 7
- Poetry version: 1.0.0
- Link of a Gist with the contents of your pyproject.toml file: https://gist.github.com/qiuwei/a0c7eee89e5e8d75edb477858213c30b
Issue
I created an empty project and ran `poetry add allennlp`. It takes ages to resolve the dependencies.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 395
- Comments: 271 (42 by maintainers)
Commits related to this issue
- poetry: support Python 3.10 Bleak 0.13 adds support for Python 3.10 (and fixes major memory leak on Windows). Limiting Python version to <3.11 might also help speed up poetry update according to som... — committed to pybricks/pybricksdev by dlech 3 years ago
- Update how we write dependencies to speed-up poetry (#66) CI has been failing for the last 3 weeks due to poetry not being able to resolve dependencies in a reasonable amount of time / failing whil... — committed to archspec/archspec by alalazo 3 years ago
Yes, I’m running into the same problem. Resolving dependencies takes forever. I tried using a VPN to get through the GFW; nevertheless, it still doesn’t work. I also tried changing the pip source and writing a local source in the toml file; neither works. It’s driving me nuts.
First, it’s capitalized PyPI.
Second, there is no way for PyPI to know dependencies for all packages without executing arbitrary code – which is difficult to do safely and expensive (computationally and financially). PyPI is run on donated infrastructure from sponsors, maintained by volunteers and does not have millions of dollars of funding like many other language ecosystems’ package indexes.
For anyone interested in further reading, here’s an article written by a PyPI admin on this topic: https://dustingram.com/articles/2018/03/05/why-pypi-doesnt-know-dependencies/
No conflict. Poetry is slow as hell.
I’m currently using this workaround:
It takes less time to install the package locally since all deps are already installed. Make sure to run `poetry shell` first to enter the created virtual environment, so the install goes into it instead of the user/global path.

First of all, I want to say there is ongoing work to improve the dependency resolution.
However, there is only so much Poetry can do with the current state of the Python ecosystem. I invite you to read https://python-poetry.org/docs/faq/#why-is-the-dependency-resolution-process-slow to learn a little more about why the dependency resolution can be slow.
If you report that Poetry is slow, we would appreciate a `pyproject.toml` that reproduces the issue, so we can debug what’s going on and whether it’s on Poetry’s end or just the expected behavior.

@gagarine Could you provide the `pyproject.toml` file you are using?

Poetry being slow to resolve dependencies seems to be a recurring issue:
A personal Heroku app is not going to be as valuable a target as PyPI would be. Neither is a $10/month Heroku app going to be able to support the millions of API requests that PyPI gets every day. The problem isn’t writing a script to run a setup.py file in a sandbox, but the logistics and challenges of providing that for the entire ecosystem.
“It works 90% of the time” is not an approach that can be taken by the canonical package index (which has to be used by everyone) but can be taken by specific tools (which users opt into using). Similar to how `poetry` can use an AST parser for setup.py files, which works >90% of the time, to avoid the overhead of a subprocess call, but pip shouldn’t.

Anyway, I wanted to call out that “just blame PyPI folks because they don’t care/are lazy” is straight up wrong IMO – there are reasons that things are the way they are. That doesn’t mean we shouldn’t improve them, but it’s important to understand why we’re where we are. I’m going to step away now.
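For illustration, a minimal sketch of the AST approach described above (an assumption about the technique, not poetry’s actual code): statically read `install_requires` from a setup.py without executing it.

```python
import ast

def static_install_requires(setup_py_source: str):
    """Return install_requires if it is a literal list, or None if it
    cannot be determined without executing the file."""
    tree = ast.parse(setup_py_source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            for kw in node.keywords:
                if kw.arg == "install_requires":
                    try:
                        return ast.literal_eval(kw.value)  # only works for literals
                    except ValueError:
                        return None  # computed dynamically: would need execution
    return None

print(static_install_requires(
    'from setuptools import setup\nsetup(install_requires=["requests>=2.0"])'
))  # -> ['requests>=2.0']
```

The >90% figure is plausible because most setup.py files declare dependencies as plain list literals; the remainder compute them at runtime, which is exactly the case static analysis cannot cover.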
Maybe a stepping stone to a solution could be to add a flag to show some more info regarding dependency resolution - e.g. for each package, how long it took, and issues encountered/processes used. This would at least let us see where slowdowns are coming from, and potentially let us send PRs to other projects to provide better/more appropriate metadata?
Hey dudes - as Sebastian implied, the root cause is the Python ecosystem’s inconsistent/incomplete way of specifying dependencies and package metadata. Unfortunately, the PyPI team is treating this as a wontfix.

In particular, using the PyPI JSON endpoint, an empty dep list could either mean “no dependencies” or “dependencies not specified”. The PyPI team doesn’t want to differentiate between these two cases, for reasoning I don’t follow.

The solution is to work around this by maintaining a separate cache from PyPI that properly handles this distinction, and perhaps refuse to use packages that don’t properly specify deps. However, this latter aspect may be tough, due to deep dependency nesting.
Python’s grown a lot over the decades, and much remains from its early days. There’s a culture of no breaking changes at any cost.
Having to run arbitrary Python code to find dependencies is awful, but… we can do this once for each noncompliant package, and save the result.
this has gotten much worse recently, not sure what happened
Before you step away - can you think of a reason PyPI shouldn’t differentiate between no dependencies and missing dependency data?
If going through existing releases is too bold, what about for new ones?
Hi. I resolved my problem with slow dependency resolution (check which interpreter you’re using with `which python`).

If you have problems with numpy, psycopg, or other binary libraries on a Mac M1, research how to install that dependency on M1 (e.g. search “install <dep> m1”). Most solutions involve the command prefix `arch -x86_64`.
I’m on Fedora 34, send help
Resolving dependencies... (15134.3s)
Hi, PyPI admin here, just want to clear a few things up.
There are no rate limits for the JSON API or the Simple API. Additionally, our rate limits are just rate limits: they immediately return 429 when the limit is hit. They don’t slow down the request.
We haven’t implemented any rate limiting at the CDN layer (Fastly) and I have no reason to believe they are doing any rate limiting on our behalf without us realizing it.
I strongly suspect, due to the lack of similar complaints from other installers and other consumers of our JSON APIs (there are a significant number), that this is something specific to Poetry, but I don’t have any other ideas on what it might be.
I think the root cause is that Python’s been around for a while and tries to maintain backwards compatibility. I agree - `setup.py` isn’t an elegant way to do things, and a file that declares dependencies and metadata is a better system. The wheel format specifies dependencies statically in a `METADATA` file, but there are still many older packages that don’t use this format.

As a new language, Rust benefited from learning from the successes and failures of existing ones, i.e. it has nice tools like Cargo, docs, clippy, fmt, etc. It’s possible to implement tools/defaults like this for Python, but it involves a big change, and potentially backwards incompatibility. There are equivalents for many of these (`pyproject.toml`, `black`, etc.), but they’re not officially supported or widely adopted. Look at how long it took Python 3 to be widely adopted for a taste of the challenge.

It seems so. I have checked the detailed log; poetry kept retrying to resolve the dependency for botocore, but without success. So I assume that the dependency can eventually be resolved if enough time is given.
However, is there any way to get around this?
BTW, I also think it would be better to give a warning if some dependencies are not properly specified and could not be resolved after a number of attempts.
Disabling IPv6 on macOS fixed the issue: System Preferences > Network > Advanced > TCP/IP tab > set “Configure IPv6” to “Link-local only”.
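If you prefer the command line, the same setting can likely be applied with networksetup (a sketch; the service name “Wi-Fi” is an assumption and may differ on your machine):

```sh
# Set IPv6 to link-local only for the "Wi-Fi" network service
sudo networksetup -setv6linklocal "Wi-Fi"
# Restore automatic IPv6 later
sudo networksetup -setv6automatic "Wi-Fi"
```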
Just to add another data point to the conversation: running `poetry update` on many of our projects now takes > 15 minutes.

I understand that comparing pip and poetry installs is not an apples-to-apples comparison, and also that there are many variables outside poetry’s control - however it is hard to believe that 15 minutes to resolve a small number of dependencies is unavoidable.
I created a vaguely representative list of dependencies for our projects and put the identical deps in both a `pyproject.toml` (see https://gist.github.com/jacques-/82b15d76bab3540f98b658c03c9778ea) and a `Pipfile` (see https://gist.github.com/jacques-/293e531e4308bd4d6ad8eabea5299f57).

Poetry resolved this on my machine in around 10-11 minutes, while pipenv did the same in around 1-1:15 minutes. That is roughly a 10x difference.

Unless I’m missing a big part of the puzzle here, both pipenv and poetry are doing similar dependency resolution and are working from the same repositories, so there is no external reason the performance should be this different. It would be great to see this issue prioritised, along with some of the proposed fixes that are ready to merge, e.g. https://github.com/python-poetry/poetry/pull/2149
P.S. thanks for making an awesome tool, poetry has made our lives better since we started using it!
I noticed the same: a `pyproject.toml` environment was taking 5000+ seconds on local macOS vs 120 seconds on cloud-hosted Ubuntu. I’ve written up a detailed investigation here; here are two ways to alleviate your Poetry woes. `poetry update` makes a lot of API calls to PyPI and has to download certain packages to resolve their dependencies (sometimes for good reasons). Other points of note: I saw `poetry update` go from 100s to 600s.

Having read through most of this thread 2-3 times now, it sounds like a way forward would be to identify packages that don’t distribute a wheel, and fix them to include said wheel. It also sounds like it wouldn’t be completely perfect, but apparently would improve the situation quite a lot. https://pythonwheels.com/ has a short explanation of how to do this under “My package is white. What can I do?”.
I’m wondering how hard it would be to write a little bot that creates an automated PR (focusing on packages whose code is on github for now) with a fix that includes a wheel in the distribution. In cases where the fix is too hard to create automatically, the bot could open an issue with links to good explanations about how to improve the package and why it matters. Has anyone tried something like this already?
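For maintainers who want to do this by hand, a minimal sketch (assuming a standard setuptools-based project; not specific to any package mentioned here):

```sh
pip install --upgrade build twine  # PyPA's build frontend and upload tool
python -m build                    # writes the wheel and the sdist into dist/
twine upload dist/*                # upload both the wheel and the sdist to PyPI
```

Pure-Python projects only need the single `py3-none-any` wheel this produces; projects with C extensions need one wheel per platform.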
geopandas seems to take a particularly long time. This was on a new project as the first dependency:
Hi,
I would like to invite everyone interested in how dependencies should be declared to this discussion on python.org
fin swimmer
I’m new to (more serious) Python and don’t understand the big drama. Yet `setup.py` seems a powerful and very bad idea. Dependency management is terrible in Python because of `setup.py`?

Can someone post a couple of examples where a txt file is not enough and `setup.py` was absolutely necessary?

Cargo does it like this: https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#platform-specific-dependencies - is this not enough for Python?
Why doesn’t poetry create its own package repository, avoiding setup.py and using its own dependency declaration? It could take time… but a bot could automate pull requests for most Python modules, based on the kind of techniques used in https://github.com/David-OConnor/pydeps
It’s not as tough as you imply.
You accept some risk by running the arbitrary code, but accepting things as they are isn’t the right approach. We’re already forcing this on anyone who installs Python packages; it’s what triggers the delays cited in this thread.
I have the above repo running on a $10/month Heroku plan, and it works well.
I’ve made the assumption that if dependencies are specified, they’re specified correctly, so only check the ones that show as having no deps. This won’t work every time, but does in a large majority of cases.
Related: projects like Poetry are already taking a swing at preventing this in the future, by specifying deps in `pyproject.toml`, `Pipfile`, etc.

The master branch (and therefore the upcoming 1.2.0 release) puts a timeout on all requests.
That’s not going to solve anyone’s connectivity problems for them, but should make it clearer when that is what the problem is. Better, anyway, than hanging indefinitely.
I used to pick Poetry over pipenv because it installed dependencies WAY FASTER. Sadly, this has changed over time, and now I’m getting unresponsive, endless installations over and over again (the second time in two days):
Yes, I’m aware of https://python-poetry.org/docs/faq/#why-is-the-dependency-resolution-process-slow, and I don’t think that letting users wait an indeterminately long period of time is the correct response to this challenge.
I would prefer to be informed that Poetry saw some issues with the deps I would like to install, and to be able to opt out of this indefinite version guessing. Otherwise, Poetry becomes effectively unusable.
I worked around this issue by disabling IPv6 as explained here: https://stackoverflow.com/questions/50787522/extremely-slow-pip-installs
Tried out the newest release! Much much much faster now!
Well, “slow” is an understatement. I left `poetry update` running overnight and it’s still going at 100% CPU and using 10.04 GB of memory:

That’s the extent of it - it’ll install sub-dependencies, but whichever one you install last wins. Tools like Poetry, Cargo, npm, pyflow, etc. store info about the relationships between all dependencies and attempt to find a solution that satisfies all constraints. The particular issue of this thread is that the Python ecosystem provides no reliable way of determining a package’s dependencies without installing the package.
@finswimmer I checked the discussion. Seems like they are reinventing the wheel instead of copy/pasting something that works (Composer, Cargo, …).
For sure requirements.txt is not good.
Yes. But why make poetry if not to replace PyPI and requirements.txt?
If poetry is compatible with PyPI, there is no incentive to add a pyproject.toml. Perhaps I don’t even know I should add one. Now if, every time I try to install a package that has no pyproject.toml, the command line offered to open an issue on that project with a ready-to-use template, this could speed things up.
This fixed my issue:
poetry cache clear --all pypi
Here is my attempt at a minimal reproducible example.
I cleared the poetry cache:
poetry cache clear --all pypi
I created a new poetry project:
poetry new speed
I added these lines to the end of `pyproject.toml`:
I ran `py-spy record --idle --threads --subprocesses --format flamegraph --output flamegraph.svg poetry update -- -vvv` multiple times, canceling each time after several seconds of inactivity and saving each successive run to a different flamegraph image.
For all of us using private repositories, I found issue #4035 and made a PR for it in #4353. Even with @ls-jad-elkik’s suggestion, dependency resolution is still slow, since poetry will still check both PyPI and any `secondary = true` repositories for each package.

Pip doesn’t have dependency resolution.
You seem to be confusing multiple parts of the ecosystem. I would distinguish these entities:

Under the hood, I think, poetry uses a couple of those base tools. It is just meant to present a more consistent interface to the user.
I found that `poetry update` was being very slow, especially when I defined a secondary source. I only wanted to use the secondary source for my company’s proprietary packages. I think (not sure) that poetry is calling out to the secondary source when not necessary.
My setup was, e.g.:
https://github.com/python-poetry/poetry/blob/a3aafa840de950f81e99553e52190a54c94d6bce/docs/cli.md
(but not documented here: https://python-poetry.org/docs/dependency-specification/ )
Explicitly defining the source per package seemed to help a lot, although it’s unfortunately verbose; a sketch follows.
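Something along these lines (a hypothetical sketch; the source name and URL are made up, and `secondary = true` follows the configuration discussed above):

```toml
[[tool.poetry.source]]
name = "company"                          # made-up name
url = "https://pypi.example.com/simple/"  # made-up URL
secondary = true                          # consulted in addition to PyPI

[tool.poetry.dependencies]
internal-lib = { version = "^1.0", source = "company" }  # resolve only from "company"
```

Pinning the source per package means poetry doesn’t have to ask every configured repository about every dependency.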
Hopefully, this will be a help to someone. A colleague of mine last week started having issues with one of our projects where poetry would never finish resolving deps. We tried uninstalling poetry and then installing via brew. We then tried removing the virtual environment instance associated with the project but no luck.
We eventually wondered if it might be a local caching issue. We ended up running the following command to blow out the pypi cache.
poetry cache clear --all pypi
Once it was cleared he was able to perform poetry update and poetry install again without issue. I can’t guarantee it will help everyone, but hopefully it’ll be useful to some on this thread.
We’ve been having issues with slow resolution, and I’ve looked into it a bit deeper. Seems to be (partially) caused by the private registry being a primary source, and (partially) by a bad cache.
Poetry version:
I did a cProfile of `poetry update -vvv` and saw where the bottleneck was.

The `pyproject.toml` I’m testing with has quite a few packages, some of them public, others from the private repo. By making the private repo `secondary = true`, it cut the time from 70.x seconds down to 33.x seconds. And that’s because:
- with `secondary = false`: `LegacyRepository._get_release_info` was called 50 times (30k ms total)
- with `secondary = true`: `PyPiRepository._get_release_info` was only called once (422 ms total)

At first glance it looks like the caching config is very different between `LegacyRepository` and `PyPiRepository`. The size of the `pypi` cache was in the 50MB range, while our private repo had a cache in the range of a few hundred KBs.

The other 30s are spent in `pool.find_packages (100 times) -> LegacyRepository.find_packages (50 times) -> _get (50 times)`, called regardless of `secondary`.
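For anyone wanting to reproduce this kind of measurement, a rough sketch (not necessarily the script from the gist mentioned below; the `poetry` entry script is itself a Python file, so cProfile can run it directly):

```sh
python -m cProfile -o poetry.prof "$(command -v poetry)" update -vvv
# print the 30 most expensive call sites by cumulative time
python -c "import pstats; pstats.Stats('poetry.prof').sort_stats('cumulative').print_stats(30)"
```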
Here’s the gist with the profiling script I was using. Can’t share the `pyproject.toml` because it would need private credentials anyway.

If the issue is that certain Python libraries do not specify their requirements, could poetry have some command/option to shed some light on which libraries are creating the problem? This would help developers either:
Poetry took forever to install a blank project; the only things in pyproject.toml were the Python version and pytest. Then I upgraded pip from version 20.2.3 to 21.1.2. Now poetry install finishes in a flash. I can stop pulling out my remaining strands of hair…
(also: pypi ¯\_(ツ)_/¯)

@joannayoo0117 @thesofakillers Try poetry 1.2.0rc1 if you haven’t already. Our lock times went from hours to minutes with poetry 1.2.0b3 and 1.2.0rc1. You can wait a bit until the final release of 1.2.0 is out, which should happen next week. Beware that, in general, you can’t use poetry 1.1 with `poetry.lock` files created by poetry 1.2. We are also setting `installer.max-workers` to `1`.
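For reference, that setting is applied with a command like this (sketch):

```sh
poetry config installer.max-workers 1
```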
Hey, I had this issue. Solving dependencies ran for over 1000 seconds without finishing when trying to add a dependency. Running `poetry add <dep> -vvv` showed that it was having issues resolving boto3 - it was trying every minor version. I installed the same boto3 version that <dep> expected, and it worked fine. Maybe this helps someone out there.
What I found was that a corrupted Poetry cache (probably due to Ctrl-C’ing something `poetry` was doing mid-operation) caused dependency resolution to hang forever.

To fix it I cleared the cache:
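Presumably with the same command quoted elsewhere in this thread:

```sh
poetry cache clear --all pypi
```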
Also, running `poetry` commands with `-vvv` is very useful to see if `poetry` is still actually working on resolving dependencies or is hung up.

@abn Sure thing! I just reran the setup from scratch with the alternative profiler output format. speedscope.json.txt speedscope2.json.txt speedscope3.json.txt speedscope4.json.txt speedscope5.json.txt speedscope6.json.txt speedscope7.json.txt logSpeedScope.txt
I’m having these issues:
- `poetry self update` gets stuck (I know I’m using the latest, but wanted to try just in case)

Things tried that didn’t change anything:
- I had the `repositories.testpypi` config set to `https://test.pypi.org/legacy/`; I removed it just in case.
- Running with `-vvv`, the only difference is that `poetry install` shows a few packages that it skips, then it hits the first one it wants to install and gets stuck.

What did provide a different result:

After that the new results are:
- `poetry install`: worked like a charm and super fast.
- `poetry add openpyxl`: apparently stuck at resolving dependencies on the first try; tried again after some minutes, and it resolved in 8 seconds and the package installed successfully.
- `poetry self update`: now immediately answers “You are using the latest version”.

Great! I don’t understand the issue, but I can finally continue developing, thanks to everyone suggesting workarounds in this thread.
After that I switched back to ethernet and tried:
- `poetry install`: says no new dependencies to install, fair enough.
- `poetry update` (I don’t really need it, but it won’t harm, and for the sake of debugging this issue): after 163 seconds trying to resolve dependencies, I cancelled.
- `poetry update` again: after 60 seconds I cancelled.
- `poetry update -vvv`: worked like a charm and super fast.

I hope this helps in some way, thanks everyone for the efforts.
Clearing the cache is most likely related to clearing out downloads that are partial, incomplete, or corrupted by concurrent usage, which can cause an indefinite hang.
I can confirm https://github.com/python-poetry/poetry/issues/2094#issuecomment-1070340891

My “workaround” is frequent retrying: run with `-vv`, see if it hangs “too long”, then cancel and retry.

Same problem here on poetry 2.13.0 and Ubuntu 20.04.3 LTS. Adding `jupyter` takes forever.

I wonder if maybe some of you are finding yourselves in a situation similar to this:
For a project with few dependencies this fails quickly, but I could also imagine that with lots of dependencies it could be resolving for what seems like forever. Might be totally unrelated, might be a false lead. But maybe it’s worth looking into this…
In your `pyproject.toml`, set the Python requirement to a fixed major-minor version and see if it helps the dependency resolution - for example, instead of `>=3.6` or `^3.6`, etc. A hypothetical sketch:
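```toml
[tool.poetry.dependencies]
python = "~3.10"  # fixed major-minor, instead of ">=3.6" or "^3.6"
```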
[I am not a maintainer]
My recommendation to pin or restrict the version ranges of some dependencies, or even some indirect dependencies, is just a workaround to help the dependency resolution algorithm in cases where it is struggling to find a suitable combination of distributions to install. If/when poetry’s dependency resolution gets better those version pins and restrictions could probably be removed.
Otherwise, there are some things that maybe help (or maybe not, hard to tell, since there are so many cases presented here, and I’m not even sure they are all due to dependency resolution):
- Make sure the `python` restriction in your `pyproject.toml` is compatible with your dependencies (I think I remember seeing quite a few cases where it would lead to unsolvable dependencies; I could try to find those again to show you what I’m talking about)

@cglacet
Yes. That would be PyPA. They know all about these kinds of issues. They are actively working on solving them. These things take time. There is no need to lobby. There is a need to participate, by writing good documentation and good code. And most important of all, donate to fund developers to work full time on it.
This is a part of the work, yes. Once this rolls out, PyPA will be able to move on to solving other packaging issues. This work was partly done thanks to financial grants (money donations).
You can read more about related, ongoing work (these links are only a short, semi-random selection, but they are all somewhat intertwined):
Yes. From my point of view, the issue is that the overwhelming majority of advice found on the internet (articles, blogs, StackOverflow answers, etc.) is either outdated, misguided, or plain wrong.
A good reference is this website (from PyPA itself):
If you follow poetry’s workflows you are already in very good hands, and you should not worry about anything too much. Upload wheels! Well, you need to upload both sdists and wheels. The sdists are still very important, do not forget them.
Yes, it is also doing a very good job at getting rid of outdated, bad practices.
[Sadly somehow, there are always users pushing for poetry to adapt to their own broken workflows, instead of users changing their habits for the clean workflows of poetry. It is a constant battle.]
Yes, this was another great step forward. Python packaging ecosystem is improving a lot these days.
And yes, exactly, a great hurdle is keeping compatibility with older projects. This slows down the work a lot. In particular, older, broken setuptools/distutils `setup.py`-based projects are very problematic, although it is nowadays entirely possible to write clean, well-behaved, and standards-conforming setuptools-based projects.

[I am writing this off the top of my head, according to the bits of info I have gathered here and there along the way. I do not have insight into all the processes involved, so there might be some inaccuracies. Feel free to correct me. Feel free to ask me for clarifications.]
Hi all,
This issue has gotten quite long and meandering, with many disparate causes, solutions, fixed issues, perhaps still extant bugs, and many “me too” comments all discussed.
I’m going to close this issue as most of the root causes discussed within have either been solved in 1.2 or 1.3 (the changes in 1.3 are behavior changes not eligible for backport).
If you are having issues with Poetry taking a long time to resolve dependencies, please first open a Discussion or start on Discord, as many of them are related to configuration and large search space (basically, you’re creating exponential work for the solver and should tighten your constraints). Tooling to advise the user should be possible (if difficult to develop) in the long run, and anyone interested in tackling this hairy problem should reach out to the team via a Discussion or on Discord.
Past that, please make sure to test with the latest code (both on the 1.2 branch and master branch presently) when trying to reproduce resolver issues as we are making improvements all the time, and your issue may be fixed and pending release already.
Finally, good reproductions are needed for this category of issue. Many times they are related to transient network issues, pathologically bad cases due to decisions made around (low traffic, private) custom package indexes, or a corrupted cache/bad local config. Reproducing in a container with publicly available packages will mean that someone can dissect your issue and possibly fix it. If you can’t reproduce it with public packages, but you can with private packages, there are still options – everything from sharing details with the team in private, to creating ‘imitation’ repositories to reproduce an issue.
Please refrain from commenting on this issue more if it’s not a mitigation/common solution – “me too” is not very helpful and will send dozens of emails, and if you can reproduce bad performance consistently/in a clean environment it should be an issue. If you’re stuck and need help, ask for support using one of the methods mentioned above.
Hi everyone,
inspired by the quoted comment, I wrote a “one-line” docker command to perform `poetry lock` from within a container. For me this makes a huge improvement, going from occasionally >1 hour when I run `poetry lock` directly on my M1 Mac shell, down to consistently sub-90 seconds when run inside the container on the same machine.
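The shape of the command is something like this (a reconstruction; the image tag and flags are assumptions, not the original command):

```sh
docker run --rm -it \
  -v "$(pwd)":/workdir -w /workdir \
  python:3.10-slim \
  sh -c "pip install poetry && poetry lock"
```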
It is to be run from the same folder where the `pyproject.toml` and `poetry.lock` are located. This is a base command for public dependencies only. You can also e.g. resolve relative local dependencies by mounting them into `/workdir`, or use private repositories by passing the authorization config via environment vars.

I can only speculate why this works. Maybe by making poetry believe it runs on an ARM64 Linux platform instead of a Mac, it is actually able to pull and analyze pre-built wheels in the majority of cases, instead of installing and building from source, or something like that?
PS: A proper `poetry install` can only be done on the bare target machine - not from within a container, AFAIK - but that’s not the slow bit in my case.

This helped me: `poetry cache clear --all .`. Then re-run `poetry update -vvv`. I assume some faulty/interrupted download got in the way.
. I assume some faulty/interrupted download got in the way.I think what might at least be useful here is some tooling to at least help you find out which packages are the ones causing the issues, so you can move those specific ones out. Is there any way to do an analysis? The output of
poetry lock
is very spartan. Right now I’m just adding dependencies one by one and hoping I hit upon the slow one, which looks similar to your approach @john-sandall .@finswimmer What’s the process for identifying a/the bottleneck in dependency resolution? If there’s not something already, it would be great to have a built-in tool/flag that would be able to give us feedback on slow-to-resolve packages.
I’d happily spend the time needed talking to, working with, and submitting PRs to projects that are slowing things down for my team, if I had a good way to identify.
Things that helped on my end were to update poetry, clear the cache, and set an upper limit on the Python version:

poetry self update
poetry cache clear --all .
poetry config experimental.new-installer false

[tool.poetry.dependencies]
python = ">=3.9,<4.0"

Then create a debug log to confirm fewer iterations of dependency checks:

poetry update -vvv | tee ./poetry_update.log
@abn Sure, I’ll submit a new issue soon. I managed to get it working; `poetry update` now runs in 22 seconds.
now runs in 22 seconds.I just noticed that the issue for me seemed related to using boto3 without specifying a version in the package I was importing. So I had package A that I built using poetry with
boto3 = '*'
. That did seem to resolve fairly quickly. But when I tried to import package A into a new package, B, it took >10 minutes to resolve (if it would ever finish). I specified the version used by package A for boto3 in package B, and it resolved my dependencies in < 30 seconds.In my case the slow dependency resolution in Poetry was related to an IPv6 issue (also see this related answer on StackOverflow). Temporarily disabling IPv6 solved it. On Ubuntu this can be achieved using the following commands:
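A typical pair of commands for this (an assumption, since the exact commands aren’t shown here; the setting resets on reboot):

```sh
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1
```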
@tall-josh Because Poetry had (maybe still has?) a bug where corrupted cache entries make Poetry hang forever when trying to resolve dependencies.
For me this seemed to occur if I Ctrl+C’d Poetry while it was doing an install and it was downloading packages.
I observed this on 1.1.14, so perhaps it’s fixed in 1.2+.
@tall-josh Have you tried disabling ipv6? It solved it for me. Not sure if it works for private pypi though.
A lock died after ~10 hrs overnight:
My config:
N.b. `stuff.work.lan` is a generally reliable Artifactory instance inside my corporate firewall. Its name has been changed for public posting, obvs. I’m connecting over a VPN but can pretty normally hit 15 MB/s download with bursts up to ~25 MB/s, so throughput isn’t a culprit.

I’m trying the cache clear with `poetry cache clear --all python-public-proxy` and rerunning with `-vvv` and capturing the output. Peeking at it, I’m seeing numpy a lot.

Edit: Clearing the cache seemed to be effective. The captured run took about 10 minutes.
I temporarily disabled my IPv6 to work on the project with poetry. In my case, I am developing in a Linux WSL2 environment, and I followed the tips on this site to disable it:
https://itsfoss.com/disable-ipv6-ubuntu-linux/
@abn I was curious too and tried to do this, but in the meantime IPv6 suddenly became supported on my connection and I can no longer reproduce the issue.
For anyone who still has this issue it would be very interesting to see a flamegraph created by:
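Presumably the py-spy invocation quoted earlier in the thread:

```sh
py-spy record --idle --threads --subprocesses --format flamegraph --output flamegraph.svg poetry update -- -vvv
```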
I have removed `jupyter` and `ipykernel` from `pyproject.toml` and just installed those packages using `conda`, including them in my `environment.yml` file. Everything works right now!

Make sure that `poetry --version` is `1.1.12`, because some older version that I had was funky when installing `numpy`.

Same problem here. Sometimes it takes seconds, sometimes very long. It’s not related to specific packages. I am experiencing this issue even when I start a new project from scratch with `poetry init`. But not every time. Really annoying. See https://github.com/python-poetry/poetry/issues/4855 for more.

It went away by itself. I believe it was a problem with my ISP…
Maybe make sure you are getting a valid ipv6 address? I wasn’t getting ipv6 because network manager was conflicting with dhcpcd. Disabled dhcpcd and I could use dhcpv6.
Sometimes you can monitor SNR, TX, RX values for your modem. Check if those metrics are healthy.
DNS can also be a problem. Maybe try different DNS servers. I recommend trying to rollout your own with pihole + unbound but that setup is more involved and may not help you with your problem with pypi
I found my problem was with the internet connection via WSL2. I changed the nameserver in the `/etc/resolv.conf` file to 8.8.8.8, and Poetry (version 1.1.11) was normal again. https://github.com/microsoft/WSL/issues/5420#issuecomment-646479747

I found that this occurred when I tried to add pandas to a pyproject with python = “^3.7.9”.
I think what was causing the issue was numpy (which pandas installs). When I tried to add numpy separately, I got the warning `numpy requires Python >=3.7.9,<3.11, so it will not be satisfied for Python >=3.7,<4.0.0`.

Once I switched my pyproject python to python = “>=3.7.9,<3.11”, this slowness immediately went away.
Correction: pip used to “not do dependency resolution”, and it often didn’t work well; that was probably one of the main reasons why people wrote poetry and migrated to it.
pip has had an actual dependency resolution algorithm for some months now; I can’t find the actual release date and version number. Looks like it’s `20.3` (2020-11-30):

What if we create a service from pydeps which can take millions of requests from around the world, then make `poetry` use that service to resolve dependencies?

If you are looking for projects to help contribute to that don’t yet have wheels, this site lists the top 360 packages, a handful of which don’t have wheels: https://pythonwheels.com/
@cglacet
1. Yes. True for both wheels and sdists. They have to be downloaded. Although there is some ongoing work that would result in the possibility to skip the download for the wheels.
2. Yes and no. True for both wheels and sdists: these archives have to be “opened” and some files have to be read to figure out if there are dependencies and what they are. But this is not the part that is slow. The slow part is that for sdists (not for wheels), just opening the archive and reading some files is not enough; those files have to be built (executing the `setup.py`, for example), and in some cases a resource-intensive compilation step is necessary (C extensions, for example, need to be compiled with a C compiler, which is usually the very slow bit of the whole process).

As far as I know, there is, and subsequent dependency resolutions for the same projects should be faster (download and build steps can be partially skipped). The wheels built locally in previous attempts are reused.
Yes, a bit off-topic, but I believe it is helpful for the next users wondering about the slow dependency resolution to read some insight into why.
Some good reading I could find on the spot:
Update:

Thinking about it more, I realize I might have mischaracterised things. Getting the metadata (including dependency requirements) out of an sdist does not require compiling the C extensions (`setup.py build`). It should be enough to get the egg info (`setup.py egg_info`).

As far as I know: if your project and all its dependencies (and their dependencies) are available for your platform (Python interpreter minor version, operating system, and CPU bitness) as wheels, then it is the best-case scenario, because in a wheel the dependencies are defined statically (no need to build an sdist to figure out what the exact dependencies are, for example).
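To make the egg-info route above concrete, a rough sketch (package name and version are placeholders):

```sh
tar xf somepkg-1.0.tar.gz && cd somepkg-1.0
python setup.py egg_info     # writes somepkg.egg-info/ without compiling C extensions
cat *.egg-info/requires.txt  # the declared dependencies, one per line
```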
You, as the developer (maintainer) of a project, the best you can do to help lower the difficulty of dependency resolution for everyone else is to distribute wheels of your project for as many platforms as possible (upload the `.whl` files to PyPI). Often projects (libraries, applications) are made of pure Python code (no C extensions, for example), so just one wheel is enough to cover all platforms.

@David-OConnor, what’s your suggestion for resolving things in the immediate term? How can I determine which package is causing the slowdown? I am more than happy to make a PR to whichever project that is, but as it is now, any change to pyproject.toml takes upwards of 20 minutes. When I run with `-vvv`, I see `1: derived: pyyaml (^5.3.1)` as the last line before it hangs for several minutes, but I would assume you are doing installation asynchronously or something.

I figured I would add more to this issue. It’s taking more than 20 minutes for me:
This is the pyproject.toml:
It’d be more productive to file an issue on https://github.com/pypa/warehouse, to ask this. There’s either a good reason, or PyPI would be open to adding this functionality. In the latter case, depending on how the details work out, it might need to be standardized like pyproject.toml was before poetry adopted it, so that the entire ecosystem can depend on and utilize it.
Reducing the search space is not a workaround, but a common issue (as mentioned in my comment above). That being said, the versions you have written should be equivalent (`^3.10.0` and `^3.10`) – any differences you noticed with that change are likely unrelated.

@kache, it appears to search through dependencies depth-first, rather than breadth-first. As a result, you’ve probably got something earlier in your pyproject.toml that depends on ddtrace, so the dependency resolver grabbed that version and tried to resolve using that, rather than the ddtrace version you’ve specified.
I’ve had some success moving the dependencies I want exact-version logic for earlier in the pyproject.toml file, so they are prioritized.
I also disabled IPv6, upgraded to poetry 1.2.x, and reduced the possible space for the troubling AWS libraries (boto3 and awscli, for me), so those go at the very end of my dependency file and have only a few recent versions to chew through.
I’m seeing dependency resolution time between 5 and 35 seconds most of the time now.
OK thanks. My workaround is to use `poetry add` for every package I have instead of `poetry install`.

Weird behaviour:
- `poetry install`: 30mn+ dependency resolution
- `poetry add`: I did a basic script that adds every package sequentially. It takes around 5-6mn to resolve dependencies for the same toml used before.

I can’t understand the issue here… Might be a caching problem?
I’m on a work machine where I don’t think that’s an option for me. I can try the resolution on my local machine with IPv6 disabled and see if that helps though. Cheers.
Update: I tried disabling ipv6 but it did not seem to have an effect 😦
Hello,
For PyPI packages poetry works well, but at my work we use CodeArtifact, and that is where the issue lies.
I added something like
and then mine started going very slow, almost like all the caches stopped working. The flame graph is full of

Is it possible that the cache headers on my private repo are wrong? But if I set it to secondary, I would assume that if it found the package in the primary it would not even use the private repo.
@teichert Nice work pulling these together! The giant chunks of time in read() from ssl.py could point to time spent waiting for bytes from an SSL socket that for some reason isn’t sending any data, right?
Wow, this solved it for me. Thank you!
Running poetry inside an Ubuntu container (see the answer by @sekalkowski) worked perfectly. Looks to be an issue related to macOS (I have an M1 chip).
This answer in the FAQ is misleading, as this doesn’t seem to be an error with PyPI’s API: https://python-poetry.org/docs/faq/#why-is-the-dependency-resolution-process-slow
It doesn’t feel like latency to me, I was doing tests using fiber connection and through ethernet, my latency to pypi.org is around 14ms and super stable.
I want to add some more information to my tests: I’m also installing lots of packages every time I build a docker image or use tox for running the tests. Both things happen very often (on a daily basis), and we’re using poetry in these builds as well. This issue didn’t show up for either the docker or tox building processes.
I was able to work around the dependency resolution at least by just running `poetry update` over and over and over again (ran with `-vvv` to examine output). Seems like the ‘non-deterministic’ behavior described above might just be the installer making a little more progress each time.

Interestingly, even after dependencies were resolved, the installer still got hung up on ‘pending…’ for several numerical libraries. Looks like something going on with gcc or zlib or some other Linux package.

Tried setting `poetry config installer.parallel false` at some point… not sure if that is helping or hurting. Just upgraded to Python 3.10 using pyenv on WSL 2.
I noticed that the workflow is single threaded. Any chance we could use all threads available, or at least half, for parsing?
@harrybiddle see this discussion: https://github.com/pypa/pip/pull/10258.
I’m facing the same issue here (recently changed my ISP)
I think I’m waiting far too long for poetry to finish.
Hello @lassepe,

thanks for sharing your experience. The bottleneck for the dependency resolution there (and I saw this in other cases before) is the dependency `awscli`. `awscli` doesn’t provide any information about its dependencies via PyPI’s JSON API, so poetry needs to download the package, extract it, and parse the information. Furthermore, there are plenty of versions, and it seems that every version has different dependencies. This seems to be the worst case for building a dependency tree.
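You can see this for yourself via the JSON API (a sketch; `requires_dist` is null when no dependency metadata was uploaded):

```sh
curl -s https://pypi.org/pypi/awscli/json |
  python -c "import json, sys; print(json.load(sys.stdin)['info']['requires_dist'])"
```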
awscli
would provide the dependency information via the API.Anybody else who comes here with an absurdly long (upwards of 1 hour) dependency resolution, it’s probably because you are being prompted for the passphrase to an ssh key. The timer display keeps overwriting the last line in the terminal, such as the ssh password prompt. Solution: make sure any keys you need are added to ssh-agent.
Here’s a pretty good summary of why pip and requirements.txt isn’t sufficient for many use cases:
https://modelpredict.com/wht-requirements-txt-is-not-enough
Those numbers are very surprising. There does not seem to be anything particularly difficult happening here. It does not seem to be the dependency resolver that is struggling (doing lots of back-tracking or things like that). I wonder what it could be… Maybe open a separate bug ticket, I feel like this issue might be different than the rest of the thread.
Guess this is the root cause for my case too. I have set up a proxy for PyPI; it still runs slow (with the -vvv option on, I can see it’s progressing), and I know how it works now. A lot of files need to be downloaded before the deps are resolved.

Hope poetry can support mirrored pip sources, so the performance issue will be solved.
@cglacet
Uploading a `poetry.lock` to CI will avoid resolving dependencies.

The discussion on python.org might be interesting for some as well: Standardized way for receiving dependencies
Yep. Anecdotally, if deps are specified at all on PyPI, they’re probably accurate. If not, it could mean either deps aren’t specified, or there are no deps.

PyPI not fixing this is irresponsible. Yes, I’m throwing spears at people doing work in their free time, but this needs to be fixed, even if that means being mean.
@David-OConnor Is there a technical reason for that? Isn’t it possible to check whether a package correctly specifies its dependencies?
Discussions here are quite interesting for a noob like me. I have a very naïve question, though. Are packages built/published with poetry “correctly specifying their dependencies”? In other words, imagine I only add packages built with poetry: will the resolving phase be lightning fast? Or will this still apply:

I first came here because I thought 40s for resolving a single package’s dependencies was slow, but when I see minutes and hours on the counter I suppose it is normal.
I guess it’s not a good idea to use poetry for creating docker images (in CI pipelines for example)?
Another example: adding `black` took 284 seconds. Unfortunately I can’t share the `pyproject.toml`.
.A pypi mirror with faster network access in mainland China.
Following @lmarsden’s suggestion, I managed to speed up the process by making sure that sites/servers that prefer IPv4 use IPv4. On Ubuntu I modified `/etc/gai.conf` by removing the `#` (uncommenting) from the following line:

precedence ::ffff:0:0/96 100
I realise that now, as I mention in #2338. I’m therefore not that interested in poetry at the moment. I thought it was like Composer and https://packagist.org, but it looks mostly like a wrapper around different legacy tools.