pipenv: pipenv install is very slow
Running `pipenv install` after changing one dependency takes about 5 minutes for me, on a Windows 10 machine with an SSD.
The vast majority of that time is spent inside `Locking [packages] dependencies...`
It seems like there might be some quadratic-or-worse complexity in this step?
I’ve included most of our Pipfile below, but I had to remove some of our private-repo dependencies:
[[source]]
url = "https://pypi.python.org/simple"
verify_ssl = true
[packages]
alembic = "==0.8.4"
amqp = "==1.4.7"
analytics-python = "==1.2.5"
anyjson = "==0.3.3"
billiard = "==3.3.0.20"
braintree = "==3.20.0"
celery = "==3.1.18"
coverage = "==4.0.3"
docopt = "==0.4.0"
eventlet = "==0.19.0"
flake8 = "==3.0.4"
Flask-Cors = "==2.1.2"
Flask-Login = "==0.3.2"
Flask = "==0.12.1"
funcsigs = "==0.4"
fuzzywuzzy = "==0.12.0"
gcloud = "==0.14.0"
html2text = "==2016.9.19"
itsdangerous = "==0.24"
Jinja2 = "==2.8"
jsonpatch = "==1.15"
jsonschema = "==2.5.1"
PyJWT = "==1.4.2"
kombu = "==3.0.30"
LayerClient = "==0.1.9"
MarkupSafe = "==0.23"
mixpanel = "==4.3.0"
mock = "==1.3.0"
nose-exclude = "==0.4.1"
nose = "==1.3.7"
numpy = "==1.12.1"
pdfrw = "==0.3"
Pillow = "==4.1.0"
pusher = "==1.6"
pycountry = "==1.20"
pycryptodome = "==3.4.5"
pymongo = "==3.2"
PyMySQL = "==0.7.4"
python-dateutil = "<=2.5.1"
python-Levenshtein = "==0.12.0"
python-magic = "==0.4.6"
python-coveralls = "==2.9.0"
pytz = "==2015.6"
raygun4py = "==3.1.2"
"repoze.retry" = "==1.3"
requests = "==2.8.1"
sendgrid = "==2.2.1"
slacker = "==0.7.3"
SQLAlchemy-Enum34 = "==1.0.1"
SQLAlchemy-Utils = "==0.31.6"
SQLAlchemy = "==1.1.9"
typing = "==3.5.2.2"
twilio = "==5.6.0"
Unidecode = "==0.4.19"
voluptuous = "==0.8.11"
Wand = "==0.4.4"
watchdog = "==0.8.3"
Werkzeug = "==0.12.1"
wheel = "==0.24.0"
WTForms = "==2.0.2"
xmltodict = "==0.9.2"
zeep = "==0.24.0"
About this issue
- State: closed
- Created 7 years ago
- Reactions: 69
- Comments: 108 (53 by maintainers)
Commits related to this issue
- Locking down packages manually until pipenv matures enough Sadly due to several issues with Pipfile.lock, it seems to either hang or take forever as seen here https://github.com/pypa/pipenv/issues/22... — committed to Mathspy/tic-tac-toe-NN by Mathspy 6 years ago
Why are all the issues on this topic closed? I can’t `pipenv install` a single thing due to the lock-step hang.
Right now, for me it is not slow, it is freezing… A `pipenv install my_package` or a simple `pipenv install` gives me no output after 20 minutes.
EDIT: Confirmation, still nothing after a few hours. Is it the same problem? It used to be slow, but it would finish after 5 to 10 minutes.
Note for future visitors: please refrain from posting your slow installation results. We know it can be slow. We know why it is slow. Your result does not add anything to this topic.
Given that most of the packages are going to stay the same over time, would it be possible to cache the downloaded packages?
Same here. Pipenv is very slow; it takes an hour to lock and install.
I know this issue is closed, but for me installing pandas is taking a lot of time. The verbose output shows it stuck at Locking for more than 30 minutes. I am using Python 3.7.0 on macOS Mojave. Any help with that?
pipenv is awesome, but this issue still seems to exist. I’d be glad to see any progress; `--skip-lock` did not work.
Would it be possible to have some information or a progress bar, like apt-get or wget show (download speed, size downloaded, total size), during the library downloads? I suspect that is the issue here: pipenv seemed slow to me, but it was just downloading the libraries. I had to open a system monitor to understand that pipenv was downloading files, how much had already been downloaded, at what speed, and so on.
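For illustration, here is a standalone sketch of the kind of per-download feedback being asked for, built on requests’ streaming mode. This is not pipenv’s actual download code; the function name is made up:

```python
# Hypothetical progress reporter: stream a download with requests and
# print bytes received against the Content-Length header.
import requests

def download_with_progress(url: str, dest: str) -> None:
    with requests.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        total = int(resp.headers.get("content-length", 0))
        done = 0
        with open(dest, "wb") as out:
            for chunk in resp.iter_content(chunk_size=1 << 16):
                out.write(chunk)
                done += len(chunk)
                if total:  # some servers omit Content-Length
                    print(f"\r{done}/{total} bytes ({100 * done // total}%)", end="")
    print()
```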
@jkp Allow me to assure you that each and every core developer of this project (not that there are many of us to begin with) is very well aware of this issue, and is as troubled by it as you are, if not more. This is, however, by no means an easy problem, and we have already thrown everything we could at it to make it as usable as possible, without having to tear everything down in Python packaging. We also have a lot on our plate at the moment, and need to focus on those other issues too. The inevitable decision we must make, then, is to prioritise the issues we actually know how to solve, and only start thinking about our next steps after those are done, to maximise the effect of our effort.
Now, I fully acknowledge that your priority may be different from ours. This performance issue may be the single largest problem in your workflow, and you may want to put it up as the most important thing in this project. Please bear in mind, however, that you are not the only user of this tool, and we need to weigh the needs of everyone, including our own, against yours; I acknowledge that. I urge, therefore, anyone sharing this situation to join minds on this issue and try to think of a way to solve it. Once we know what to actually do, we can do it.
The issue is kept closed because we do not know what we can do about it, and an issue nobody can work on would serve only as noise in the tracker when we try to manage it. There is no point, at least in our workflow, in keeping such an issue open.
This is definitely still a problem (5+ mins) with the latest python3.6, pip, and pipenv versions, installing a simple package like `torch`. I don’t think this issue should be marked as closed.
Have the same issue: `Locking [packages] dependencies...` hangs forever. My environment:
We sprinted on this together at PyCon. It’ll be faster soon.
Please, please provide useful output. I just upgraded my pipenv from 9.1.0 to 11.10.0 in order to resolve the Invalid Marker failure at the package lock step as per, e.g., #1622. Now I have a Pipfile with ipykernel, pandas, jupyter, numpy, and matplotlib in it, and with my latest attempt to use `pipenv install` to get the lock file going, I’ve been sitting in `Locking [packages] dependencies…` for upwards of 10 minutes.
Because there’s no output, I can’t tell whether something is actually going on (like building numpy from source) or if it’s just hanging. The best I can do is sort of squint at `top` and conclude that maybe it’s doing something, because a python process seems to be hanging around… but I’m going to have to trash this virtualenv and start fresh if something doesn’t move soon. I’m happy to contribute to some work on this if needed.
Also voting for this issue to be reopened. It is far from solved… it takes somewhere between 10 and 20 minutes to lock my project within a Docker container. It also uses an insane amount of memory, such that I had to increase the allocation to Docker to stop it from killing the process.
The following Docker image takes more than 30 minutes to build on my laptop (i7/16 GB); the `pipenv install ...` command runs for ages…
Dockerfile
Pipfile
Is this normal? Can someone reproduce?
Update: Be careful with Alpine Linux. I realized that the issue is not on `pipenv`’s side… I replaced the Alpine base docker image with one built on Debian Slim, and now `pipenv install` finishes within seconds. The issue in my example is that Alpine Linux will always try to build packages which contain Cython or C extensions from source, which can take forever in a Docker container, whereas Debian installs them in the wheel format, which happens A LOT faster (within seconds). More on this: https://stackoverflow.com/questions/49037742/why-does-it-take-ages-to-install-pandas-on-alpine-linux
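To make that failure mode concrete, here is a hedged sketch (not pipenv code; `usable_wheel_exists` is a hypothetical helper) that asks the pypi.org JSON API whether a release ships any wheel the current interpreter can actually install. When it does not, as on the musl-based Alpine images of that era, pip must build the sdist from source:

```python
# Hypothetical helper: does this release ship a wheel the current
# interpreter can install? If not (e.g. only manylinux wheels on a
# musl/Alpine system, or an sdist-only release), pip falls back to a
# slow from-source build.
import requests
from packaging.tags import sys_tags
from packaging.utils import parse_wheel_filename

def usable_wheel_exists(package: str, version: str) -> bool:
    data = requests.get(
        f"https://pypi.org/pypi/{package}/{version}/json", timeout=10
    ).json()
    supported = set(sys_tags())  # tags installable on this interpreter
    for file_info in data["urls"]:  # files belonging to this release
        if file_info["packagetype"] == "bdist_wheel":
            *_, tags = parse_wheel_filename(file_info["filename"])
            if not tags.isdisjoint(supported):
                return True
    return False
```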
@crifan there is no need to post the same message on every issue, open or closed, that mentions locking speed. We will see your comment no matter how many times you say the same thing. If you want to be helpful, you will need to provide a reproducible example case. Chiming in to say “me too” simply doesn’t add anything besides extra traffic on the issue tracker. Please be mindful of that.
I appreciate your workflow, so if that is how you manage issues, that’s fine. I’ll try to add any information I can to help track down the problem.
I did some debugging by installing `mitmproxy` between `pipenv` and the net to trace the requests being made. I found a couple of interesting things.
We are using a private PyPI index that doesn’t support the JSON API yet. This slows things down significantly, since it looks like the fallback is to brute-force download everything listed in the HTML index in order to extract metadata etc. One suggestion here would be to add some simple logging to warn that this fallback method is being used; it might save someone else having to dig deeper to figure this out.
Using the brute-force method, it seems that the code downloads packages that aren’t relevant to the architecture in use. For example, on a Linux machine it was downloading win32 or osx specific wheel packages. This feels like something that could be detected and avoided, since binary packages built for other architectures are clearly not going to be of any use.
I will continue to debug and report back any useful info as I find it.
Here I am… 15 mins later
Oh, thank you. I think that was the only thing holding me back from pushing everyone over to pipenv at work.
We have made some performance enhancements to pipenv recently, including a big install optimization released in 2022.8.31. For an independent comparison of benchmarks, please have a look at: lincolnloop.github.io/python-package-manager-shootout
Please use virtualenv for the time being, until a better solution is available.
I think this issue has been answered well here: https://github.com/pypa/pipenv/issues/1914#issuecomment-378846926
We should probably move the discussion to #2284. It is actually the locking part that is slow (`install` is essentially TOML manipulation + `lock` + `sync`), not the installing.
Hi, getting the same problem. Here is the verbose output:
Here is the `pipenv --version`: pipenv, version 2018.05.18. Also, I don’t know why this is happening; no specific error is occurring. In my case, when I do `pipenv lock` it starts but it never ends, as far as I can tell. I have been waiting for 2 hrs now with still no sign of completion, and it has given me ReadTimeoutError twice; this is the third time I am trying. Using Python 3.6.4. Any help would be great, as my project’s due date is close.
Let me know if there’s any info I can provide to help diagnose. For the moment I’ll just go back to pip + virtualenv + pip-tools 😕
It takes quite a while to run on my machine. macOS 10.13.4, pipenv version 11.10.0.
Download runs almost immediately, but then it gets stuck on `Locking [packages] dependencies…`. Here it’s taking half a minute for two dependencies, and then 6 minutes for another 3 dependencies; if I throw all my project’s dependencies at it, it just hangs indefinitely at the locking step.
Wow, nice, that was literally more than a 100x speedup for me, and it also caught a dependency conflict that the previous version didn’t catch!
What would be useful is a `verbose` flag for `pipenv lock`. I was only able to diagnose the dependency conflict by editing `piptools/logging.py` to enable verbose logging, but once I did that it gave a very clear indication of what was going on.
99% of the time I do this, the dependencies will resolve to the same ones already in my lock file, because it’s part of my dev pipeline. In the case where there are no new upstream packages since the last run, surely the process could be skipped, as sketched below?
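A minimal sketch of that short-circuit, assuming the documented Pipfile.lock layout where `_meta.hash.sha256` records a hash of the Pipfile; `lock_is_current` and the caller-supplied hash are hypothetical, not pipenv internals:

```python
# Sketch of the proposed skip: if the Pipfile hash stored in
# Pipfile.lock still matches (and no newer upstream releases are
# wanted), relocking could be avoided entirely.
import json

def lock_is_current(pipfile_hash: str, lock_path: str = "Pipfile.lock") -> bool:
    try:
        with open(lock_path) as f:
            meta = json.load(f)["_meta"]
    except (OSError, KeyError, ValueError):
        return False  # missing or malformed lock file: must relock
    return meta.get("hash", {}).get("sha256") == pipfile_hash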
Would it be possible to get the list of hashes from the PyPI API, rather than compute them ourselves?
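For reference, the pypi.org JSON API does publish per-file digests, so they would not need to be recomputed locally. A minimal sketch of reading them, assuming the warehouse JSON layout:

```python
# Fetch the sha256 digests PyPI already publishes for a release,
# instead of downloading each artifact and hashing it locally.
import requests

def pypi_sha256_hashes(package: str, version: str) -> list[str]:
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    data = requests.get(url, timeout=10).json()
    # The version-specific endpoint lists the release's files under
    # "urls"; each entry carries a "digests" dict including "sha256".
    return [f["digests"]["sha256"] for f in data["urls"]]
```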
@keshavkaul PySpark is very large, and can take quite some time just to download. Give it some time, it will be much better afterwards (because Pipenv caches the result).
(Or you can urge the developers to release a wheel distribution. That would help a bit.)
It seems like even with the JSON interface in use, `pipenv` is making unnecessary requests for wheel files that relate to different architectures. The implementation is currently pretty naive, in that it checks all files listed against a given release, regardless of the target platform/arch.
Minimal test case, on a Linux host:
`pipenv install grpcio`
This produced the following requests (captured using `mitmproxy`):
Counting up some of the unnecessary requests:
Etc. Seems like a quick win to do some simple filtering based on host OS and arch? (Sketched below.)
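An illustrative sketch of that filtering, using the `packaging` library to compare a wheel filename’s tags against the running interpreter; `worth_downloading` is a hypothetical name, not pipenv’s implementation:

```python
# Skip wheels whose platform tags cannot match the current interpreter
# before ever requesting them from the index.
from packaging.tags import sys_tags
from packaging.utils import parse_wheel_filename

SUPPORTED = set(sys_tags())  # tags this interpreter can install

def worth_downloading(filename: str) -> bool:
    if not filename.endswith(".whl"):
        return True  # sdists may still be needed for resolution
    *_, tags = parse_wheel_filename(filename)
    return not tags.isdisjoint(SUPPORTED)

# On a Linux host this would skip, e.g., a cp36 win32 grpcio wheel
# and the macosx wheels, keeping only the linux/any candidates.
```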
@AndreasPresthammer so your script is just timing an uncached lock vs installing with a lock. We know that it’s the locking that is slow. In the case of numpy, it’s probably because it had to use sdists for resolution in the past, which meant compilation. We can use wheels now; that may speed things up.
It does have the general feel of being slower than it should be, but maybe I’m underestimating the issue. If I look at the running processes on my computer, I can see the python process running the whole time pipenv is running, and it never goes above ~15% CPU; it should probably use more than that if it were doing CPU-intensive work like hashing files. Also, I’ve used other package managers that hash dependencies, like yarn, and they are pretty fast.
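One way to sanity-check that intuition is to time sha256 over a large local file; on typical hardware it processes hundreds of MB/s, so hashing alone should not pin a core for minutes. A standalone sketch, unrelated to pipenv’s code:

```python
# Time sha256 over a file in 1 MiB chunks to gauge hashing throughput.
import hashlib
import time

def time_sha256(path: str) -> float:
    start = time.perf_counter()
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return time.perf_counter() - start
```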
Perhaps a message such as `This may take a long time` would help to reassure people until a clearer solution is decided upon.
@pablote holy crap, that is slow! Do note that this is partly due to the installation of numpy, which I’m sure we are probably compiling from source to lock, or something stupid. It would help if we provided useful output here.
@giantas not helpful. Please provide the Pipfile, the output, and the duration with pip install. Also, whether you can install individual packages.
@kavdev @jtratner introduced a feature to cache the hashes as well, so that should make a sizeable improvement.
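The rough idea behind such a cache, as a hypothetical standalone sketch (pipenv’s actual cache lives elsewhere and keys entries its own way):

```python
# Compute a file's sha256 at most once, keyed by name and size, and
# persist the result so later locks can reuse it.
import hashlib
import json
import os

CACHE_PATH = os.path.expanduser("~/.cache/artifact-hashes.json")  # hypothetical location

def cached_sha256(path: str) -> str:
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    key = f"{os.path.basename(path)}:{os.path.getsize(path)}"
    if key not in cache:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        cache[key] = h.hexdigest()
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f)
    return cache[key]
```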
fixed! lock is wicked fast now.
Glad you got things resolved, @NicolasWebDev! We’re working on getting this sped up further; hopefully #373 will be a step closer in the next release.