pipenv: Locking is slow (and performs redundant downloads)
Is this an issue with my installation? It happens on all of my machines… Is there anything I/we can do to speed it up?
I install one package and the locking seems to take minutes.
Locking [packages] dependencies…
$ python -m pipenv.help output
Pipenv version: '2018.05.18'
Pipenv location: '/Users/colllin/miniconda3/lib/python3.6/site-packages/pipenv'
Python location: '/Users/colllin/miniconda3/bin/python'
Other Python installations in PATH:
- 2.7: /usr/bin/python2.7
- 2.7: /usr/bin/python2.7
- 3.6: /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6m
- 3.6: /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6
- 3.6: /Users/colllin/miniconda3/bin/python3.6
- 3.6: /Users/colllin/.pyenv/shims/python3.6
- 3.6: /usr/local/bin/python3.6
- 3.6.3: /Users/colllin/miniconda3/bin/python
- 3.6.3: /Users/colllin/.pyenv/shims/python
- 2.7.10: /usr/bin/python
- 3.6.4: /Library/Frameworks/Python.framework/Versions/3.6/bin/python3
- 3.6.3: /Users/colllin/miniconda3/bin/python3
- 3.6.4: /Users/colllin/.pyenv/shims/python3
- 3.6.4: /usr/local/bin/python3
PEP 508 Information:
{'implementation_name': 'cpython',
'implementation_version': '3.6.3',
'os_name': 'posix',
'platform_machine': 'x86_64',
'platform_python_implementation': 'CPython',
'platform_release': '17.5.0',
'platform_system': 'Darwin',
'platform_version': 'Darwin Kernel Version 17.5.0: Mon Mar 5 22:24:32 PST '
'2018; root:xnu-4570.51.1~1/RELEASE_X86_64',
'python_full_version': '3.6.3',
'python_version': '3.6',
'sys_platform': 'darwin'}
System environment variables:
TERM_PROGRAM
NVM_CD_FLAGS
TERM
SHELL
TMPDIR
Apple_PubSub_Socket_Render
TERM_PROGRAM_VERSION
TERM_SESSION_ID
NVM_DIR
USER
SSH_AUTH_SOCK
PYENV_VIRTUALENV_INIT
PATH
PWD
LANG
XPC_FLAGS
PS1
XPC_SERVICE_NAME
PYENV_SHELL
HOME
SHLVL
DRAM_ROOT
LOGNAME
NVM_BIN
SECURITYSESSIONID
_
__CF_USER_TEXT_ENCODING
PYTHONDONTWRITEBYTECODE
PIP_PYTHON_PATH
Pipenv-specific environment variables:
Debug-specific environment variables:
PATH: /Library/Frameworks/Python.framework/Versions/3.6/bin:/Users/colllin/miniconda3/bin:/Users/colllin/.pyenv/plugins/pyenv-virtualenv/shims:/Users/colllin/.pyenv/shims:/Users/colllin/.pyenv/bin:/Users/colllin/.nvm/versions/node/v8.1.0/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
SHELL: /bin/bash
LANG: en_US.UTF-8
PWD: /Users/.../folder
Contents of Pipfile ('/Users/…/Pipfile'):
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
gym-retro = "*"
[dev-packages]
[requires]
python_version = "3.6"
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 105
- Comments: 76 (16 by maintainers)
Commits related to this issue
- Locking down packages manually until pipenv matures enough Sadly due to several issues with Pipfile.lock, it seems to either hang or take forever as seen here https://github.com/pypa/pipenv/issues/22... — committed to Mathspy/tic-tac-toe-NN by Mathspy 6 years ago
I noticed that `lock` was really slow and downloaded a huge amount of data from files.pythonhosted.org, more than 800 MB for a small project that depends on scipy, flask, etc. So I sniffed the requests made to files.pythonhosted.org, and it turns out that pip or pipenv was doing completely unnecessary downloads, which makes `lock` painfully slow. For example, the same version of numpy had been downloaded several times in full. And it downloaded wheels for Windows / Linux, although I was using a Mac.
My setup:
This is pretty bad, to the point that I am afraid to install new Python libs or upgrade existing ones.
The slowness of pipenv really hinders the dev process for us. I now advise everyone to stick with pip + virtualenv until this issue is resolved.
I suspect 99% of folks using this tool and complaining on this thread are programmers. Instead of whining, put your time where your mouth is and submit a PR.
I just decided to use pipenv instead of pip for a small project. The first thing I did was `pipenv install -r requirements.txt`. It's been locking the dependencies for about 10 minutes now. Therefore, I'm gonna go back to pip.

Guys, this issue is costing you a lot of users. I propose to address it quickly.
Thanks for your insightful feedback.
I'm thankful for the time the developers of this project are spending on this, but I suggest warning in bold, right above the user testimonials in `README.md`, that this project is not yet production ready. Currently it's misleading people into spending precious time replacing their current pip/virtualenv stack with pipenv, until they find out about this slow locking and understand they can't use it.

I really like `pipenv`, but not as much as I like my bandwidth and time. So I end up solving the issue using:

Wish the developers best of luck…
Guess I’ll give https://github.com/sdispater/poetry a shot 😐
Hello @idvorkin,
I've tried once. It took weeks to get a trivial fix merged; just compare the amount of discussion with the actual size of the fix.
I definitely do not want to submit any more fixes to this project.
So your advice is not as viable as you might assume.
@Jamim on behalf of the many users (and I suspect the admins as well), thank you for your contributions. I read your PR, and I could empathize with the frustration. However, I have to agree w/ @techalchemy on this one:
I've never met the admins, but if they're anything like me (and maybe you), they are humans whose lives are packed with commitments even before they have any energy to spend on this project.
Similarly, I bet if you (or anyone else) fixed the performance problem, you’d have slews of people who’d help you develop, test, merge it, or if required (and I highly doubt it would be) create a fork.
@Jamim , thanks for suggesting Poetry. Personally, for some reason I did not come across it. After reading its readme it seems worth trying. It lists some benefits over Pipenv as well (https://github.com/sdispater/poetry/#what-about-pipenv).
Having said that, the project being dead is a gross overstatement, and if I were in pipenv authors’ shoes, I would find it disrespectful. The author replied in the issues section just yesterday. It’s just this locking issue being overlooked, probably because it is hard to fix.
Have we established somewhere that there are redundant downloads happening btw? I suspect that is the case but proving it would be really helpful
FYI, comparing `pip install -r requirements.txt` to the time it takes to lock a dependency graph is not going to be informative as a point of comparison. Pip doesn't actually have a resolver, not in any real sense. I think I can describe the difference. When pip installs your `requirements.txt`, it follows this basic process:

This turns out to be pretty quick because pip doesn't really care whether the dependencies of package 1 conflict with the dependencies of package 3; it just installed the ones in package 3 last, so that's what you get.
Pipenv follows a different process – we compute a resolution graph that attempts to satisfy all of the dependencies you specify, before we build your environment. That means we have to start downloading, comparing, and often times even building packages to determine what your environment should ultimately look like, all before we’ve even begun the actual process of installing it (there are a lot of blog posts on why this is the case in python so I won’t go into it more here).
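The contrast can be sketched with a toy model. Everything below is invented for illustration (the package names, the index structure, and the brute-force search); pipenv's real resolver is far more involved. The point is only that a flat install lets the last-installed dependency win silently, while a resolver must find one version assignment that satisfies every constraint at once.

```python
from itertools import product

# Toy package index (hypothetical packages):
# name -> {version: {dependency: set of allowed versions}}
INDEX = {
    "app":  {"1.0": {"libA": {"1.0", "2.0"}, "libB": {"1.0"}}},
    "libA": {"1.0": {"libC": {"1.0"}}, "2.0": {"libC": {"2.0"}}},
    "libB": {"1.0": {"libC": {"1.0"}}},
    "libC": {"1.0": {}, "2.0": {}},
}

def pip_style(root):
    """Flat install in the style of old pip: walk the dep tree and let the
    last writer win, with no conflict checking."""
    chosen = {}
    def walk(pkg, ver):
        chosen[pkg] = ver  # later installs silently overwrite earlier ones
        for dep, allowed in INDEX[pkg][ver].items():
            walk(dep, max(allowed))  # naive "pick newest allowed"
    walk(root, max(INDEX[root]))
    return chosen

def resolve():
    """Full resolution: search for one pin of every package that satisfies
    every constraint in the index simultaneously."""
    pkgs = list(INDEX)
    for combo in product(*(sorted(INDEX[p]) for p in pkgs)):
        pin = dict(zip(pkgs, combo))
        if all(pin[dep] in allowed
               for pkg, ver in pin.items()
               for dep, allowed in INDEX[pkg][ver].items()):
            return pin
    return None
```

Here `pip_style("app")` happily produces libA 2.0 next to libC 1.0, an inconsistent environment, while `resolve()` has to reject that combination and settle on libA 1.0. That search, over real packages that may need downloading and building just to learn their dependencies, is what you pay for during locking.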
Each step of that resolution process is made more computationally expensive by requiring hashes, which is a best practice. We hash incoming packages after we receive them, then we compare them to the hashes that PyPI told us we should expect, and we store those hashes in the lockfile so that in the future, people who want to build an identical environment can do so with the contractual guarantee that the packages they build from are the same ones you originally used.
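That hash bookkeeping can be pictured with a short sketch (the helper names are made up for illustration, not pipenv's internal API):

```python
import hashlib

def lockfile_hash(artifact: bytes) -> str:
    """Format a hash the way Pipfile.lock stores it: "sha256:<hex digest>"."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()

def verify_artifact(artifact: bytes, expected: str) -> bool:
    """Check a downloaded artifact against the hash the index advertised."""
    algo, _, digest = expected.partition(":")
    return hashlib.new(algo, artifact).hexdigest() == digest

wheel = b"fake wheel bytes"        # stand-in for a downloaded .whl
entry = lockfile_hash(wheel)       # what gets written into the lockfile
assert verify_artifact(wheel, entry)
```

Cheap for one artifact, but resolution pays this cost for every candidate it considers across the whole dependency graph.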
Pip search is a poor benchmark for any of this, in fact any of pip’s tooling is a poor benchmark for doing this work – we use pip for each piece of it, but putting it together in concert and across many dependencies to form and manage environments and graphs is where the value of pipenv is added.
One point of clarification: once you resolve the full dependency graph, installation order shouldn't matter anymore. Under the hood we actually pass `--no-deps` to every installation anyway.

As a small side note, pip search is currently the only piece of pip's tooling that relies on the now-deprecated XMLRPC interface, which is uncacheable and very slow. It will always be slower than any other operation.
Sure, it is annoying waiting for a lock when having to do multiple installs mid dev session, but this can be managed.
The important thing is that a lock file is generated before pushing local changes to the repo. I make judicious use of the `--skip-lock` flag during dev sessions, and run `pipenv lock` once at the end before I commit.

Locking numpy (and nothing else) takes 220 s on my machine (see below). Most of the time seems to be spent downloading more than 200 MB of data, which is quite puzzling given that the whole numpy source is 4 MB. Though clearly, even if that were instant, there's still 25 s of actual processing, and even that seems excessive for calculating a few hashes. Subsequent locking, even after deleting Pipfile.lock, takes 5 s.
Here’s what I hope is a reproducible test case https://github.com/Mathspy/tic-tac-toe-NN/tree/ab6731d216c66f5e09a4dabbe383df6dc745ba18
Attempting to do `pipenv install` in this lock-less repository has so far downloaded over 700 MB or so while displaying `Locking [packages] dependencies...`
Will give up in a bit and rerun with `--skip-lock` until it's fixed.

I want to draw attention to this excellent comment from #1914 on the same topic https://github.com/pypa/pipenv/issues/1914#issuecomment-457965038 which suggests that downloading and executing each dependency is not necessary any longer.
I wonder if any devs could comment on the feasibility of this approach.
@bochecha my statement may be hyperbole in your opinion, but it's a fact based on my experience. I heard about pipenv from some coworkers, and today I tried to update an old project, updating its dependencies, etc. I thought, let's move from pip/virtualenv to pipenv as part of the update process. I had to update a dependency, check how things worked with it, update parts of the code if needed, and then update another dependency. Each time I ran `pipenv install <something>` I had to wait a ridiculously long time. At first I thought it was calculating something it would cache for the future, as I couldn't believe this was a problem in a package manager claiming to be production ready. After installing the ~10th package I started searching about it and found this thread; I removed `Pipfile` and `Pipfile.lock` and went back to my pip/virtualenv workflow. I was tempted to try poetry, but I couldn't risk another hour.

These things happen in the JS community, for example, but I don't expect them in the Python community; we don't have these kinds of problems and we should try to avoid them. A disclaimer in `README.md` could prevent this inconvenience, so I suggested one in my comment. It could have saved my time today, and I think it would save time for other newcomers, so they won't have a bad experience with this project and may stay as potential future users.

I find it installs quickly and locks slowly, so as soon as you get the `Installation Succeeded` message you're good to continue working… unless you want to install something else…

I've noticed that it's actually faster to remove the environment and recreate it from scratch to update the lockfile. This is true both for running `pipenv lock` and `pipenv install some-package`.
How is this not the main priority for this project? Pipenv is so slow it's pretty much unusable. And not only in some uncommon edge cases; it's always super slow.
@ravexina thanks for the suggestion, I’ll try for sure
Hi @yssource and everyone,
This project seems to be dead, so if you want to eliminate the speed issue, please consider migrating to Poetry, which is significantly faster.
In my case, installing the dependencies on the server hangs it for hours. I'm using an AWS EC2 `t2.micro` instance with 1 GB RAM. This much RAM is enough for a single application with a few dependencies, but the installation takes all the memory, and the only way to get it working again is to restart the server.

This issue has been pending for so many years and no fix has been made. I see multiple issues being closed without any resolution.
Should `install` be performing a lock anyway, seeing that `lock` is already a separate command? In the meanwhile, the `install` option description should specify that locking also takes place, and maybe even recommend `--skip-lock`.

Also, how about pinning this issue?
Pipenv is a really wonderful tool and I used to recommend it, but a project with 8 modules can't lock; it just times out. There doesn't seem to be any interest in solving this issue, and that is very frustrating. I read that you can now get dependencies without downloading from PyPI; is that a workaround for this issue? I don't see any talk about that option here. At the moment the tool is unusable for my purposes.
Really painful at times: I was installing PyPDF2 and textract, and pipenv took ~10 mins to lock.
I believe it was closed because there were many conflicting reports that could have been separate issues. Some of the issues may be resolved as of the latest release, but others may still be relevant, in which case we should open new issues to track them.
Try out py-poetry - more features and faster locking.
I kinda agree with sassanh. Not everyone is equally affected by the issue, but some of us were affected pretty badly. I have made open source projects that were not fully functional or production ready, and when that was the case I put a disclaimer on them so I wouldn't waste people's time if they weren't ready for the bumps.
I am not mad at the people who work on this project, but I am kinda mad at the person who gave a public talk about it, selling it as a great tool with zero disclaimers. As a result, I wasted quite a lot of my precious time trying to make a tool work, hoping to save time in the long run, but I ended up having to go back to pip and my own script, because pipenv didn't work in my time- and bandwidth-constrained environment.
Can an admin kindly close this thread to comments? It looks like no helpful additional content is being added to the discussion.
I’d be happy to subscribe a ticket tracking the work towards fixing the issue.
Thanks!
Any news on this? Any way to help? dupe of https://github.com/pypa/pipenv/issues/1914
/ edit: btw, why does `pipenv install` update the versions in the lockfile? o.Ò I just ran it after locking timed out, and now that I look at the new lock file I see pandas was updated from 0.23.4 to 0.24.0, numpy from 0.16.0 to 0.16.1, etc… Didn't expect that to happen unless I did `pipenv update`…
After watching one of the talks from the creator, I decided to use `pipenv`. But it is too slow.

@JoshuaPoddoku In this issue it was determined that the current design uses hashes from the API, so there should not be redundant downloads during locking; much has changed since the original report was opened. When looking for PRs that seemed relevant I performed this search: https://github.com/pypa/pipenv/pulls?q=is%3Apr+is%3Amerged+hash
However I don’t have a link to an exact PR, but frostming above had said:
Basically, @meichthys has it correct that anything still relevant to the latest version of `pipenv` should be reported as a new issue, should it not match an existing open issue. Closing this was part of a larger effort to evaluate and triage the backlog of issues, to achieve a more manageable and accurate backlog for this project.

I don't believe this exact issue is still relevant to the latest versions of pipenv, such as the one released yesterday, `pipenv==2022.1.8`. I am closing this multi-year issue (which is kind of hard to parse) in hopes that we can get new, detailed bug reports that pertain to the latest version of pipenv filed instead.

It's sad to see all the bashing on this issue; constructive ideas make progress! This might not work for everyone, but this workaround seems to speed things up a bit: `pipenv --rm && pipenv install`. If that doesn't help, you can try removing the lock file before running the above command.

I wonder which version of Pipenv you are using; generating hashes no longer downloads the artifacts, as long as a SHA256 hash is included in the URL (true for most packages on pypi.org). So what package index are you using?
Such a nice tool getting neglected. I will be unsubscribing from this issue as I see it will never get resolved. I will stick to something like conda, or do it manually using virtualenv.
Most likely @AlJohri, also any info about running processes / locks / io would help
I am not interested in spending energy on this discussion with random shots in the dark. Please provide a reproducible test case.
Numpy should be substantially faster now (I have been using your example as a test case in fact!). As of my most recent test, I had it at ~30s on a cold cache on a vm.
Can you confirm any improvements with the latest release?
I’m not a contributor to the project and at the moment I don’t know all the specifics, but my understanding is that the locking phase is where all of the dependencies get resolved and pinned. So if you have one top-level package with ~65 dependencies, it’s during the locking phase that all of the dependencies of that first package are (recursively) discovered, and then the dependency tree is used to resolve which packages need to be installed and (probably) in what rough order they should be installed in. Not as sure about the last part.
If you `pipenv install` from a Pipfile without a lockfile present, you'll notice that it does the locking phase before installing the packages into the venv. Similarly if you have a lockfile but it's out of date. I suspect having a lockfile and installing with the `--deploy` option would be faster, as would the `--skip-lock` option; in the former case you get an error if the lockfile is out of date, in the latter you lose the logical split between the top-level packages (the declared environment) and the actual installed (locked) environment of all packages. At least this is how I understand it.

Whether or not pipenv uses pip under the hood (I think it does), it still needs to get information from the PyPI server(s) about package dependencies and the like, so my question about pip search was more a proxy for how fast or slow your path to the PyPI server is than a direct implication about the mechanism by which pipenv does its thing.
An interesting experiment might be to compare the time required for locking the dependency tree in pipenv with installing the requirements into a new venv using `pip install -r requirements.txt`. I think they should be doing pretty similar things during the dependency resolution phase.
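That experiment is easy to script. A rough sketch, where the exact commands, flags, and file names are assumptions about your project layout rather than a prescribed benchmark:

```python
import subprocess
import sys
import time

def time_command(cmd) -> float:
    """Run a command to completion and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

if __name__ == "__main__":
    # Hypothetical comparison: assumes a Pipfile and a requirements.txt in
    # the current directory describing the same dependencies. `--clear` is
    # meant to drop pipenv's cache so the lock starts cold.
    print("pipenv lock:", time_command(["pipenv", "lock", "--clear"]))
    print("pip install:", time_command([
        sys.executable, "-m", "pip", "install",
        "-r", "requirements.txt", "--target", "pip-timing-test",
    ]))
```

Running each a few times (and clearing caches between runs) would separate download time from resolution time, which this thread tends to conflate.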