pip: pip install of a directory is super slow
See https://github.com/pypa/pip/issues/2195#issuecomment-524606986, for a summary of this issue.
I don’t understand why pip needs 17 seconds to process a local directory that is not on NFS (in fact, it’s on an SSD). The directory is pip’s own source tree, which has no dependencies, since everything is vendored.
$ time pip install --no-install ~/dev/git-repos/pip
DEPRECATION: --no-install and --no-download are deprecated. See https://github.com/pypa/pip/issues/906.
Processing /Users/marca/dev/git-repos/pip
Requirement already satisfied (use --upgrade to upgrade): pip==6.0.dev1 from file:///Users/marca/dev/git-repos/pip in /Users/marca/dev/git-repos/pip
pip install --no-install ~/dev/git-repos/pip 2.80s user 5.86s system 50% cpu 17.205 total
It should probably at least be logging whatever is taking that long, but maybe it shouldn’t even be doing whatever it’s doing.
Note that the “Processing” line appears right away and pretty much the whole delay seems to be between that line and the next one.
About this issue
- State: closed
- Created 10 years ago
- Reactions: 24
- Comments: 77 (37 by maintainers)
Commits related to this issue
- Speed up unpack_file_url by ignoring .tox, .git, .hg, .bzr, and .svn when doing `shutil.copytree` in unpack_file_url in pip/download.py. Fixes: GH-2195 — committed to msabramo/pip by msabramo 10 years ago
- Extract _copy_dist_from_dir from unpack_file_url Right now it's just a pretty simple `shutil.copytree`, but ideally we want it to do something more complex, involving building an sdist. Plus, this m... — committed to msabramo/pip by msabramo 9 years ago
- Speed up unpack_file_url E.g.: `pip install /path/to/dir` by building an sdist and then unpacking it instead of doing `shutil.copytree`. `shutil.copytree` may copy files that aren't included in the ... — committed to msabramo/pip by msabramo 9 years ago
- BUG: finish fix for speeding up "pip install .". This is a follow-up to gh-2535, which added the code to copy via (sdist + unpack) instead of shutil.copytree, but forgot to actually call that functio... — committed to rgommers/pip by rgommers 9 years ago
- Copy setup.py into place too. This works around a problem with the new sdist-based "pip install .": * when creating the sdist, we don't run a literal "setup.py sdist" * instead, sys.argv[0] is a com... — committed to warner/pip by warner 8 years ago
- setup.py: was failing to install jmdictdb.pylib setup.py was changed in 200625-8ea3fd9 to explicitly specify the packages to be installed rather then using find_packages() in an attempt to diagnose w... — committed to dedyk/jmdictdb_mirror by deleted user 4 years ago
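For readers unfamiliar with the approach in the first commit above, here is a minimal sketch of skipping those directories during a `shutil.copytree`. It is not pip's actual code: the function name `copy_source_tree` and the constant `IGNORED_DIRS` are made up for illustration, and only the directory names come from the commit message.

```python
import shutil
import tempfile
from pathlib import Path

# Directory names taken from the first commit message above.
IGNORED_DIRS = (".tox", ".git", ".hg", ".bzr", ".svn")


def copy_source_tree(src, dst):
    """Copy src to dst, skipping the ignored directories at any depth.

    Hypothetical helper for illustration; not a function that exists in pip.
    """
    # shutil.ignore_patterns builds the callable that copytree expects
    # for its `ignore` argument (glob-style matching on entry names).
    return shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*IGNORED_DIRS))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny fake project with a .git directory and one package.
        src = Path(tmp) / "project"
        (src / ".git").mkdir(parents=True)
        (src / ".git" / "HEAD").write_text("ref: refs/heads/main\n")
        (src / "pkg").mkdir()
        (src / "pkg" / "__init__.py").write_text("")

        dst = Path(tmp) / "copy"
        copy_source_tree(src, dst)
        # Only the entries that were not ignored are listed.
        print(sorted(p.name for p in dst.iterdir()))
```

Note that this is a name-based heuristic: it skips known VCS directories but still copies anything else that happens to be in the tree, such as large data files.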
Narrator: it didn’t.
FYI for those who are running into this issue – A workaround is to replace `pip install .` with:
It is making a copy of the entire directory, including `.git`. It probably shouldn’t be doing that, no.
Wondering if there’s any progress on this? Not only are `.git` or any `.${scm}` folders troublesome, it is much worse if people include `.vagrant/` along with the source.
Having a `.pipignore` that is customizable would really help ease the pain.
We have now (per #7951) published a beta release of pip, pip 20.1b1. This release includes #7882, which implemented a solution for this issue.
I hope participants in this issue will help us by testing the beta and checking for new bugs. We’d like to identify and iron out any potential issues before the main 20.1 release on Tuesday.
I also welcome positive feedback along the lines of “yay, it works better now!” as well, since the issue tracker is usually full of “issues”. 😃
To reiterate and summarize:
Now, going this route also fixes a bunch of other usability issues around pip’s building mechanics for users.
I’ve started a self-motivated project to refactor pip’s build logic. While I won’t be tackling this issue as a part of my refactoring work, I am more than willing to help someone who is inclined enough to try to fix this issue – the fix would be fairly involved in pip’s build logic, which isn’t the most straightforward bit of code around and there might be tricky edge cases that we only notice during implementation.
For christ sake, guys, can this be solved somehow already, please? I mean, there seems to be some consensus that this behavior is braindead - yet the ticket has been open for three years by now, and there is no solution in sight. I hate having to manually move data in and out of my tree just so that pip does not barf or hang for some minutes (I have to work on shared filesystems).
If there is no consensus on how to not break existing work, can a solution like `.pipignore` be provided as an opt-in, maybe? I don’t mind jumping through some hoops to get this fixed.
I will say that it is considerably better.
Old:
noglob pip3 install . 3.76s user 2.51s system 12% cpu 50.245 total
New:
noglob pip3 install . 3.40s user 0.70s system 42% cpu 9.764 total
This should be resolved by #7882 (build local directories in place).
Is this still being looked into? It’s a very painful surprise to see multiple gigabytes of git-ignored debug-data dumps being copied over during a `pip install .`
Down from ~2 minutes to 4 seconds, thank you so much!
We are still hurting quite badly because of this issue, too. It is really difficult to tell our users that they cannot keep code and experimental data (which are large) in the same directory - it’s quite counter-intuitive. On our own systems, we use the `.pipignore` patch, but don’t have the ability to deploy that on the majority of systems we support… 😕
Now that `in-tree-build` is available, should we close this?
It can. The reason for reverting the change was that we didn’t have any opt-outs or a period for getting feedback on the change. We do have new flags to help facilitate that (--use-feature and --deprecated-feature), but someone has to reimplement/reintroduce the functionality in this context now.
Broadly, I think what we want to do here is:
Works great/faster for me! 👍
This issue still persists, because the directory I’m installing from has maybe 10 MB of Python code, but then a lot of JSON data files and `.git`.
Actually, no - #4900 provided an implementation which solves the problem with little code in a backward compatible way. It might not solve other problems - but given the age of this ticket, I would like to ask to reconsider that approach.
We ran into this https://github.com/pypa/pip/issues/2195#issuecomment-351258913 today as well. It’s still happening.
I’m running into this with the latest developer version of pip - I thought PEP 517 support was added in pip 19, so should this still be happening?
In my case, because I work on a project (astropy) where I have many remotes and branches, my .git directory is 1.8 GB, and it takes minutes to copy this over to a temporary directory. It seems like it would make more sense to construct a source distribution first and then build the wheel from there, behind the scenes.
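To check whether a tree is affected in the same way, a rough sketch like the following reports how many bytes each top-level entry (including `.git`) contributes to a full directory copy. The function names are made up for illustration and nothing here is part of pip.

```python
import os
from pathlib import Path


def tree_size(path):
    """Total size in bytes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def size_by_top_level_entry(project_dir):
    """Map each top-level entry of project_dir to its size in bytes, largest first."""
    sizes = {}
    for entry in Path(project_dir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            sizes[entry.name] = tree_size(entry)
        elif entry.is_file():
            sizes[entry.name] = entry.stat().st_size
    # Sort so the entries that dominate the copy come first.
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    # Run from the project root that would be passed to `pip install .`
    for name, size in size_by_top_level_entry(".").items():
        print(f"{size / 1e6:10.1f} MB  {name}")
```

If `.git` or a data directory dominates the listing, that is roughly the amount of data a copy of the whole tree has to move.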
Could try «in-tree build» (similar to «in-tree PEP 517 backend») or «build in source dir»
We recently had a long discussion on what editable install means, and I think we actually landed in a place that is more along the lines of `machine local` as far as pip goes. But pip is unaware of where and how on the local machine, and it is the build backend’s job to define and handle that.
@pradyunsg thanks for the update. Some feedback on terminology (please feel free to ignore, just FYI): this sentence, as well as gh-7555, confused me because pip does not do in-place builds. What in-place builds has always meant is `python setup.py build_ext --inplace` (or `python setup.py develop`).
Here you changed the meaning to: “build without copying to a tmpdir”. Extension modules still don’t end up in-place, they end up in a `build/` dir that’s usually easily cleaned up. It would be nice to be a little more explicit in for example gh-7555.
I know there’s some tools that rely on `.git`, but is anyone relying on `build` being copied? That’d be nice to add to the ignored dirs, happy to send a PR if you agree.
workaround: use a shallow clone (change depth to suit):
Fixing this requires installing via sdist, and last time we discussed that, there was a lot of pushback from people using tools that (apparently) need the actual source directory. Personally I think we should bite the bullet and deprecate build processes that don’t give the same results when you do `build_sdist` then `build_wheel` as you get when you just do `build_wheel`, but I don’t have the time or energy to champion that proposal myself at the moment.
I hope this has been mentioned before in this thread.
A better workaround would be to build an sdist or a wheel directly using setup.py and installing the generated artefact using pip. That way, pip won’t be doing the directory-copy stuff (because it got a file to install from) and this is the exact same result as you would with `pip install .` (as of pip 9), minus the directory copy.
Except, for testing, that doesn’t test what a user installing the package from pypi would be using. (e.g. packages and modules that don’t get packaged will still be available)
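The build-an-artefact-first workaround described above could be scripted roughly as follows. This is only a sketch under assumptions: the project has a `setup.py` that can still be invoked directly (newer setuptools discourages this), the helper names are invented for illustration, and the sdist is written to a temporary directory instead of `dist/`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def sdist_command(dist_dir):
    """Command that builds an sdist into dist_dir; run it from the project root."""
    return [sys.executable, "setup.py", "sdist", "--dist-dir", str(dist_dir)]


def install_command(artifact):
    """Command that installs a built artefact (sdist or wheel file) with pip."""
    return [sys.executable, "-m", "pip", "install", str(artifact)]


def install_via_sdist(project_dir):
    """Build an sdist of project_dir, then install that file instead of the directory."""
    project_dir = Path(project_dir).resolve()
    with tempfile.TemporaryDirectory() as dist_dir:
        subprocess.run(sdist_command(dist_dir), cwd=project_dir, check=True)
        artifacts = sorted(Path(dist_dir).iterdir())
        if len(artifacts) != 1:
            raise RuntimeError(f"expected exactly one sdist, found {artifacts}")
        # pip receives a single archive, so it has no directory to copy.
        subprocess.run(install_command(artifacts[0]), check=True)


if __name__ == "__main__":
    # Usage: python install_via_sdist.py /path/to/project
    install_via_sdist(sys.argv[1])
```

Because only files that end up in the sdist are installed, this also sidesteps the copy of `.git` and untracked data.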
A good workaround seems to be using an editable install (`pip install -e $DIR`).
Have .pipignore implicitly include .git, .hg etc. with empty .pipignore suppressing this.
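To make the `.pipignore` proposal concrete, here is one way its semantics could look. This is purely hypothetical: pip does not read a `.pipignore` file, the default list and all function names are assumptions, and the choice that a present file fully replaces the defaults (so an empty file ignores nothing) is just one reading of the suggestion above.

```python
import shutil
from pathlib import Path

# Assumed defaults; the suggestion above only says ".git, .hg etc."
DEFAULT_PATTERNS = (".git", ".hg", ".bzr", ".svn")


def load_ignore_patterns(project_dir):
    """Return the ignore patterns for project_dir.

    No .pipignore  -> the implicit defaults apply.
    Empty file     -> nothing is ignored (defaults suppressed).
    Otherwise      -> one glob pattern per line; blank lines and '#' comments skipped.
    """
    ignore_file = Path(project_dir) / ".pipignore"
    if not ignore_file.is_file():
        return list(DEFAULT_PATTERNS)
    patterns = []
    for line in ignore_file.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            # Entries like ".vagrant/" are matched by name, so drop the slash.
            patterns.append(line.rstrip("/"))
    return patterns


def copy_with_pipignore(src, dst):
    """Copy src to dst, honouring the hypothetical .pipignore in src."""
    patterns = load_ignore_patterns(src)
    return shutil.copytree(src, dst, ignore=shutil.ignore_patterns(*patterns))
```

An opt-in file like this keeps existing behaviour for projects that rely on `.git` being copied, at the cost of one more config file per project.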
This should be reopened since #2196 was reverted. I’d like to come with an alternative PR that builds an sdist instead of using heuristics to figure out what to copy. See the comments on that PR for details.