pip: the problem with replacing dependency links with PEP 508 URL requirements

My 2 cents: PEP 508 URLs are no replacement for dependency links, because they lack version specifiers.

For example, with dependency links, you can push a package to PyPI with dependencies on other PyPI projects, while keeping the option to use a patched version of some of those dependencies (with some extra bug fixes) when installing with --process-dependency-links. I’ve used that myself on a project depending on PyObjC, because the delay between releases is so long.

Additionally, the lack of version specifiers means there’s no way for pip to know whether an existing installation is compatible or not: this is problematic when upgrading a project that depends on another through a PEP 508 direct URL, and it also makes sharing such a dependency between projects problematic.

And finally, dependency links are opt-in and usable on PyPI, whereas PEP 508 URLs are forbidden by pip during install for projects originating from PyPI, for “security reasons”. This, to me, does not really make sense: it’s not as if installing only from PyPI were secure!

That last point could be addressed by changing the behaviour in pip (maybe an --allow-url=regexp option?), but I don’t see a way around the lack of version specifiers. Could the PEP be amended to allow package >= 10.0 @ url?
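
For concreteness, here is what a requirement string looks like today under PEP 508, next to the hypothetical amended form (the second line is not valid anywhere today; the name and URL are made up):

package @ git+https://example.com/forks/package.git          # valid PEP 508 today: a direct URL, no version constraint
package >= 10.0 @ git+https://example.com/forks/package.git  # the amendment proposed above; rejected by current parsers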

About this issue

  • State: open
  • Created 6 years ago
  • Reactions: 19
  • Comments: 27 (9 by maintainers)

Most upvoted comments

What’s the status of this issue? I was really surprised that dependency links were removed without addressing this issue first. Version specifiers are really important. Any news?

Here’s a workaround for anyone interested (yes, it’s ugly):

setup.py

import sys
import subprocess

from setuptools import setup

# Patched fork of rpmfile, installed straight from GitHub.
git_rpmfile = 'git+https://github.com/pdxjohnny/rpmfile@subfile_close'

try:
    import rpmfile
    # ... do some version checking ...
except ImportError:  # ModuleNotFoundError is a subclass of ImportError
    # Re-invoke pip to install the fork, forwarding --user when the caller passed it.
    cmd = [sys.executable, '-m', 'pip', 'install', '--upgrade']
    if '--user' in sys.argv:
        cmd.append('--user')
    subprocess.run(cmd + [git_rpmfile], check=False)

setup(
    name='

Another example:

https://github.com/odwdinc/SSBU_Amiibo/blob/0ffe836f61fb91e3fb878a92943720dd86edf932/setup.py#L16

os.system('pip install --user git+https://github.com/odwdinc/pyamiibo@master')

To add some use cases for this feature:

  • Non-open-source Python packages depending on other non-open-source packages. It would be great if pip could install these internal dependencies via their git URLs, etc.
  • The maintainer of a package you depend on doesn’t release as quickly as you do, so you need your CI to install the branch your pull request is based on, so that your testing doesn’t wait on another package’s release.

Is the answer going to be that packages on PyPI are only allowed to depend on other packages on PyPI?

No, but the end user needs to explicitly opt into any other index being used. This was a deliberate policy decision, to prevent malicious code on PyPI triggering download of code from other, arbitrary, locations.

If malicious code is already on PyPI, why would it need to pull in code from other locations? Why not just push more malicious code to the same package? Or to another package, and push that to PyPI?

I think the desire to make pip more secure is great. But I don’t think the mitigation that was taken is effective (see the os.system workaround above). It seems to have broken many users’ workflows without providing an increase in security. I think it might be time we look at reversing the deprecation of dependency links. What would the process be for that?

I’d like to cover one of the use cases.

Suppose the project is developed in-house and consists of several Python packages. The “main” package depends on the others. All packages are stored in private repos.

During development, the main package is installed via pip install git+https://company.local/repos/main-package.git. This command also installs the other (private) packages, whose VCS specifiers are set in main-package/setup.py. This solution requires neither the Python Package Index nor requirements.txt, and is easy to set up and maintain.
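
For illustration, main-package/setup.py looks roughly like this (the package names and repository URLs are made up for the example):

from setuptools import setup

setup(
    name='main-package',
    version='1.0',
    install_requires=[
        # Private sibling packages, pinned to a tag in each repo:
        'helper-a @ git+https://company.local/repos/helper-a.git@v1.2',
        'helper-b @ git+https://company.local/repos/helper-b.git@v2.0',
    ],
)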

Once the development is over, the project is archived (via pip download -d ./downloads git+https://company.local/repos/main.git, which also downloads the dependencies) for a possible future offline install, and the private repos are deleted. Having a simple way to specify a link for dependencies would make it possible to support offline install with the command pip install --no-index --find-links=./downloads main-package, effortlessly.

Currently, that last command tries to access the URLs in the VCS specifiers, because --find-links does not support them. The alternative, setup(dependency_links=...), is deprecated. The only working solution I know of is pip install --no-index --find-links=./downloads -r requirements.txt, with requirements.txt containing the package names (no URLs).
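
Put together, the flow that works today looks something like this (file contents and names are illustrative):

# requirements.txt: plain names only, no URLs
main-package
helper-a
helper-b

# while the repos are still reachable: archive everything
pip download -d ./downloads git+https://company.local/repos/main.git

# later, offline: resolve the names against the archived files
pip install --no-index --find-links=./downloads -r requirements.txt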

@stinovlas Just curious what you meant by “private packages that depend on each other in a non-trivial way”? Does “git+ssh://git@github.com/…” not work under install_requires?

It does work. But you can only depend on one specific commit-ish (e.g. a branch, a tag, or a commit hash). You can’t say “I want version >= 3.2 from this repository”.

So, I started this thread on distutils-sig. I raised this issue and discussed it with the developers. In the end, I was convinced that changing PEP 508 to include version specifiers is both impractical and unnecessary. I’d like to explain why I reached this conclusion to other pip users:

  • It’s not clear what these version specifiers would refer to.
    • It’s virtually impossible to determine a Python package’s version without actually building it, and when you have a repository, you’d potentially need to build the package at every single commit.
    • There are git tags, but not all tags conform to PEP 440, so a tag is not necessarily a valid version specifier. Even if it is, the tag doesn’t have to correspond to the actual Python package version. On top of that, repository tags may be removed or overwritten.
  • PEP 508 URLs were not meant to replace all the functionality of dependency_links, but rather to preserve the parts of dependency_links that were deemed useful and not dangerous.
    • URL dependencies are meant as a temporary fixup for packages that are not yet released on a package index. You can point to a specific commit-ish, and that’s actually enough.
  • If you have private packages that depend on each other in a non-trivial way, set up a package index.
    • devpi has been mentioned as an easy-to-use solution (a minimal pip invocation against such an index is sketched just after this list).
    • URL dependencies are not all-mighty. A package index is a robust solution for complex problems.
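
On the pip side, using such an index is a one-flag affair, and version specifiers then work normally, because the index serves real version metadata. A minimal sketch, assuming a devpi instance at an in-house host (the URL and package name are illustrative):

# Replace PyPI with the private index for this install:
pip install --index-url https://devpi.company.local/root/prod/+simple/ 'main-package>=3.2'

# Or consult the private index in addition to PyPI:
pip install --extra-index-url https://devpi.company.local/root/prod/+simple/ 'main-package>=3.2'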

No, but the end user needs to explicitly opt into any other index being used. This was a deliberate policy decision, to prevent malicious code on PyPI triggering download of code from other, arbitrary, locations.

Thank you for explaining the history there. I can understand the reasoning. But I don’t think it was that effective, because nothing stops malicious or just vulnerable code from ending up on PyPI:

Using the PEP 508 URL format, I can make packages (even ones on PyPI) depend on arbitrary outside locations. For example:

setup(
  # ....
  install_requires=[
    # ...
    "requests@git+https://github.com/kousu/requests@google-surveillance",
    # ....
  ],
)

For pure-Python source packages this works every time. It would be helpful if it worked for wheels too.

And there’s another way to circumvent the user opt-in: you can hide an --extra-index-url or --find-links in a requirements.txt:

https://github.com/neuropoly/spinalcordtoolbox/blob/b64cad3c846fd6bd7a557688b67b80fe0b2c6dc2/requirements.txt#L26

-f https://download.pytorch.org/whl/cpu/torch_stable.html
torch==1.5.0+cpu; sys_platform != "darwin"
torch==1.5.0; sys_platform == "darwin"
torchvision==0.6.0+cpu; sys_platform != "darwin"
torchvision==0.6.0; sys_platform == "darwin"

pip install -r requirements.txt doesn’t prompt the user to ask if they are okay with using an unvetted source.

This is all pretty inconsistent and confusing 😕.

In practice it just means that, in order to minimize the headache we give our users, devs will write scripts like this. Our users don’t know what pip is or who runs PyPI. I don’t even know who runs PyPI, but I assume they’ve got it in hand. Our users definitely haven’t thought through the implications of contacting this domain vs. that domain.


I’m sorry for complaining. I know PyPA is a big project and this is one more straw on the camel’s back. I think you’re doing good work shaping all this clay, and that it’s a lot to consider!

Here I want to make sure this one use case isn’t forgotten: you support getting source packages from arbitrary URLs, so please also support binary packages the same way.

I have a setup like this that is now broken:

  • A package first is deployed on an internal repo, http://local-simple-index/simple.
  • It has a dependency on a package second, which is deployed on http://cdn.somerepo.com/simple. Due to reasons, second is also present on PyPI, but I need my own version.
  • first has dependency_links in its setup.py that contain a link to http://cdn.somerepo.com/simple.

When I install package first like this: pip install --extra-index-url=http://local-simple-index/simple 'first>=0.1dev', pip installs it, then proceeds to install the dependency second right from PyPI, ignoring the dependency_links in setup.py.

I know there’s PEP 508. How do I tell pip to get second from http://cdn.somerepo.com/simple without dependency_links (rest in pieces)? Setting up ~/.pip/pip.conf is not an option, for multiple reasons.
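
One partial workaround, assuming everything first needs is mirrored on the two private indexes: drop PyPI entirely with --index-url, so second can only be found where the patched version lives (URLs as in the comment above):

pip install --index-url=http://cdn.somerepo.com/simple \
            --extra-index-url=http://local-simple-index/simple \
            'first>=0.1dev'

Keep in mind that pip gives all configured indexes equal priority; this only pins second to your version because PyPI is out of the candidate set entirely.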