pip: Splitting a package into two packages leads pip to corrupt the packages when upgrading to the new versions
(Forgive me if this is a duplicate… It seems like a problem that should have been reported in the past [I’ve heard that the transition from docker-py to docker resulted in similar problems]. I couldn’t find any search terms that found an existing bug report so I’m opening a new one)
Environment
- pip version: 20.1.1
- Python version: 3.7
- OS: Fedora Linux
Description
The ansible-2.9.x package is one large, monolithic package. ansible-2.10.x splits this package into two:
- an ansible-base package which contains the executables (not entrypoints, if that matters) and the site-packages/ansible library with most of the files which were in ansible-2.9.x.
- an ansible package which contains all of the addons to the ansible-base package. These are installed into site-packages/ansible_collections.
The ansible-2.10.x package has a dependency on the ansible-base package. This way, people who only need the minimal functionality can pip install ansible-base (>=2.10.0b1). People who just want to get the same experience as ansible-2.9.x can pip install ansible (>=2.10.0a1).
A clean install works fine:
pip uninstall ansible
pip install ansible==2.10.0a1
# Results in ansible-base-2.10.0b1 and ansible-2.10.0a1 installing correctly.
# Looking at the filesystem will find that all of the correct files are present.
# ansible --version will work
The problem is upgrades:
pip install ansible==2.9.10
pip install ansible==2.10.0a1
# Results in ansible-base-2.10.0b1 and ansible-2.10.0a1 being installed but with missing files
# The files which were in ansible-2.9.10 are erased from the filesystem,
# leaving ansible-base-2.10.0b1 in a broken state
# The reason is that pip first installs ansible-base-2.10.0b1 and then, when it uninstalls ansible-2.9.10,
# it doesn't realize that the ansible-base files have overwritten the same named files in ansible-2.9.10
# and that they should not be removed.
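The sequence described in the comments above can be modelled with a toy sketch. It is only an illustration of the ordering problem, not pip's actual code: the "filesystem" is a dict, the file lists are shortened, hypothetical stand-ins for the real package contents, and the install order is taken from the "Installing collected packages" output in this report.

```python
# Toy model (assumption, not pip internals): the "filesystem" maps a path to
# the package that last wrote it; each package keeps its own list of paths.

def install(fs, records, package, paths):
    for path in paths:
        fs[path] = package           # silently overwrites an existing file
    records[package] = set(paths)

def uninstall(fs, records, package):
    for path in records.pop(package):
        fs.pop(path, None)           # removed even if another package rewrote it

# Shortened, illustrative file lists (hypothetical, not the real manifests).
OLD_ANSIBLE = ["bin/ansible", "ansible/__init__.py"]
ANSIBLE_BASE = ["bin/ansible", "ansible/__init__.py", "ansible/modules/ping.py"]
NEW_ANSIBLE = ["ansible_collections/__init__.py"]

def upgrade_in_place():
    fs, records = {}, {}
    install(fs, records, "ansible-2.9.10", OLD_ANSIBLE)
    install(fs, records, "ansible-base-2.10.0b1", ANSIBLE_BASE)  # new dependency first
    uninstall(fs, records, "ansible-2.9.10")                     # old version afterwards
    install(fs, records, "ansible-2.10.0a1", NEW_ANSIBLE)
    return fs

def missing_files(fs, paths):
    return sorted(p for p in paths if p not in fs)

if __name__ == "__main__":
    # prints ['ansible/__init__.py', 'bin/ansible']
    print(missing_files(upgrade_in_place(), ANSIBLE_BASE))
```

In this model, only the paths that both the old ansible and ansible-base own go missing, while a path that exists only in ansible-base survives. That matches the Output section further down, where ansible/modules/ping.py is present but ansible/__init__.py is not.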
Expected behavior
I expect that upgrading the package will lead to a system where both new packages are installed with all of the files that the new packages include.
How to Reproduce
- python3.7 -m pip install --user ansible==2.9.10
- python3.7 -m pip install --user ansible==2.10.0a1
- ls -al ~/.local/bin/ansible
- ls: cannot access ‘/home/badger/.local/bin/ansible’: No such file or directory
- ls -al ~/.local/lib/python3.7/site-packages/ansible/__init__.py
- ls: cannot access ‘/home/badger/.local/lib/python3.7/site-packages/ansible/__init__.py’: No such file or directory
Output
[pts/145@peru /var/tmp/testing]$ python3.7 -m pip install --user ansible==2.9.10 (08:10:37)
Processing /home/badger/.cache/pip/wheels/bb/e6/9a/05f0b546d96bc1da05865504c3481a4fd3a1b3fd48a38d53a1/ansible-2.9.10-py3-none-any.whl
Requirement already satisfied: PyYAML in /home/badger/.local/lib/python3.7/site-packages (from ansible==2.9.10) (5.3.1)
Requirement already satisfied: jinja2 in /home/badger/.local/lib/python3.7/site-packages (from ansible==2.9.10) (2.11.2)
Requirement already satisfied: cryptography in /home/badger/.local/lib/python3.7/site-packages (from ansible==2.9.10) (2.3.1)
Requirement already satisfied: MarkupSafe>=0.23 in /home/badger/.local/lib/python3.7/site-packages (from jinja2->ansible==2.9.10) (1.1.1)
Requirement already satisfied: cffi!=1.11.3,>=1.7 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible==2.9.10) (1.11.5)
Requirement already satisfied: idna>=2.1 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible==2.9.10) (2.8)
Requirement already satisfied: asn1crypto>=0.21.0 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible==2.9.10) (0.24.0)
Requirement already satisfied: six>=1.4.1 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible==2.9.10) (1.15.0)
Requirement already satisfied: pycparser in /home/badger/.local/lib/python3.7/site-packages (from cffi!=1.11.3,>=1.7->cryptography->ansible==2.9.10) (2.19)
Installing collected packages: ansible
Successfully installed ansible-2.9.10
[pts/145@peru /var/tmp/testing]$ python3.7 -m pip install --user ansible==2.10.0a1 (08:11:10)
Processing /home/badger/.cache/pip/wheels/1a/52/25/c9e776b2df588061cd9dc831740416edb7e276f5ffe7a8d8d3/ansible-2.10.0a1-py3-none-any.whl
Processing /home/badger/.cache/pip/wheels/dd/db/f0/0890f4f13dd6446092ba5d76f55d62528b52ca6ab74ea871d4/ansible_base-2.10.0b1-py3-none-any.whl
Requirement already satisfied: jinja2 in /home/badger/.local/lib/python3.7/site-packages (from ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (2.11.2)
Requirement already satisfied: cryptography in /home/badger/.local/lib/python3.7/site-packages (from ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (2.3.1)
Requirement already satisfied: packaging in /home/badger/.local/lib/python3.7/site-packages (from ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (20.4)
Requirement already satisfied: PyYAML in /home/badger/.local/lib/python3.7/site-packages (from ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (5.3.1)
Requirement already satisfied: MarkupSafe>=0.23 in /home/badger/.local/lib/python3.7/site-packages (from jinja2->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (1.1.1)
Requirement already satisfied: asn1crypto>=0.21.0 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (0.24.0)
Requirement already satisfied: idna>=2.1 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (2.8)
Requirement already satisfied: six>=1.4.1 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (1.15.0)
Requirement already satisfied: cffi!=1.11.3,>=1.7 in /home/badger/.local/lib/python3.7/site-packages (from cryptography->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (1.11.5)
Requirement already satisfied: pyparsing>=2.0.2 in /home/badger/.local/lib/python3.7/site-packages (from packaging->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (2.4.7)
Requirement already satisfied: pycparser in /home/badger/.local/lib/python3.7/site-packages (from cffi!=1.11.3,>=1.7->cryptography->ansible-base<2.11,>=2.10.0.dev1->ansible==2.10.0a1) (2.19)
Installing collected packages: ansible-base, ansible
Attempting uninstall: ansible
Found existing installation: ansible 2.9.10
Uninstalling ansible-2.9.10:
Successfully uninstalled ansible-2.9.10
Successfully installed ansible-2.10.0a1 ansible-base-2.10.0b1
[pts/145@peru /var/tmp/testing]$ ls -al ~/.local/bin/ansible (08:12:13)
ls: cannot access '/home/badger/.local/bin/ansible': No such file or directory
[pts/145@peru /var/tmp/testing]$ ls -al ~/.local/lib/python3.7/site-packages/ansible/__init__.py (08:14:19)
ls: cannot access '/home/badger/.local/lib/python3.7/site-packages/ansible/__init__.py': No such file or directory
[pts/145@peru /var/tmp/testing]$ ls -al ~/.local/lib/python3.7/site-packages/ansible/modules/ping.py (08:14:29)
-rw-r--r--. 1 badger badger 2090 Jun 28 08:11 /home/badger/.local/lib/python3.7/site-packages/ansible/modules/ping.py
About this issue
- State: open
- Created 4 years ago
- Reactions: 7
- Comments: 19 (16 by maintainers)
Uh, it’s not that simple, since file-overwriting is a needed feature for pkgutil- and pkg_resources-style namespaces. There are other (admittedly obscure) use cases, but legacy namespace packages alone is a good enough reason pip cannot ban overwriting easily, at least not before dropping Python 2 support and a long deprecation period allowing people to migrate.
Now that we dropped Python 2 (for a while now!) maybe it’s time to revisit this. There are two routes to take here:
Option 1 is obviously simpler, but also more restrictive and would potentially break existing use cases and cause user frustration.
@uranusjr I’m convinced that the better fix would be implementing transactions and splitting the removal stage from the installation. This way, when it’s pre-calculated which packages need to be removed from disk, they all could be deleted first and only after that, the new package versions would be placed there.
@bmillemathias yes, but the previous repro doesn’t fully demonstrate the problem anymore. And so @fbidu has correctly posted the new reproducer for whoever attempts to dig into it in the future.
The reason for this will be the sequence of events pip goes through:
Step 3 will remove stuff that’s been moved from ansible to ansible-base.
The easiest workaround, as you’ve discovered, is to uninstall and do a clean install.
Pip can’t easily address this with its current architecture, as it treats each install independently. The only approach I can see that might work is to do all the uninstalls first, then the installs. And keep everything as one transaction, so a failed install in one package can back everything out cleanly. But my gut feeling is that this would be quite a major change.
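To make the "all uninstalls first, then installs, as one transaction" idea a bit more concrete, here is a minimal sketch. It works on an in-memory model rather than real files, and `run_transaction`, the dict-based filesystem, and the snapshot-based rollback are all hypothetical names and simplifications, not anything pip provides today.

```python
import copy

def run_transaction(fs, records, to_remove, to_install):
    """Uninstall everything first, then install; restore state on failure.

    fs maps path -> owning package, records maps package -> set of paths,
    to_install is a list of (package, paths) pairs. Hypothetical model only.
    """
    snapshot = (dict(fs), copy.deepcopy(records))
    try:
        for package in to_remove:                  # phase 1: every removal
            for path in records.pop(package):
                fs.pop(path, None)
        for package, paths in to_install:          # phase 2: every install
            for path in paths:
                if path in fs:
                    raise RuntimeError(f"{path} is already owned by {fs[path]}")
                fs[path] = package
            records[package] = set(paths)
    except Exception:
        # Back everything out so a failed install leaves the old state intact.
        fs.clear()
        fs.update(snapshot[0])
        records.clear()
        records.update(snapshot[1])
        raise

if __name__ == "__main__":
    fs = {"bin/ansible": "ansible-2.9.10", "ansible/__init__.py": "ansible-2.9.10"}
    records = {"ansible-2.9.10": set(fs)}
    run_transaction(
        fs,
        records,
        to_remove=["ansible-2.9.10"],
        to_install=[
            ("ansible-base-2.10.0b1", ["bin/ansible", "ansible/__init__.py"]),
            ("ansible-2.10.0a1", ["ansible_collections/__init__.py"]),
        ],
    )
    print(fs["bin/ansible"])  # prints ansible-base-2.10.0b1
```

Because the old ansible files are gone before ansible-base is written, nothing that ansible-base installs is deleted afterwards. A real implementation would have to stash files on disk instead of copying a dict, which is a large part of why this is a major change.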
Hello all, a quick FYI - the current way to reproduce this bug is
Because of ansible/ansible#70529
I think the first step here would be to get pip to error out on file conflicts. This seems fairly easy, just read RECORD and check if any of the files are already on the system. @pfmoore is this still difficult to integrate given pip’s architecture?
Cool. So yes, we should be able to add a simple transaction concept where uninstall of everything in the transaction is the first step and then install is the second step. (I say simple in comparison to a system package manager’s transaction concept, but I do acknowledge that it’s a big change from what pip currently does.)
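For the RECORD-based conflict check proposed in this thread, a rough sketch might look like the following. It relies only on the fact that a wheel's RECORD file is CSV with the file path in the first column; `recorded_paths`, `find_conflicts`, and the sample RECORD contents are hypothetical and not part of pip.

```python
import csv
import tempfile
from pathlib import Path

def recorded_paths(record_text):
    """Return the file paths listed in the text of a wheel's RECORD file."""
    return [row[0] for row in csv.reader(record_text.splitlines()) if row]

def find_conflicts(record_text, install_root):
    """List recorded paths that already exist under install_root."""
    root = Path(install_root)
    return [p for p in recorded_paths(record_text) if (root / p).exists()]

def demo():
    # Made-up RECORD contents and a throwaway directory standing in for
    # site-packages; the hashes and sizes are placeholders.
    record = (
        "ansible/__init__.py,sha256=abc,0\n"
        "ansible/release.py,sha256=def,12\n"
    )
    with tempfile.TemporaryDirectory() as site_packages:
        existing = Path(site_packages, "ansible", "__init__.py")
        existing.parent.mkdir(parents=True)
        existing.touch()
        return find_conflicts(record, site_packages)

if __name__ == "__main__":
    print(demo())  # prints ['ansible/__init__.py']
```

As noted elsewhere in the thread, a check like this cannot simply reject every hit, since pkgutil- and pkg_resources-style namespace packages legitimately share files. It would also need to ignore files owned by the version being replaced.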