pip: Sorting TypeError in move_wheel_files() during install (e.g. Poetry)
Environment
- pip version: Latest
- Python version: 3.6.2
- OS: Fedora 25
Description
I’m getting the exception:
TypeError: '<' not supported between instances of 'int' and 'str'
when attempting to install poetry (example dockerfile and full stacktrace below).
Looking at the pip source, move_wheel_files calls sorted(outrows), which sorts a list of tuples. The third column of those tuples can apparently be either an int or a string, which is a bug:
>>> sorted((('','',''),('','',1)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'str'
The relevant code in pip:
outrows = []
for row in reader:
    row[0] = installed.pop(row[0], row[0])
    if row[0] in changed:
        row[1], row[2] = rehash(row[0])
    outrows.append(tuple(row))
for f in generated:
    digest, length = rehash(f)
    outrows.append((normpath(f, lib_dir), digest, length))
for f in installed:
    outrows.append((installed[f], '', ''))
for row in sorted(outrows):
    writer.writerow(row)
As you can see, the for f in installed loop always places a string in the third column of the tuple, whereas the paths that go through rehash put length in the third column, which, judging by the code for rehash, is always an integer:
def rehash(path, blocksize=1 << 20):
    """Return (hash, length) for path using hashlib.sha256()"""
    h = hashlib.sha256()
    length = 0
    with open(path, 'rb') as f:
        for block in read_chunks(f, size=blocksize):
            length += len(block)
            h.update(block)
    digest = 'sha256=' + urlsafe_b64encode(
        h.digest()
    ).decode('latin1').rstrip('=')
    return (digest, length)
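The mismatch described above can be reproduced outside pip, and one possible way around it is to sort on a key that converts every field to a string. The following is only a minimal sketch under stated assumptions: the rows and digest values are made-up placeholders, and sorted_rows is a hypothetical helper name, not necessarily what pip does or should do.

```python
# Rows shaped like RECORD entries: (path, digest, length).
# All values here are made-up placeholders for illustration.
outrows = [
    # Row as read back from a RECORD file: csv yields strings only.
    ("pkg-1.0.dist-info/INSTALLER", "sha256=abc", "4"),
    # Row as produced from rehash(): the length is an int.
    ("pkg-1.0.dist-info/INSTALLER", "sha256=abc", 4),
    # Row for a file with no hash or length recorded.
    ("pkg/__init__.py", "", ""),
]


def sorted_rows(rows):
    """Hypothetical helper: sort rows using string-converted fields as the key.

    The rows themselves are returned unchanged; only the comparison key
    is normalized, so an int is never compared with a str.
    """
    return sorted(rows, key=lambda row: tuple(str(field) for field in row))


# The plain sort fails because two rows tie on path and digest,
# so the int length gets compared with the str length.
try:
    sorted(outrows)
    plain_sort_failed = False
except TypeError:
    plain_sort_failed = True

result = sorted_rows(outrows)
```

Normalizing only the sort key keeps the written rows exactly as they were, which matters if a later step depends on the original field types.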
Expected behavior
The install should not error out when sorting the rows.
How to Reproduce
Example dockerfile that reproduces the issue:
FROM fedora:25
RUN dnf -y update
RUN dnf -y install python36
RUN bash -c "curl https://bootstrap.pypa.io/get-pip.py | python3.6"
RUN bash -c "curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python3.6"
Output
$ docker build -t poetry_fail .
Sending build context to Docker daemon 554kB
Step 1/5 : FROM fedora:25
---> 9cffd21a45e3
Step 2/5 : RUN dnf -y update
---> Using cache
---> 9afadb458128
Step 3/5 : RUN dnf -y install python36
---> Using cache
---> 7291182779ff
Step 4/5 : RUN bash -c "curl https://bootstrap.pypa.io/get-pip.py | python3.6"
---> Using cache
---> debf1591f40a
Step 5/5 : RUN bash -c "curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python3.6"
---> Running in 04bb671e7e1e
Retrieving metadata
Installing version: 0.11.5
- Getting dependencies
- Vendorizing dependencies
- Installing poetry
An error has occured: Command '('/usr/bin/python3.6', '-m', 'pip', 'install', '--upgrade', '--no-deps', '/tmp/poetry-installer-t434oaug/poetry-0.11.5-py2.py3-none-any.whl')' returned non-zero exit status 2.
Processing /tmp/poetry-installer-t434oaug/poetry-0.11.5-py2.py3-none-any.whl
Installing collected packages: poetry
Exception:
Traceback (most recent call last):
File "/usr/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 143, in main
status = self.run(options, args)
File "/usr/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 366, in run
use_user_site=options.use_user_site,
File "/usr/lib/python3.6/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
**kwargs
File "/usr/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 760, in install
use_user_site=use_user_site, pycompile=pycompile,
File "/usr/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 382, in move_wheel_files
warn_script_location=warn_script_location,
File "/usr/lib/python3.6/site-packages/pip/_internal/wheel.py", line 514, in move_wheel_files
for row in sorted(outrows):
TypeError: '<' not supported between instances of 'int' and 'str'
The command '/bin/sh -c bash -c "curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python3.6"' returned a non-zero code: 2
About this issue
- State: closed
- Created 6 years ago
- Reactions: 9
- Comments: 25 (13 by maintainers)
Commits related to this issue
- Fix issue #5868: TypeError in move_wheel_files(). — committed to cjerdonek/pip by cjerdonek 6 years ago
- Fix #5868: TypeError in move_wheel_files(). (#5883) — committed to pypa/pip by cjerdonek 6 years ago
- Merge #1723 1723: Scheduled weekly dependency update for week 05 r=mythmon a=pyup-bot ### Update [atomicwrites](https://pypi.org/project/atomicwrites) from **1.2.1** to **1.3.0**. *The bot wa... — committed to mozilla/normandy by bors[bot] 5 years ago
- Scheduled monthly dependency update for February (#22) ### Update [pip](https://pypi.org/project/pip) from **18.1** to **19.0.1**. <details> <summary>Changelog</summary> #... — committed to vilkasgroup/Pakettikauppa by pyup-bot 5 years ago
Ah, I think I know why this happens for Poetry but not for anyone else. Python sorts tuples by comparing items one by one and stops at the first pair that differs. The first two items in each record entry are the file name and hash, both guaranteed to be strings. In most situations there is only one row per file (for obvious reasons), so the sort never needs to look at the third item (length). The bad Poetry wheel, however, contains two additional entries: poetry-0.11.5.dist-info/INSTALLER and ../../Scripts/poetry.exe. They conflict with the two rows generated dynamically by move_wheel_files() [1], having the same file names and hashes, which forces Python to compare the third item and causes the exception.
[1]: The latter only conflicts on Windows, of course.
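To illustrate the item-by-item comparison described in this comment, here is a small sketch. The tuples are made-up stand-ins for record rows, not data from the actual wheel.

```python
# Made-up rows in (path, digest, length) form.
row_a = ("a.py", "sha256=x", 10)
row_b = ("b.py", "sha256=y", "")

# The paths differ, so the comparison is decided at the first item
# and the mismatched third items (int vs. str) are never compared.
decided_early = row_a < row_b

# Same path and digest as row_a, but the length is a string.
row_c = ("a.py", "sha256=x", "10")

# The first two items tie, so Python falls through to the third item
# and has to evaluate 10 < "10", which raises TypeError on Python 3.
try:
    row_a < row_c
    tie_raises = False
except TypeError:
    tie_raises = True
```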
I think the short-term solution would be for Poetry to somehow avoid adding those conflicting rows to RECORD. But the wheel format specification does not seem to prohibit those entries, so either the PEP needs amending or pip needs to deal with potential row conflicts. Or maybe both should be done, since I don’t really think the ../../Scripts/poetry.exe row should be valid anyway, for security reasons (I don’t think a wheel should be able to write outside its installation root).
By the way, the bad Poetry wheel also contains a lot of pyc files (and entries for them in RECORD). It really shouldn’t.
I posted a fix for this here: https://github.com/pypa/pip/pull/5883
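On the security remark above about rows such as ../../Scripts/poetry.exe: a check for record paths that point outside the installation root might look roughly like the sketch below. This is an illustration only. The function names and the sample RECORD text are hypothetical, the sketch assumes paths use forward slashes, and it is not taken from pip or from the wheel specification.

```python
import csv
import io
import posixpath


def escapes_root(path):
    """Return True if a relative record path resolves outside its root.

    Assumes forward-slash paths, as in the entries quoted above.
    """
    norm = posixpath.normpath(path)
    return norm == ".." or norm.startswith("../") or posixpath.isabs(norm)


def find_escaping_paths(record_text):
    """Hypothetical helper: list record paths that escape the root."""
    rows = csv.reader(io.StringIO(record_text))
    return [row[0] for row in rows if row and escapes_root(row[0])]


# Made-up RECORD content; digests and lengths are placeholders.
sample_record = (
    "poetry/__init__.py,sha256=aaa,12\n"
    "poetry-0.11.5.dist-info/INSTALLER,sha256=bbb,4\n"
    "../../Scripts/poetry.exe,sha256=ccc,100\n"
)

escaping = find_escaping_paths(sample_record)
```

Whether such rows should be rejected, ignored, or merely reported is a policy question that the sketch leaves open.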
Author of Poetry here!
I would like to point out that the bad wheel is generated by the custom installer that Poetry uses (https://github.com/sdispater/poetry/blob/master/get-poetry.py), which uses a non-standard (and, may I say, bad) way to build the wheel used to install Poetry for the current Python version. This has since been fixed on the develop branch with the implementation of a new installer (https://github.com/sdispater/poetry/pull/378), which will be released with the next 0.12.0 version.
Note that the wheels uploaded to PyPI are not affected and are proper, standard wheels.
Sorry about the confusion.