pip: Sorting TypeError in move_wheel_files() during install (e.g. Poetry)

Environment

  • pip version: Latest
  • Python version: 3.6.2
  • OS: Fedora 25

Description

I’m getting the exception:

TypeError: '<' not supported between instances of 'int' and 'str'

when attempting to install Poetry (example Dockerfile and full stack trace below). In pip, move_wheel_files calls sorted(outrows), which sorts a list of tuples. The third column of those tuples can be either an int or a string, so the comparison can fail:

>>> sorted((('','',''),('','',1)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'int' and 'str'

The relevant code in pip:

            outrows = []
            for row in reader:
                row[0] = installed.pop(row[0], row[0])
                if row[0] in changed:
                    row[1], row[2] = rehash(row[0])
                outrows.append(tuple(row))
            for f in generated:
                digest, length = rehash(f)
                outrows.append((normpath(f, lib_dir), digest, length))
            for f in installed:
                outrows.append((installed[f], '', ''))
            for row in sorted(outrows):
                writer.writerow(row)

As you can see, the for f in installed loop always places a string in the third column of the tuple. The paths that go through rehash, however, put length in the third column, and rehash always returns it as an integer:

def rehash(path, blocksize=1 << 20):
    """Return (hash, length) for path using hashlib.sha256()"""
    h = hashlib.sha256()
    length = 0
    with open(path, 'rb') as f:
        for block in read_chunks(f, size=blocksize):
            length += len(block)
            h.update(block)
    digest = 'sha256=' + urlsafe_b64encode(
        h.digest()
    ).decode('latin1').rstrip('=')
    return (digest, length)

Expected behavior

pip should not raise an error when sorting these rows.
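
As an illustration of the expected behavior, here is a minimal sketch of one way to sort rows that mix int and str lengths: compare every field as a string. The helper name sorted_rows and the sample rows are hypothetical and are not taken from pip's source.

```python
def sorted_rows(outrows):
    """Sort RECORD-style rows, comparing every field by its string form.

    Hypothetical helper for illustration; not pip's actual code.
    """
    return sorted(outrows, key=lambda row: tuple(str(field) for field in row))


# Placeholder rows: two share a path and hash but differ in length type.
outrows = [
    ("pkg/__init__.py", "", ""),
    ("pkg.dist-info/INSTALLER", "sha256=abc", 4),    # int length, as from rehash()
    ("pkg.dist-info/INSTALLER", "sha256=abc", "4"),  # str length, as read from RECORD
]

result = sorted_rows(outrows)  # no TypeError, rows keep their original types
```

The key only affects ordering, so the rows written out are unchanged. One trade-off is that lengths are ordered lexically rather than numerically, which should only matter when path and hash are already identical.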

How to Reproduce

Example Dockerfile that reproduces the issue:

FROM fedora:25

RUN dnf -y update
RUN dnf -y install python36
RUN bash -c "curl https://bootstrap.pypa.io/get-pip.py | python3.6"
RUN bash -c "curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python3.6"

Output

$ docker build -t poetry_fail .
Sending build context to Docker daemon    554kB
Step 1/5 : FROM fedora:25
 ---> 9cffd21a45e3
Step 2/5 : RUN dnf -y update
 ---> Using cache
 ---> 9afadb458128
Step 3/5 : RUN dnf -y install python36
 ---> Using cache
 ---> 7291182779ff
Step 4/5 : RUN bash -c "curl https://bootstrap.pypa.io/get-pip.py | python3.6"
 ---> Using cache
 ---> debf1591f40a
Step 5/5 : RUN bash -c "curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python3.6"
 ---> Running in 04bb671e7e1e
Retrieving metadata

Installing version: 0.11.5
  - Getting dependencies
  - Vendorizing dependencies
  - Installing poetry
An error has occured: Command '('/usr/bin/python3.6', '-m', 'pip', 'install', '--upgrade', '--no-deps', '/tmp/poetry-installer-t434oaug/poetry-0.11.5-py2.py3-none-any.whl')' returned non-zero exit status 2.
Processing /tmp/poetry-installer-t434oaug/poetry-0.11.5-py2.py3-none-any.whl
Installing collected packages: poetry
Exception:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 143, in main
    status = self.run(options, args)
  File "/usr/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 366, in run
    use_user_site=options.use_user_site,
  File "/usr/lib/python3.6/site-packages/pip/_internal/req/__init__.py", line 49, in install_given_reqs
    **kwargs
  File "/usr/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 760, in install
    use_user_site=use_user_site, pycompile=pycompile,
  File "/usr/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 382, in move_wheel_files
    warn_script_location=warn_script_location,
  File "/usr/lib/python3.6/site-packages/pip/_internal/wheel.py", line 514, in move_wheel_files
    for row in sorted(outrows):
TypeError: '<' not supported between instances of 'int' and 'str'

The command '/bin/sh -c bash -c "curl -sSL https://raw.githubusercontent.com/sdispater/poetry/master/get-poetry.py | python3.6"' returned a non-zero code: 2

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 9
  • Comments: 25 (13 by maintainers)

Most upvoted comments

Ah, I think I know why this happens for Poetry but not for anyone else. Python compares tuples item by item and stops at the first pair of items that differ. The first two items in each record entry are the file name and the hash, both guaranteed to be strings. In most situations there is only one row per file (for obvious reasons), so the sort never reaches the third item (length). The bad Poetry wheel, however, contains two additional entries: poetry-0.11.5.dist-info/INSTALLER and ../../Scripts/poetry.exe. They collide with the two rows generated dynamically by move_wheel_files()[1], having the same file names and hashes, so Python falls through to the third item, which causes the exception.

[1]: The latter only conflicts on Windows, of course.
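
To make the tie-breaking behavior concrete, here is a small sketch. The file names, hashes, and lengths are placeholders rather than values from the actual Poetry wheel.

```python
# Tuples compare item by item; later items are consulted only on a tie.
rows_distinct = [("b.py", "sha256=bbb", 10), ("a.py", "", "")]
sorted_distinct = sorted(rows_distinct)  # decided by the file name alone

# Same path and same hash: the sort must compare the third items,
# and an int cannot be ordered against a str in Python 3.
rows_tied = [
    ("pkg.dist-info/INSTALLER", "sha256=abc", "4"),  # str length, as read from RECORD
    ("pkg.dist-info/INSTALLER", "sha256=abc", 4),    # int length, as returned by rehash()
]
try:
    sorted(rows_tied)
    tie_failed = False
except TypeError:
    tie_failed = True
```

The first sort succeeds even though the third items have different types, because the file names already decide the order. Only the second list raises TypeError.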

I think the short-term solution would be for Poetry to somehow avoid adding those conflicting rows to RECORD. But the wheel format specification does not seem to prohibit those entries from existing, so either the PEP needs amendment, or pip needs to deal with potential row conflicts. Or maybe both should be done, since I don’t really think the ../../Scripts/poetry.exe row should be valid anyway, for security reasons (I don’t think a wheel should be able to write outside its installation root).
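
As a rough sketch of what "dealing with row conflicts" could look like, the helper below flags duplicate paths and paths that point outside the installation root. The function check_record_rows is hypothetical (it is not part of pip or of the wheel specification), and the hashes and lengths in the sample rows are placeholders.

```python
import posixpath
from collections import Counter


def check_record_rows(rows):
    """Return (duplicate_paths, escaping_paths) for RECORD-style rows.

    `rows` is an iterable of (path, hash, length) tuples. Hypothetical
    helper for illustration only.
    """
    # Normalize separators and collapse "a/../b" style segments.
    paths = [posixpath.normpath(row[0].replace("\\", "/")) for row in rows]
    counts = Counter(paths)
    duplicates = sorted(p for p, n in counts.items() if n > 1)
    # After normalization, a relative path escapes the root only if it
    # still starts with ".."; absolute paths are flagged as well.
    escaping = sorted(
        p for p in counts
        if p == ".." or p.startswith("../") or posixpath.isabs(p)
    )
    return duplicates, escaping


rows = [
    ("poetry/__init__.py", "sha256=aaa", "120"),
    ("poetry-0.11.5.dist-info/INSTALLER", "sha256=bbb", "4"),
    ("poetry-0.11.5.dist-info/INSTALLER", "sha256=bbb", "4"),
    ("../../Scripts/poetry.exe", "sha256=ccc", "65536"),
]
duplicates, escaping = check_record_rows(rows)
```

Whether such entries should be rejected, deduplicated, or merely warned about is a policy decision that this sketch does not take.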

By the way, the bad Poetry wheel also contains a lot of .pyc files (and entries for them in RECORD). It really shouldn’t.

I posted a fix for this here: https://github.com/pypa/pip/pull/5883

Author of Poetry here!

I would like to point out that the bad wheel is generated by the custom installer that Poetry uses (https://github.com/sdispater/poetry/blob/master/get-poetry.py). That installer uses a non-standard (and, may I say, bad) way to build the wheel that installs Poetry for the current Python version. This has since been fixed on the develop branch with the implementation of a new installer (https://github.com/sdispater/poetry/pull/378), which will be released with the next version, 0.12.0.

Note that the wheels uploaded to PyPI are not affected and are proper, standard wheels.

Sorry about the confusion.