pip: UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 117: ordinal not in range(128)

Description

I really don’t know what’s wrong, but this started happening with the 21.2.1 release (https://github.com/aminvakil/docker-ansible/runs/3153704426?check_suite_focus=true) and worked just fine up to the 21.1.3 release (https://github.com/aminvakil/docker-ansible/runs/3148228248?check_suite_focus=true).

This happens when running pip3 install ansible system-wide. The build runs on centos8, centos7, debian10, debian11, ubuntu20.04, ubuntu18.04, fedora34, fedora33 and fedora32, and only broke on centos7 and ubuntu18.04 (please see https://github.com/aminvakil/docker-ansible/runs/3148228248?check_suite_focus=true).


Current workaround:

  • Ubuntu 18.04: https://github.com/pypa/pip/issues/10219#issuecomment-887337037
  • CentOS 7: https://github.com/pypa/pip/issues/10219#issuecomment-888127061


Expected behavior

Ansible gets installed.

pip version

21.2.1

Python version

3.6.7

OS

Ubuntu 18.04

How to Reproduce

Try building this Dockerfile:

FROM ubuntu:bionic

LABEL maintainer="Amin Vakil <info@aminvakil.com>"

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
       python3-setuptools \
       python3-pip \
       python3-wheel \
       systemd \
       sudo \
    && rm -Rf /var/lib/apt/lists/* \
    && rm -Rf /usr/share/doc && rm -Rf /usr/share/man \
    && apt-get clean

RUN pip3 install --upgrade pip && pip3 install ansible

ENTRYPOINT ["/bin/systemd"]

or

FROM centos:7

LABEL maintainer="Amin Vakil <info@aminvakil.com>"

ENV container docker
RUN (cd /lib/systemd/system/sysinit.target.wants/; for i in *; do [ $i == \
systemd-tmpfiles-setup.service ] || rm -f $i; done); \
rm -f /lib/systemd/system/multi-user.target.wants/*;\
rm -f /etc/systemd/system/*.wants/*;\
rm -f /lib/systemd/system/local-fs.target.wants/*; \
rm -f /lib/systemd/system/sockets.target.wants/*udev*; \
rm -f /lib/systemd/system/sockets.target.wants/*initctl*; \
rm -f /lib/systemd/system/basic.target.wants/*;\
rm -f /lib/systemd/system/anaconda.target.wants/*;

RUN yum -y install python3-pip sudo && yum clean all

RUN pip3 install --upgrade pip && pip3 install ansible

VOLUME ["/sys/fs/cgroup"]

CMD ["/usr/sbin/init"]

Output

CentOS 7:

Step 6/9 : RUN pip3 install git+https://github.com/uranusjr/pip@locations-linux-system
 ---> Running in d8cf176851cb
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting git+https://github.com/uranusjr/pip@locations-linux-system
  Cloning https://github.com/uranusjr/pip (to locations-linux-system) to /tmp/pip-zfmsy5d7-build
Installing collected packages: pip
  Running setup.py install for pip: started
    Running setup.py install for pip: finished with status 'done'
Successfully installed pip-21.3.dev0
Removing intermediate container d8cf176851cb
 ---> b135aa678bda
Step 7/9 : RUN pip3 install ansible
 ---> Running in 9ce8bf831db5
Collecting ansible
  Downloading ansible-4.3.0.tar.gz (35.1 MB)
Collecting ansible-core<2.12,>=2.11.3
  Downloading ansible-core-2.11.3.tar.gz (6.8 MB)
ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 173, in _main
    status = self.run(options, args)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/cli/req_command.py", line 203, in wrapper
    return func(self, options, args)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 316, in run
    reqs, check_supported_wheels=not options.target_dir
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 95, in resolve
    collected.requirements, max_rounds=try_to_avoid_resolution_too_deep
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 472, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 366, in resolve
    failure_causes = self._attempt_to_pin_criterion(name)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 212, in _attempt_to_pin_criterion
    criteria = self._get_updated_criteria(candidate)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 203, in _get_updated_criteria
    self._add_to_criteria(criteria, requirement, parent=candidate)
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
    if not criterion.candidates:
  File "/usr/local/lib/python3.6/site-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
    return bool(self._sequence)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 140, in __bool__
    return any(self)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 128, in <genexpr>
    return (c for c in iterator if id(c) not in self._incompatible_ids)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 32, in _iter_built
    candidate = func()
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 209, in _make_candidate_from_link
    version=version,
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 301, in __init__
    version=version,
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 156, in __init__
    self.dist = self._prepare()
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 227, in _prepare
    dist = self._prepare_distribution()
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 306, in _prepare_distribution
    self._ireq, parallel_builds=True
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 508, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 552, in _prepare_linked_requirement
    self.download_dir, hashes
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 249, in unpack_url
    unpack_file(file.path, location, file.content_type)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/utils/unpacking.py", line 256, in unpack_file
    untar_file(filename, location)
  File "/usr/local/lib/python3.6/site-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
    with open(path, "wb") as destfp:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 117: ordinal not in range(128)
The command '/bin/sh -c pip3 install ansible' returned a non-zero code: 2

Ubuntu 18.04:

Step 4/6 : RUN pip3 install git+https://github.com/uranusjr/pip@locations-linux-system
 ---> Running in 2e8524091c5a
Collecting git+https://github.com/uranusjr/pip@locations-linux-system
  Cloning https://github.com/uranusjr/pip (to locations-linux-system) to /tmp/pip-o235zpbf-build
Installing collected packages: pip
  Found existing installation: pip 9.0.1
    Not uninstalling pip at /usr/lib/python3/dist-packages, outside environment /usr
  Running setup.py install for pip: started
    Running setup.py install for pip: finished with status 'done'
Successfully installed pip-21.3.dev0
Removing intermediate container 2e8524091c5a
 ---> 2b5d0d0702ea
Step 5/6 : RUN pip3 install ansible
 ---> Running in 4419ef47a4eb
Collecting ansible
  Downloading ansible-4.3.0.tar.gz (35.1 MB)
Collecting ansible-core<2.12,>=2.11.3
  Downloading ansible-core-2.11.3.tar.gz (6.8 MB)
ERROR: Exception:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/base_command.py", line 173, in _main
    status = self.run(options, args)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/cli/req_command.py", line 203, in wrapper
    return func(self, options, args)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/commands/install.py", line 316, in run
    reqs, check_supported_wheels=not options.target_dir
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/resolver.py", line 95, in resolve
    collected.requirements, max_rounds=try_to_avoid_resolution_too_deep
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 472, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 366, in resolve
    failure_causes = self._attempt_to_pin_criterion(name)
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 212, in _attempt_to_pin_criterion
    criteria = self._get_updated_criteria(candidate)
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 203, in _get_updated_criteria
    self._add_to_criteria(criteria, requirement, parent=candidate)
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
    if not criterion.candidates:
  File "/usr/local/lib/python3.6/dist-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
    return bool(self._sequence)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 140, in __bool__
    return any(self)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 128, in <genexpr>
    return (c for c in iterator if id(c) not in self._incompatible_ids)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 32, in _iter_built
    candidate = func()
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/factory.py", line 209, in _make_candidate_from_link
    version=version,
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 301, in __init__
    version=version,
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 156, in __init__
    self.dist = self._prepare()
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 227, in _prepare
    dist = self._prepare_distribution()
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/resolution/resolvelib/candidates.py", line 306, in _prepare_distribution
    self._ireq, parallel_builds=True
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 508, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 552, in _prepare_linked_requirement
    self.download_dir, hashes
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/operations/prepare.py", line 249, in unpack_url
    unpack_file(file.path, location, file.content_type)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/utils/unpacking.py", line 256, in unpack_file
    untar_file(filename, location)
  File "/usr/local/lib/python3.6/dist-packages/pip/_internal/utils/unpacking.py", line 226, in untar_file
    with open(path, "wb") as destfp:
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 117: ordinal not in range(128)
The command '/bin/sh -c pip3 install ansible' returned a non-zero code: 2

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 7
  • Comments: 31 (17 by maintainers)

Most upvoted comments

I can confirm adding

ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8

fixes Ubuntu 18 Dockerfile.

For CentOS 7 this worked: https://github.com/pypa/pip/issues/10219#issuecomment-888127061

@hswong3i @rayfordj Please do not mention issues in commit messages, especially from someone else’s repository (or if you do, at least please don’t rebase it). Everyone in this thread gets notified whenever you add a commit (see the long list of back references above?) and it is extremely annoying.

fixes Ubuntu 18 Dockerfile, but not CentOS 7 Dockerfile

Is there a way of fixing this in a centos 7 dockerfile?

ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8

The thing is, open is a built-in function and accepting a str is one of its fundamental operations, so if anyone should be smart enough to set the encoding to utf-8, it’s Python. But it’s not, and I assume there are good reasons for that.

IMO the best pip can do is to catch this exception and show a more friendly error message suggesting those environment settings. In a Docker container, setting the LC environment variables is likely the best way to go (this is by no means specific to pip or even Python; the default C and POSIX locales in containers trip up a lot of people everywhere [1][2][3][4]), while on some other platforms PYTHONIOENCODING could be better.

No, I don’t think so. I thought the consensus was that it should be fixed in upstream Python, and indeed it is for Python 3.7+. And as pip is likely to drop support for 3.6 in a few months anyway, the workaround of setting LANG/LC_ALL is sufficient in the interim.

Submit a PR if you want, but don’t be surprised if it doesn’t get much interest…

Is anyone already working on a patch for this? Asking as I am considering doing it myself. Too many broken CI jobs caused by it and I think that fixing it in pip would benefit everyone.

A few notes so far:

  • Bug is easier to reproduce with ansible-core package which is far smaller
  • Unable to reproduce the bug on macos at all
  • Interestingly even on platform where bug does reproduce, sys.getdefaultencoding() still reports utf-8, which makes the bug even more “interesting”.
  • Altering only PYTHONIOENCODING does not work
  • Altering only LANG does not work either
  • Altering only LC_ALL=en_US.UTF-8 works, but the more generic LC_ALL=C.UTF-8 does not work, at least on CentOS 7, where the shell complains with bash: warning: setlocale: LC_ALL: cannot change locale (C.UTF-8)
  • Changing the locale to en_US will almost certainly have a negative impact on systems that use other locales, and the C approach does not seem possible on all platforms.
  • Altering locale settings on systems introduces serious regression risks, which can affect how other programs function in ways that are not directly visible. Yep, you could end up paying a very high price for a simple hack made for pip.
  • Requiring all py36 users to alter LC_ALL/LANG totally disregards the reality and the scale of the problem. For simple users changing this is no real problem, but what about all the projects that make use of deployed systems with python, system images, docker containers, virtual environments, and other tools that call pip in isolation, like tox? To overcome this regression we put a huge burden on the entire pip user community.
  • This bug alone could be used by system packagers as a good example of why they do not want to allow users to upgrade pip, or why they delay repackaging newer versions of pip on their platforms. I used to “blame” them for keeping ancient versions of pip in their distributions, but looking at this, maybe they have a good excuse.

As long as pip still reports py36 as supported in its metadata, it must ensure it works with it. We all know py36 will not be fixed, but we are aware that this bug is caused by a critical regression introduced by the 21.2.0 release.

I would say that this regression is so big that it should have been a good reason to yank the entire 21.2.x line until extra tests are put in place. If pip had been tested by installing enough packages, it would have been caught before the release. If there is no desire from the pip maintainers to support py36, even though Python 3.6 itself reaches EOL only 4 months from now, the 21.2.x release should have used python_requires>='3.7' instead.

Now going back to some constructive measures, I am a bit confused by https://github.com/pypa/pip/blob/main/src/pip/_internal/utils/unpacking.py#L218: where does the ascii encoding problem come from? I see the file opened as binary, not text. Where exactly is something being encoded as ascii?
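On where the ASCII step comes from: the "wb" mode only governs the file's contents. The path itself is a str, and on POSIX Python has to turn it into bytes before calling the OS, using the filesystem encoding with the surrogateescape error handler (the same conversion os.fsencode performs). Under a C/POSIX locale on Python 3.6 that encoding is ASCII, so a member name containing 'é' fails before any data is written. The sketch below mimics that step by hand; encode_path and the file name are made up for illustration.

```python
def encode_path(name, fs_encoding):
    """Mimic how a str path is converted to bytes on POSIX."""
    return name.encode(fs_encoding, "surrogateescape")


name = "caf\xe9.txt"  # made-up archive member name containing 'é'

# With a UTF-8 filesystem encoding the name encodes without trouble.
utf8_bytes = encode_path(name, "utf-8")

# With an ASCII filesystem encoding (C locale on Python 3.6) the same
# kind of error as in the traceback is raised.
try:
    encode_path(name, "ascii")
except UnicodeEncodeError as exc:
    ascii_error = str(exc)
else:
    ascii_error = None

print(utf8_bytes)
print(ascii_error)
```

The surrogateescape handler only helps with undecodable bytes that were read from the OS earlier; it does not make a genuine non-ASCII character such as 'é' encodable in ASCII.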

On CentOS 7 this worked for me:

localedef -c -f UTF-8 -i en_US en_US.UTF-8
export LC_ALL=en_US.UTF-8

For Ubuntu 14.04, I had to do the following in order to install ansible on py3.6:

sudo locale-gen en_US.UTF-8
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
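Both workarounds generate the locale before exporting it, because pointing LC_ALL at a locale that is not installed does not take effect (the CentOS 7 warning about C.UTF-8 quoted earlier is that case). A small sketch for checking availability from Python. locale_available is a hypothetical helper, and it relies on locale.setlocale raising locale.Error for names the C library does not know.

```python
import locale


def locale_available(name):
    """Return True if the C library accepts `name` for LC_CTYPE."""
    previous = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    finally:
        # Always restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, previous)
    return True


if __name__ == "__main__":
    for candidate in ("C", "C.UTF-8", "en_US.UTF-8"):
        print(candidate, locale_available(candidate))
```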

tox now passes this through as of https://pypi.org/project/tox/3.24.2/

fixes Ubuntu 18 Dockerfile, but not CentOS 7 Dockerfile

Is there a way of fixing this in a centos 7 dockerfile?

You can also try upgrading to Python 3.9.0 - worked for me.

That’s interesting, thanks for the additional information. I was wondering whether there’s a PEP on that (but missed PEP 538 and only found one for Windows; didn’t realise it has already been implemented either).

I guess we could do something to backport the modern encoding handling logic in pip, but given that Python 3.6 is going EOL this year and presumably pip will be dropping it in 22.0 anyway, the benefit seems limited for pip maintainers. So I’m inclined to leave this for downstream distributors to worry about, and wait for them to upstream their efforts if they are inclined to 🙂