pip: pip doesn't read setup.cfg as UTF-8, which causes UnicodeDecodeError

Environment

  • pip version: 20.2.3
  • Python version: 3.8.5
  • OS: Win7x64

Description

When pip is run from a directory that contains a setup.cfg, pip reads that file but fails to decode it if the file contains non-ASCII characters. The cfg file is valid UTF-8, but pip doesn’t open it as UTF-8. It tries to use “GBK” (my locale encoding) and fails.
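
The decoding failure can be reproduced without pip. This is a minimal sketch, assuming the file is saved as UTF-8 with "\n" line endings; the helper name try_decode is only for illustration and is not part of pip.

```python
# The setup.cfg contents from this report, as they would be stored on disk.
CFG_TEXT = (
    "[metadata]\n"
    "name = test\n"
    "version = 0.0.1\n"
    "packages =\n"
    "    test\n"
    "description = '計算ツール'\n"
)
raw = CFG_TEXT.encode("utf-8")


def try_decode(data: bytes, encoding: str) -> str:
    """Return a short status string instead of raising (illustrative helper)."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc.reason}"
    return "ok"


# UTF-8 round-trips; GBK rejects the byte sequence of the Japanese text.
print("utf-8:", try_decode(raw, "utf-8"))
print("gbk:  ", try_decode(raw, "gbk"))
```

The same bytes are fine as UTF-8 and invalid as GBK, so the outcome depends only on which encoding is used to open the file.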

Related, but not the same issue: #8717 (that one is caused by pyvenv.cfg)

Expected behavior

It should read setup.cfg in UTF-8.
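
For illustration, configparser (the module at the bottom of the traceback) accepts an explicit encoding in ConfigParser.read. The sketch below is not pip's or distutils' code; read_description is a hypothetical helper that writes the file from this report to a temporary directory and parses it with a chosen encoding.

```python
import configparser
import os
import tempfile

CFG_TEXT = (
    "[metadata]\n"
    "name = test\n"
    "version = 0.0.1\n"
    "packages =\n"
    "    test\n"
    "description = '計算ツール'\n"
)


def read_description(encoding: str) -> str:
    """Write setup.cfg as UTF-8, then parse it with the given encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "setup.cfg")
        with open(path, "w", encoding="utf-8", newline="\n") as fh:
            fh.write(CFG_TEXT)
        parser = configparser.ConfigParser()
        # An explicit encoding bypasses the locale default used otherwise.
        parser.read(path, encoding=encoding)
        return parser["metadata"]["description"]


# Parsing as UTF-8 succeeds; passing "gbk" instead raises UnicodeDecodeError.
description = read_description("utf-8")
```

Calling read_description("gbk") simulates the GBK locale from this report on any machine.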

How to Reproduce

  1. Open a directory.
  2. Create a setup.cfg file in it with the following contents:
[metadata]
name = test
version = 0.0.1
packages =
    test
description = '計算ツール'
  3. Run pip install requests

Output

D:\test>pip install requests
ERROR: Exception:
Traceback (most recent call last):
  File "c:\program files\python3\lib\site-packages\pip\_internal\cli\base_command.py", line 228, in _main
    status = self.run(options, args)
  File "c:\program files\python3\lib\site-packages\pip\_internal\cli\req_command.py", line 182, in wrapper
    return func(self, options, args)
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 245, in run
    options.use_user_site = decide_user_install(
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 664, in decide_user_install
    if site_packages_writable(root=root_path, isolated=isolated_mode):
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 609, in site_packages_writable
    get_lib_location_guesses(root=root, isolated=isolated))
  File "c:\program files\python3\lib\site-packages\pip\_internal\commands\install.py", line 600, in get_lib_location_guesses
    scheme = distutils_scheme('', user=user, home=home, root=root,
  File "c:\program files\python3\lib\site-packages\pip\_internal\locations.py", line 109, in distutils_scheme
    d.parse_config_files()
  File "c:\program files\python3\lib\distutils\dist.py", line 406, in parse_config_files
    parser.read(filename)
  File "c:\program files\python3\lib\configparser.py", line 697, in read
    self._read(fp, filename)
  File "c:\program files\python3\lib\configparser.py", line 1017, in _read
    for lineno, line in enumerate(fp, start=1):
UnicodeDecodeError: 'gbk' codec can't decode byte 0xab in position 88: illegal multibyte sequence

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 23 (19 by maintainers)

Most upvoted comments

Man, they basically re-implemented the whole thing. I’m not sure pip should be responsible for maintaining this implementation.

Maybe we should just catch the parsing error and carry on pretending the file does not exist instead. pip does not actually need any of the data you mentioned above; it just reads distutils configuration. The usage is honestly quite niche, and I doubt many would even notice the difference.
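
A rough sketch of the "pretend the file does not exist" idea, written against plain configparser rather than distutils. This is not pip's actual change; read_configs_skipping_undecodable and demo are hypothetical names, and the file contents in demo are made up for the example.

```python
import configparser
import os
import tempfile


def read_configs_skipping_undecodable(filenames, encoding=None):
    """Parse each file; a missing or undecodable file is silently skipped."""
    parser = configparser.ConfigParser()
    loaded = []
    for name in filenames:
        try:
            # Decode the whole file first so a failure leaves the parser untouched.
            with open(name, encoding=encoding) as fh:
                text = fh.read()
        except (OSError, UnicodeDecodeError):
            continue
        parser.read_string(text, source=name)
        loaded.append(name)
    return parser, loaded


def demo():
    """One UTF-8 file that GBK cannot decode, one plain ASCII file."""
    with tempfile.TemporaryDirectory() as tmp:
        bad = os.path.join(tmp, "setup.cfg")
        good = os.path.join(tmp, "other.cfg")
        with open(bad, "w", encoding="utf-8") as fh:
            fh.write("[metadata]\ndescription = '計算ツール'\n")
        with open(good, "w", encoding="ascii") as fh:
            fh.write("[install]\nprefix = /tmp/example\n")
        # "gbk" stands in for the reporter's locale encoding.
        parser, loaded = read_configs_skipping_undecodable([bad, good], encoding="gbk")
        return parser, loaded, good
```

Decoding before parsing is a deliberate choice here: if the parser read the file directly, a decode error partway through could leave it holding a half-read section.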