pip: pip install: UnicodeDecodeError on Windows

pip install <package> fails on Windows if the project's description (e.g., its long description) contains UTF-8-encoded non-ASCII characters.

(simpy) C:\Users\sscherfke\Code\simpy>pip install .
Unpacking c:\users\sscherfke\code\simpy
  Running setup.py egg_info for package from file:///c%7C%5Cusers%5Csscherfke%5Ccode%5Csimpy

Cleaning up...
Exception:
Traceback (most recent call last):
  File "C:\Users\sscherfke\Envs\simpy\lib\site-packages\pip\basecommand.py", line 134, in main
    status = self.run(options, args)
  File "C:\Users\sscherfke\Envs\simpy\lib\site-packages\pip\commands\install.py", line 236, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "C:\Users\sscherfke\Envs\simpy\lib\site-packages\pip\req.py", line 1134, in prepare_files
    req_to_install.run_egg_info()
  File "C:\Users\sscherfke\Envs\simpy\lib\site-packages\pip\req.py", line 264, in run_egg_info
    "%(Name)s==%(Version)s" % self.pkg_info())
  File "C:\Users\sscherfke\Envs\simpy\lib\site-packages\pip\req.py", line 357, in pkg_info
    data = self.egg_info_data('PKG-INFO')
  File "C:\Users\sscherfke\Envs\simpy\lib\site-packages\pip\req.py", line 297, in egg_info_data
    data = fp.read()
  File "C:\Users\sscherfke\Envs\simpy\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1235: character maps to <undefined>

Storing complete log in C:\Users\sscherfke\pip\pip.log

The problem seems to be that req.egg_info_data() (currently line 317) reads the egg-info created by python setup.py egg_info with the system’s default encoding, which is not UTF-8 on Windows (though it is on most *nix systems).
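The failure can be reproduced without pip. The sketch below is only an illustration: U+201D (right double quotation mark) is an assumed example character, chosen because its UTF-8 encoding ends in the byte 0x9d named in the traceback, and try_decode is a hypothetical helper, not part of pip.

```python
# Assumption: U+201D is used purely as an example; the actual character in
# the project's description is not known from the report.
utf8_bytes = "\u201d".encode("utf-8")  # the three bytes e2 80 9d


def try_decode(raw, encoding):
    """Return (True, text) on success or (False, error message) on failure."""
    try:
        return True, raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return False, str(exc)


# cp1252 is the codec shown in the traceback; it has no mapping for 0x9d.
ok_cp1252, result_cp1252 = try_decode(utf8_bytes, "cp1252")

# Decoding the same bytes as UTF-8 succeeds.
ok_utf8, result_utf8 = try_decode(utf8_bytes, "utf-8")

print(ok_cp1252, result_cp1252)
print(ok_utf8, repr(result_utf8))
```

Decoding with cp1252 fails on the third byte, while decoding with utf-8 returns the original character.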

With Python 3, using UTF-8 in your README/CHANGES/AUTHORS.txt (or whatever) should be no problem, so pip should decode these files as UTF-8 by default.

Changing lines 296 and 297 (in pip 1.4.1; 316 and 317 in the repo) to

fp = open(filename, 'rb')
data = fp.read().decode('utf-8')

fixes the problem for me.
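As a self-contained sketch of the same idea (not pip's actual code), the helper below reads a file in binary mode and decodes it explicitly as UTF-8, so the result does not depend on the system's default encoding. The name read_utf8 and the sample PKG-INFO-like content are assumptions made for illustration.

```python
import os
import tempfile


def read_utf8(filename):
    """Read a file as bytes and decode it as UTF-8, ignoring the locale default."""
    with open(filename, "rb") as fp:
        return fp.read().decode("utf-8")


# Hypothetical sample content containing non-ASCII quotation marks.
sample = "Name: example\nSummary: \u201cquoted\u201d\n"

fd, path = tempfile.mkstemp()
try:
    # Write the sample as UTF-8 bytes, as setup.py egg_info might.
    with os.fdopen(fd, "wb") as fp:
        fp.write(sample.encode("utf-8"))
    roundtrip = read_utf8(path)
finally:
    os.remove(path)

print(roundtrip == sample)
```

In Python 3, open(filename, encoding='utf-8') in text mode is another way to avoid the platform default; the binary read above mirrors the two-line change proposed in this report.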

The test setup was:

  • Windows 7 64-bit
  • Python 3.3.1
  • pip 1.4.1
  • setuptools 0.9.8

About this issue

  • State: closed
  • Created 11 years ago
  • Comments: 26 (9 by maintainers)

Most upvoted comments

Not for me… Which library do I have to upgrade in order to get rid of this problem?