scipy: SyntaxError: Non-ASCII character '\xe2' in file scipy/stats/_continuous_distns.py on line 3346, but no encoding declared

I get syntax errors due to source code encodings when importing scipy 1.2.0. The error only seems to occur if pyc files are missing.

The issue originates from https://github.com/scipy/scipy/blob/v1.2.0/scipy/stats/_continuous_distns.py#L3345. The dash in the page range there isn't an ASCII hyphen but a UTF-8 encoded en dash (U+2013); the raw line reads ' pp. 1\xe2\x80\x9313, 1997.\n'
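
For anyone who wants to confirm which character those three bytes encode, here is a minimal sketch for Python 3. The helper name describe_utf8 is made up for illustration and is not part of scipy.

```python
import unicodedata


def describe_utf8(raw):
    """Decode a UTF-8 byte string and name each resulting character.

    Hypothetical helper for illustration only.
    """
    return [(hex(ord(ch)), unicodedata.name(ch)) for ch in raw.decode("utf-8")]


# The three bytes reported in the traceback decode to a single code point.
print(describe_utf8(b"\xe2\x80\x93"))  # [('0x2013', 'EN DASH')]
```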

Reproducing code example:

FROM python:2.7

RUN pip install --no-cache-dir scipy

# Without the following line, I can't reproduce the problem.
RUN find /usr/local/lib/python2.7/site-packages/scipy/ -name '*.pyc' -delete

RUN python -c "import scipy.stats"

Error message:

Step 1/4 : FROM python:2.7
 ---> f67e752245d6
Step 2/4 : RUN pip install --no-cache-dir scipy
 ---> Using cache
 ---> e8adb88b0f15
Step 3/4 : RUN find /usr/local/lib/python2.7/site-packages/scipy/ -name '*.pyc' -delete
 ---> Running in 6e46e320f24d
Removing intermediate container 6e46e320f24d
 ---> ed5be7afd25a
Step 4/4 : RUN python -c "import scipy.stats"
 ---> Running in 8c8ea36e3284
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/scipy/stats/__init__.py", line 367, in <module>
    from .stats import *
  File "/usr/local/lib/python2.7/site-packages/scipy/stats/stats.py", line 173, in <module>
    from . import distributions
  File "/usr/local/lib/python2.7/site-packages/scipy/stats/distributions.py", line 13, in <module>
    from . import _continuous_distns
  File "/usr/local/lib/python2.7/site-packages/scipy/stats/_continuous_distns.py", line 3345
SyntaxError: Non-ASCII character '\xe2' in file /usr/local/lib/python2.7/site-packages/scipy/stats/_continuous_distns.py on line 3346, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Scipy/Numpy/Python version information:

import sys, scipy, numpy; print(scipy.__version__, numpy.__version__, sys.version_info)
('1.2.0', '1.15.4', sys.version_info(major=2, minor=7, micro=15, releaselevel='final', serial=0))

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 8
  • Comments: 17 (8 by maintainers)

Most upvoted comments

Use grep or some other bash util to get a list of py files with non-ascii chars; get a list of files with # -*- coding: utf-8 -*- at the top; take the first list minus the second and you have the problematic ones? This or something similar should be bash-able on one of the Travis builds. I can take a stab at it if nobody sees a cleaner way.
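
The same "non-ASCII files minus files with a coding line" idea can also be sketched in Python instead of bash. This is only a rough sketch, not the script that went into any PR: the function names are made up, the regex is the one given in PEP 263, and it assumes the declaration only counts on the first two lines. It does not treat a UTF-8 BOM as a declaration, so files starting with a BOM would be flagged too.

```python
import os
import re

# Regex from PEP 263; a declaration is only honoured on line 1 or 2.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def has_coding_declaration(path):
    """Return True if one of the first two lines declares an encoding."""
    with open(path, "rb") as fp:
        for _ in range(2):
            if CODING_RE.match(fp.readline()):
                return True
    return False


def first_non_ascii_line(path):
    """Return the 1-based number of the first line holding a non-ASCII byte, or None."""
    with open(path, "rb") as fp:
        for lineno, line in enumerate(fp, start=1):
            if any(byte > 127 for byte in line):
                return lineno
    return None


def find_problem_files(root):
    """Yield (path, lineno) for .py files with non-ASCII bytes but no coding line."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            lineno = first_non_ascii_line(path)
            if lineno is not None and not has_coding_declaration(path):
                yield path, lineno
```

On CI, something like list(find_problem_files("scipy")) could be checked for emptiness, with each (path, lineno) pair printed when it is not.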

If the error shows up at import time for users, can we trigger it on CI?

I suspect it will only happen on 2.7, which we don’t test now. (We should probably still fix it, though.) I wrote a little BASH solution, will open a PR to see the output.
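
To illustrate why this would be limited to 2.7: Python 3 assumes UTF-8 for source without an encoding declaration (PEP 3120), while Python 2 assumes ASCII. A small sketch, meant to be run under Python 3; the byte string below just mimics the offending line and the helper name is made up:

```python
# Source bytes containing a UTF-8 en dash and no coding declaration,
# mimicking the line from _continuous_distns.py.
SOURCE = b"ref = 'pp. 1\xe2\x80\x9313, 1997.'\n"


def compile_and_run(source_bytes):
    """Compile raw source bytes and return the resulting namespace."""
    namespace = {}
    # Python 3 decodes undeclared source as UTF-8, so this compiles cleanly.
    # Python 2 rejects the same bytes in a .py file with the SyntaxError
    # quoted in this issue.
    exec(compile(source_bytes, "<example>", "exec"), namespace)
    return namespace


ns = compile_and_run(SOURCE)
print(ascii(ns["ref"]))
```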

SyntaxError: Non-ASCII character '\xc3' in file /usr/local/lib/python2.7/dist-packages/scipy/stats/_continuous_distns.py on line 3383, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

python -c "
with open('/usr/local/lib/python2.7/dist-packages/scipy/stats/_continuous_distns.py') as fp:
    for i, line in enumerate(fp):
        if '\xc3' in line:
            print i, repr(line)
"
3381 ' This distribution is also known as the Fr\xc3\xa9chet distribution or the\n'

This one appeared after I fixed the previous errors, so the line number might be slightly off (by one or so).

The same error also shows up in this file: SyntaxError: Non-ASCII character '\xe2' in file /usr/local/lib/python2.7/dist-packages/scipy/stats/_stats_mstats_common.py on line 331, but no encoding…