coveragepy: `get_python_source` gets called on shared object file and causes SyntaxError
Describe the bug
While testing with pytest-cov we started noticing some failures due to an “internal” error. Digging deeper, it looks like a shared object (.so) file has somehow made its way to `get_python_source`, which then fails with
INTERNALERROR> SyntaxError: invalid or missing encoding declaration
with the actual issue being
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/Users/gabriele.tornetta/.pyenv/versions/3.5.10/lib/python3.5/tokenize.py", line 392, in find_cookie
INTERNALERROR>     line_string = line.decode('utf-8')
INTERNALERROR> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xcf in position 0: invalid continuation byte
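For reference, the same pair of errors can be triggered with the standard library alone, without coverage.py. The snippet below is a minimal sketch, not taken from the failing project: `FAKE_SHARED_OBJECT` and `detect_encoding_error` are made-up names, and the byte string is an assumption (it starts with `cf fa ed fe`, the 64-bit Mach-O magic number as stored on disk, which would be consistent with byte 0xcf at position 0 on macOS).

```python
import io
import tokenize

# Hypothetical stand-in for the first bytes of a compiled extension module.
# The real file from the report is not available, so this content is assumed.
FAKE_SHARED_OBJECT = b"\xcf\xfa\xed\xfe\x07\x00\x00\x01\n"


def detect_encoding_error(data):
    """Return the SyntaxError message raised by tokenize, or None if it succeeds."""
    try:
        # detect_encoding tries to decode the first line as UTF-8 while
        # looking for a coding cookie; binary data makes that decode fail.
        tokenize.detect_encoding(io.BytesIO(data).readline)
    except SyntaxError as exc:
        return str(exc)
    return None


print(detect_encoding_error(FAKE_SHARED_OBJECT))  # binary data: an error message
print(detect_encoding_error(b"x = 1\n"))          # real source: None
```

The `UnicodeDecodeError` is raised inside `tokenize` and converted to the `SyntaxError` that surfaces in the pytest output, which is why the report shows both.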
The root cause may well lie further upstream (perhaps .so files should never get this far), but `get_python_source` could still be made more robust against such cases. For instance, could the extension check at https://github.com/nedbat/coveragepy/blob/c776c901aa9b3214be815ceee9f179f47bcd86d8/coverage/python.py#L42-L45 be extended?
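To make the suggestion concrete, here is a rough sketch of what such hardening could look like. It is not coverage.py's code and not the fix that was eventually committed: `BINARY_SUFFIXES` and `read_python_source_or_none` are hypothetical names, and the suffix list is an assumption.

```python
import tokenize

# Assumed list of suffixes that indicate compiled binaries, not Python source.
BINARY_SUFFIXES = (".so", ".pyd", ".dll", ".dylib")


def read_python_source_or_none(path):
    """Return the decoded source text of `path`, or None if it is not readable as Python source."""
    # Cheap check first: skip files that are binaries by name.
    if path.lower().endswith(BINARY_SUFFIXES):
        return None
    try:
        # tokenize.open honours PEP 263 coding cookies and BOMs.
        with tokenize.open(path) as source_file:
            return source_file.read()
    except (SyntaxError, UnicodeDecodeError):
        # Raised when the content cannot be decoded as source text.
        return None
```

The suffix check alone would miss binaries with unusual names, so the sketch also catches the decoding errors; a caller would then have to decide how to report a `None` result.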
To Reproduce
Unfortunately, it’s not clear to me how to reproduce this problem, as it has appeared all of a sudden, and I don’t see any changes in the dependency versions that are currently being pulled in the CI jobs. The project that is being tested contains some Cythonized code, and this is where the shared object is coming from.
- What version of Python are you using? 3.5, 3.6, 3.8
- What version of coverage.py are you using? The output of `coverage debug sys` is helpful. 5.5
- What versions of what packages do you have installed? The output of `pip freeze` is helpful.
- What code are you running? Give us a specific commit of a specific repo that we can check out.
- What commands did you run?
Expected behavior
No internal errors 🙂
About this issue
- State: closed
- Created 3 years ago
- Comments: 21 (10 by maintainers)
Commits related to this issue
- fix(python): handle source decoding errors As described in #1160, occasionally some shared object paths make their way into the coverage report. When they get to `get_python_source` the call to `sour... — committed to P403n1x87/coveragepy by P403n1x87 3 years ago
- fix: avoid measuring generated code. #1160 — committed to nedbat/coveragepy by nedbat 3 years ago
- py-coverage: update to 6.4.4. Version 6.4.4 — 2022-08-16 -------------------------- - Wheels are now provided for Python 3.11. .. _changes_6-4-3: Version 6.4.3 — 2022-08-06 ----------------------... — committed to NetBSD/pkgsrc by deleted user 2 years ago
I have a fix for this in commit afe6cf34.
The problem happened with Cython files containing other generated code (@P403n1x87 had attrs, @bdice had doctests).
Thanks for the bug! 😃
@nedbat Amazing! I really appreciate your insight and support on this, I wouldn’t have thought of that root cause. 😊