yapf: A crash in Python 3.5: lib2to3.pgen2.parse.ParseError: bad input: type=16, value='*'
This crash will only happen in Python 3.5.
The Test file
def func(iterable, *args, **kwargs):
    other(*iterable, *args, **kwargs)

def other(*args, **kwargs):
    print(args)
    print(kwargs)

func([1, 2], 'arg0', 'arg1', arg2=2, arg3=3)
Its runtime result is just as expected.
$ python3.5 test.py
(1, 2, 'arg0', 'arg1')
{'arg3': 3, 'arg2': 2}
Crash
$ yapf test.py
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/pytree_utils.py", line 115, in ParseCodeToTree
tree = parser_driver.parse_string(code, debug=False)
File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 106, in parse_string
return self.parse_tokens(tokens, debug)
File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 71, in parse_tokens
if p.addtoken(type, value, (prefix, start)):
File "/usr/lib/python3.5/lib2to3/pgen2/parse.py", line 159, in addtoken
raise ParseError("bad input", type, value, context)
lib2to3.pgen2.parse.ParseError: bad input: type=16, value='*', context=(' ', (2, 21))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/yapf", line 8, in <module>
sys.exit(run_main())
File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 344, in run_main
sys.exit(main(sys.argv))
File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 226, in main
verbose=args.verbose)
File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 278, in FormatFiles
in_place, print_diff, verify, quiet, verbose)
File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 305, in _FormatFile
logger=logging.warning)
File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/yapf_api.py", line 91, in FormatFile
verify=verify)
File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/yapf_api.py", line 129, in FormatCode
tree = pytree_utils.ParseCodeToTree(unformatted_source)
File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/pytree_utils.py", line 121, in ParseCodeToTree
tree = parser_driver.parse_string(code, debug=False)
File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 106, in parse_string
return self.parse_tokens(tokens, debug)
File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 71, in parse_tokens
if p.addtoken(type, value, (prefix, start)):
File "/usr/lib/python3.5/lib2to3/pgen2/parse.py", line 159, in addtoken
raise ParseError("bad input", type, value, context)
lib2to3.pgen2.parse.ParseError: bad input: type=16, value='*', context=(' ', (2, 21))
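The parser fails at the second * in other(*iterable, *args, **kwargs), which is line 2, column 21 of the test file (the context reported in the ParseError). Multiple * unpackings in a single call are valid syntax as of Python 3.5 (PEP 448), but the lib2to3 grammar used here rejects them.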
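To confirm that the failing call is valid syntax for the interpreter itself, the standard-library ast module can be asked to parse the same source. This is only a minimal sketch and not part of yapf; the names SOURCE, call, and star_names are made up for illustration, and it assumes it is run on Python 3.5 or later.

```python
import ast

# The first two lines of the test file from the report.
SOURCE = '''\
def func(iterable, *args, **kwargs):
    other(*iterable, *args, **kwargs)
'''

tree = ast.parse(SOURCE)

# tree.body[0] is the "def func" statement; its first body statement is
# the expression statement wrapping the other(...) call.
call = tree.body[0].body[0].value

# Positional unpackings show up as ast.Starred nodes in call.args.
starred = [arg for arg in call.args if isinstance(arg, ast.Starred)]
star_names = [arg.value.id for arg in starred]
print(star_names)

# Location of the second unpacking, as (line, column).
second_star_pos = (starred[1].lineno, starred[1].col_offset)
print(second_star_pos)

# A ** unpacking is stored as a keyword whose arg is None.
double_star = [kw.value.id for kw in call.keywords if kw.arg is None]
print(double_star)
```

If the built-in parser accepts the file while yapf raises ParseError, the problem lies in the lib2to3 grammar rather than in the test file.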
Environment
$ python3.5 --version
Python 3.5.2
$ yapf --version
yapf 0.29.0
About this issue
- State: closed
- Created 4 years ago
- Comments: 22 (12 by maintainers)
It appears that asttokens could be used with the new PEG parser. I’ll try converting some of my code to use asttokens and see how it goes. Don’t expect a quick response … I’m going to be out of town for a while.
AFAICT black’s blib2to3 uses the same compiler technology that lib2to3 uses, and therefore won’t work in the future: the PEG parsers can handle things that the lib2to3 parsers can’t.
I’m going to contact one of the black developers about … hopefully, I’ll report back soon. (I need to read up on a few things first …)
It appears that the “black” formatter uses a slightly modified lib2to3, so it would likely have a similar problem. https://github.com/psf/black/tree/main/src/blib2to3
But it might be worthwhile tracking black’s version, as it could handle some things that have been reported in this thread.
Does parso provide access to the whitespace and comments in the source, and does it map the tree back to the source? I didn’t see anything about that in the API, but I might have missed it.
(I don’t think that libCST has this either)
In principle, it should be straightforward to wrap the existing Python parser (the ast library) to do what lib2to3 does for yapf (ast maps from the parse tree to the source, and dealing with the whitespace should be straightforward). But it’s a moderate amount of work …
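As a rough illustration of that idea (not a proposal for how yapf should do it), the sketch below pairs ast with the standard tokenize module: ast gives each node its position in the source, and tokenize recovers the comments that ast discards. The helper names node_spans and comments are hypothetical, and ast.get_source_segment assumes Python 3.8 or later.

```python
import ast
import io
import tokenize

SOURCE = '''\
x = compute(1,   2)  # trailing comment
# standalone comment
y = x
'''

def node_spans(source):
    """Return (node type name, exact source text) for every positioned node."""
    spans = []
    for node in ast.walk(ast.parse(source)):
        # Returns None for nodes without location info (e.g. Module, Load).
        segment = ast.get_source_segment(source, node)
        if segment is not None:
            spans.append((type(node).__name__, segment))
    return spans

def comments(source):
    """Return ((line, column), text) for every comment token."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [(tok.start, tok.string)
            for tok in tokens
            if tok.type == tokenize.COMMENT]

for kind, text in node_spans(SOURCE):
    print(kind, repr(text))
for pos, text in comments(SOURCE):
    print(pos, text)
```

The node spans keep the original spacing inside each node (the three spaces in the call survive), and the comment positions can be matched against node positions by line and column. A formatter would still need to attach each comment to a node and track the whitespace between nodes, which is where the moderate amount of work comes in.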
(There’s also leoAst.py, but it seems to be a rather complicated way of doing what should be a fairly simple thing)
On Thu, 1 Jul 2021 at 23:04, Neil Girdhar @.***> wrote:
As I said a few days ago, it’s a significant amount of work to switch from lib2to3 parser to the new parser. I might work on it, one of these days, but don’t have an immediate need and there are more interesting things that I’d rather do first.
Lib2to3 is on its way to deprecation because future Pythons will use a different parsing technology.
I proposed doing some work to make a lib2to3-like interface to the new Python parser, but nobody seemed interested, and it’s a fair bit of work.
There are some alternative parsers, but they’d require some work to integrate into yapf, and it’s not clear how well they’ll work in future either. I’ve looked at another parser for “leo-editor” but it seems overly complicated for what it does … I might be persuaded to make a simpler version of that if enough people are interested.