yapf: A crash in Python 3.5: lib2to3.pgen2.parse.ParseError: bad input: type=16, value='*'

This crash will only happen in Python 3.5.

Test file

def func(iterable, *args, **kwargs):
    other(*iterable, *args, **kwargs)


def other(*args, **kwargs):
    print(args)
    print(kwargs)


func([1, 2], 'arg0', 'arg1', arg2=2, arg3=3)

At runtime it behaves as expected:

$ python3.5 test.py
(1, 2, 'arg0', 'arg1')
{'arg3': 3, 'arg2': 2}

Crash

$ yapf test.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/pytree_utils.py", line 115, in ParseCodeToTree
    tree = parser_driver.parse_string(code, debug=False)
  File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 106, in parse_string
    return self.parse_tokens(tokens, debug)
  File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 71, in parse_tokens
    if p.addtoken(type, value, (prefix, start)):
  File "/usr/lib/python3.5/lib2to3/pgen2/parse.py", line 159, in addtoken
    raise ParseError("bad input", type, value, context)
lib2to3.pgen2.parse.ParseError: bad input: type=16, value='*', context=(' ', (2, 21))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/yapf", line 8, in <module>
    sys.exit(run_main())
  File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 344, in run_main
    sys.exit(main(sys.argv))
  File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 226, in main
    verbose=args.verbose)
  File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 278, in FormatFiles
    in_place, print_diff, verify, quiet, verbose)
  File "/usr/local/lib/python3.5/dist-packages/yapf/__init__.py", line 305, in _FormatFile
    logger=logging.warning)
  File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/yapf_api.py", line 91, in FormatFile
    verify=verify)
  File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/yapf_api.py", line 129, in FormatCode
    tree = pytree_utils.ParseCodeToTree(unformatted_source)
  File "/usr/local/lib/python3.5/dist-packages/yapf/yapflib/pytree_utils.py", line 121, in ParseCodeToTree
    tree = parser_driver.parse_string(code, debug=False)
  File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 106, in parse_string
    return self.parse_tokens(tokens, debug)
  File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 71, in parse_tokens
    if p.addtoken(type, value, (prefix, start)):
  File "/usr/lib/python3.5/lib2to3/pgen2/parse.py", line 159, in addtoken
    raise ParseError("bad input", type, value, context)
lib2to3.pgen2.parse.ParseError: bad input: type=16, value='*', context=(' ', (2, 21))

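The parser fails at other(*iterable, *args, **kwargs). The reported context (2, 21) is line 2, column 21 of the test file, which is the second * (the one in front of args); type=16 is the lib2to3 token number for *.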
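A possible workaround, offered only as a sketch and not verified against yapf 0.29.0: since the parser objects to the second starred argument, the call can be rewritten to use a single one by chaining the positional sources together. Here other returns its arguments instead of printing them so the result can be checked; that change is made only for this illustration.

```python
import itertools


def other(*args, **kwargs):
    # Return instead of print so the result is easy to inspect.
    return args, kwargs


def func(iterable, *args, **kwargs):
    # Only one starred argument in the call: itertools.chain joins the
    # two positional sources before they are unpacked.
    return other(*itertools.chain(iterable, args), **kwargs)


result = func([1, 2], 'arg0', 'arg1', arg2=2, arg3=3)
print(result)
```

The behaviour is the same as the original call; only the spelling of the unpacking changes.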

Environment

$ python3.5 --version
Python 3.5.2
$ yapf --version
yapf 0.29.0

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 22 (12 by maintainers)

Most upvoted comments

It appears that asttokens could be used with the new PEG parser. I’ll try converting some of my code to use asttokens and see how it goes. Don’t expect a quick response … I’m going to be out of town for a while.

AFAICT black’s blib2to3 uses the same compiler technology that lib2to3 uses, and therefore won’t work in the future – the PEG parsers can handle things that the lib2to3 parsers can’t.

I’m going to contact one of the black developers about … hopefully, I’ll report back soon. (I need to read up on a few things first …)

It appears that the “black” formatter uses a slightly modified lib2to3, so it would likely have a similar problem. https://github.com/psf/black/tree/main/src/blib2to3

But it might be worthwhile tracking black’s version, as it could handle some things that have been reported in this thread.

Does parso provide access to the whitespace and comments in the source, and does it map the tree back to the source? I didn’t see anything about that in the API, but I might have missed it.

(I don’t think that libCST has this either)

In principle, it should be straightforward to wrap the existing Python parser (the ast library) to do what lib2to3 does for yapf: ast maps from the parse tree to the source, and dealing with the whitespace should be straightforward. But it’s a moderate amount of work …
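A rough sketch of that idea, using only the standard library (ast.get_source_segment needs Python 3.8 or later). SOURCE and the variable names are made up for illustration, and this is not how yapf works today: ast parses the call that lib2to3 rejects and records source positions, while tokenize recovers the comments that ast throws away.

```python
import ast
import io
import tokenize

# Hypothetical input: the failing call from this issue, plus a comment.
SOURCE = (
    "def func(iterable, *args, **kwargs):\n"
    "    # forward everything\n"
    "    other(*iterable, *args, **kwargs)\n"
)

# The built-in parser accepts the double unpacking.
tree = ast.parse(SOURCE)

# Each Call node carries line/column offsets back into the source text.
calls = [
    (node.lineno, node.col_offset, ast.get_source_segment(SOURCE, node))
    for node in ast.walk(tree)
    if isinstance(node, ast.Call)
]

# ast discards comments; tokenize still reports them with positions.
comments = [
    (tok.start, tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(SOURCE).readline)
    if tok.type == tokenize.COMMENT
]

print(calls)
print(comments)
```

Merging the two streams by position into something a formatter can consume is the "moderate amount of work" part; this only shows that the raw information is available.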

(There’s also leoAst.py, but it seems to be a rather complicated way of doing what should be a fairly simple thing)

On Thu, 1 Jul 2021 at 23:04, Neil Girdhar @.***> wrote:

@kamahen (https://github.com/kamahen) You are right though that if an alternate parser isn’t eventually used, yapf will unfortunately end up broken. Have you looked at parso (https://parso.readthedocs.io/en/latest/)? That’s what the Python docs recommend as a replacement.


As I said a few days ago, it’s a significant amount of work to switch from the lib2to3 parser to the new parser. I might work on it one of these days, but I don’t have an immediate need and there are more interesting things that I’d rather do first.

lib2to3 is on its way to deprecation because future Python versions will use a different parsing technology.

I proposed doing some work to make a lib2to3-like interface to the new Python parser, but nobody seemed interested, and it’s a fair bit of work.

There are some alternative parsers, but they’d require some work to integrate into yapf, and it’s not clear how well they’ll work in the future either. I’ve looked at another parser for “leo-editor” but it seems overly complicated for what it does … I might be persuaded to make a simpler version of that if enough people are interested.