lark: grammar files get opened an unnecessary number of times, causing an enormous loading time when creating a parser
it seems that, when using a FromPackageLoader object, a grammar file is opened and read each time another grammar uses a rule imported from it. this means the same file is opened over and over again, once for each occurrence of a rule contained in that file.
while this may not be noticeable for parsers that only use grammar files contained in the same directory (meaning no custom FromPackageLoader is necessary), it becomes highly problematic when using many FromPackageLoader objects, as the time required to construct a parser goes up by an absurd amount.
by placing a print(resource_name) in the get_data() function of the python lib pkgutil.py, i was able to count how many times each of my grammar files was loaded. for example, the common.lark grammar provided by lark gets opened 61 (!) times, one of my own grammars 25 times, another 16, etc.
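
the same count can be taken without editing pkgutil.py. the following is a minimal sketch, assuming the loader looks up get_data on the pkgutil module at call time (as the print experiment above suggests). count_get_data_calls is a hypothetical helper name, not part of lark or pkgutil, and the json package is only a stand-in for a real grammar package.

```python
import pkgutil
from collections import Counter
from contextlib import contextmanager
from unittest import mock


@contextmanager
def count_get_data_calls():
    """Temporarily wrap pkgutil.get_data and tally calls per (package, resource).

    Hypothetical helper for illustration; not part of lark or pkgutil.
    """
    calls = Counter()
    original = pkgutil.get_data

    def counting_get_data(package, resource):
        # Record the request, then delegate to the real implementation.
        calls[(package, resource)] += 1
        return original(package, resource)

    # mock.patch.object restores the original function on exit.
    with mock.patch.object(pkgutil, "get_data", counting_get_data):
        yield calls


if __name__ == "__main__":
    with count_get_data_calls() as calls:
        # Construct the parser here instead. As a stand-in, the same
        # stdlib package resource is read twice.
        pkgutil.get_data("json", "__init__.py")
        pkgutil.get_data("json", "__init__.py")

    # Print the most frequently loaded resources first.
    for (package, resource), n in calls.most_common():
        print(f"{n:3d}  {package}:{resource}")
```

building the parser inside the with block and printing the counter afterwards gives a per-file load count. code that imported get_data directly (from pkgutil import get_data) before the patch would not be counted.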
About this issue
- State: open
- Created 3 years ago
- Comments: 72 (72 by maintainers)
@ornariece I will create a PR, probably tomorrow. Now I gotta sleep. 😃