lark: Question (possibly bug?)
Hello Erez,
I’m having trouble coming up with the correct rule when parsing the sudo grammar. Here’s my grammar:
?sudo_item : (alias | user_spec)*
alias : "User_Alias" user_alias (":" user_alias)*
| "Runas_Alias" runas_alias (":" runas_alias)*
| "Host_Alias" host_alias (":" host_alias)*
| "Cmnd_Alias" cmnd_alias (":" cmnd_alias)*
user_alias : ALIAS_NAME "=" user_list
host_alias : ALIAS_NAME "=" host_list
runas_alias : ALIAS_NAME "=" runas_list
cmnd_alias : ALIAS_NAME "=" cmnd_list
user_spec : user_list host_list "=" cmnd_spec_list (":" host_list "=" cmnd_spec_list)*
cmnd_spec_list : cmnd_spec ("," cmnd_spec)*
cmnd_spec : runas_spec? tag_spec* command
runas_spec : "(" runas_list? (":" runas_list)? ")"
command : "!"* command_name | "!"* cmnd_alias
command_name : COMMAND
file_name : /[-_.a-z0-9A-Z\/]+/
host_list : host ("," host)*
user_list : user ("," user)*
runas_list : user ("," user)*
cmnd_list : command ("," command)*
host : HOST_NAME
user : "!"* USER_NAME
| "!"* "%" GROUP_NAME
| "!"* "#" UID
| "!"* "%#" GID
| "!"* "+" NETGROUP_NAME
| "!"* "%:" NONUNIX_GROUP_NAME
| "!"* "%:#" NONUNIX_GID
| "!"* user_alias
tag_spec : (tag_nopwd
| tag_pwd
| tag_noexec
| tag_exec
| tag_setenv
| tag_nosetenv
| tag_log_output
| tag_nolog_output) ":"
tag_pwd : "PASSWD"
tag_nopwd : "NOPASSWD"
tag_exec : "EXEC"
tag_noexec : "NOEXEC"
tag_setenv : "SETENV"
tag_nosetenv : "NOSETENV"
tag_log_output : "LOG_OUTPUT"
tag_nolog_output : "NOLOG_OUTPUT"
UID : /[0-9]+/
GID : /[0-9]+/
NONUNIX_GID : /[0-9]+/
USER_NAME : /[-_.a-z0-9A-Z]+/
GROUP_NAME : CNAME
NETGROUP_NAME : CNAME
NONUNIX_GROUP_NAME : CNAME
ALIAS_NAME : CNAME
HOST_NAME : /[-_.a-z0-9A-Z\[\]*]+/
COMMAND : /[^,:\n]+/
%import common.CNAME
%import common.WS
%ignore /[\\\\]/
%ignore WS
A sudo rule can look like this:
DBA ALL = (oracle) ALL, !SU
or like this:
DBA ALL = (oracle) ALL, !SU : ALL = (postgres) ALL, !SU
The grammar handles the first case fine but has trouble parsing the second variant. It boils down to the COMMAND token, which is whatever follows the runas part ((postgres), (oracle), etc.) and should match anything except the characters in [:,\n]. That regex doesn’t seem to work.
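A quick way to see what the COMMAND pattern is able to swallow is to run the same character class outside Lark. This is only a sketch using Python's `re` module; it does not reproduce Lark's lexer or parser, it just lists every maximal span the regex accepts in the second example line.

```python
import re

# Same character class as the COMMAND terminal in the grammar above.
COMMAND = re.compile(r"[^,:\n]+")

line = "DBA ALL = (oracle) ALL, !SU : ALL = (postgres) ALL, !SU"

# Every maximal run of characters that COMMAND can match in the line.
chunks = COMMAND.findall(line)
for chunk in chunks:
    print(repr(chunk))
```

The third span is ' ALL = (postgres) ALL', so the pattern on its own cannot tell a command apart from a whole "host_list = cmnd_spec_list" segment after the ":". It also accepts leading whitespace, "=", "!" and parentheses, which overlap with other terminals in the grammar.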
Also, sudo lines may end with a \ as a line continuation. Is %ignore /[\\\\]/ the correct way to handle that situation? It does seem to work.
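For comparison, here is a small sketch of what that pattern matches, again using plain `re` rather than Lark. It assumes the pattern reaches the regex engine as `[\\\\]`, which is a character class containing only the backslash. The second pattern, `\\\r?\n`, is a hypothetical alternative that matches a backslash only when it is immediately followed by a newline; it has not been tested as a Lark `%ignore` rule here.

```python
import re

# Assumption: the grammar's pattern is seen by the regex engine as [\\\\],
# i.e. a character class that matches a single backslash anywhere.
any_backslash = re.compile(r"[\\\\]")

# Hypothetical alternative: a backslash only when it ends the line.
continuation = re.compile(r"\\\r?\n")

text = "DBA ALL = (oracle) /bin/ls, \\\n    /bin/cat\n"

# Removing matches imitates what ignoring them would discard.
without_any = any_backslash.sub("", text)
without_cont = continuation.sub("", text)

# A backslash in the middle of a line is dropped by the first pattern only.
mid_any = any_backslash.sub("", "a\\b")
mid_cont = continuation.sub("", "a\\b")

print(repr(without_any))
print(repr(without_cont))
print(repr(mid_any), repr(mid_cont))
```

The first pattern discards every backslash, including ones that are not at the end of a line, and relies on the separate WS rule to skip the newline that follows. The second treats the backslash and the newline as one unit and leaves other backslashes alone.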
Last question: when I specify parser='lalr', I get this:
MY INPUT: DBA ALL = (oracle) ALL
ERROR:
lark.common.UnexpectedToken: Unexpected token Token(COMMAND, ' ALL = (oracle) ALL') at line 1, column 3.
Does that mean my grammar is not LALR compatible?
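The token in that error can be reproduced with the COMMAND regex alone. The sketch below assumes the reported position corresponds to 0-based offset 3, the space right after "DBA", and uses only Python's `re` module rather than Lark itself.

```python
import re

# Same character class as the COMMAND terminal in the grammar above.
COMMAND = re.compile(r"[^,:\n]+")

text = "DBA ALL = (oracle) ALL"

# Offset 3 is the space right after "DBA" (0-based); this is assumed to be
# the position the error message refers to.
match = COMMAND.match(text, 3)
token = match.group()
print(repr(token))
```

The match is the same string as in the UnexpectedToken message. That suggests the lexer picked COMMAND as the longest match before the parser ever saw the "=", so the error may come from overlapping terminals rather than from the rules being impossible to express in LALR(1).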
About this issue
- State: closed
- Created 7 years ago
- Comments: 20 (9 by maintainers)
Regarding line and column for the default parser, I haven’t implemented it yet, but I know it’s a crucial feature. Feel free to open a new issue for it if you like, and I’ll try to get to it soon.
You should use the latest master branch for now, instead of the latest PyPI release. I plan to make a release soon, once I see that all the recent breaking changes are stable.