antlr4: Compiling separate lexer and parser in subdirectory fails

If one defines a separate parser and lexer as follows:

Lexer.g4:

lexer grammar Lexer;

tokens {INDENT, DEDENT}

INT     : [0-9]+ ;

Parser.g4:

parser grammar Parser;

options { tokenVocab=Lexer; }

main    
        : INT* EOF
        ;

With the following directory layout:

-project
    -src
        -Lexer.g4
        -Parser.g4
    -build
    -antlr-4.2.2-complete.jar

Compiling from the project directory with the following command: java -jar antlr-4.2.2-complete.jar -o build src/*.g4

Fails with: error(3): cannot find tokens file 'build/Lexer.tokens'

Lexer.tokens is generated correctly, but it is written to build/src/Lexer.tokens (as expected), not to build/Lexer.tokens where the tool looks for it.

This works fine if the grammar files are in the current directory when compiling them (i.e. one runs java -jar ../antlr-4.2.2-complete.jar -o ../build *.g4), or if an extra -lib build/src/ option is used.

It seems this should not be required: antlr should know where to find the .tokens file it generates.
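
The mismatch can be modeled in a few lines. This is only a sketch of the behavior as described in this report, not ANTLR's actual code; the function names `tokens_written_to` and `tokens_looked_up_at` are made up for illustration.

```python
from pathlib import PurePosixPath

def tokens_written_to(out_dir, grammar_path):
    # Assumption from the report: the .tokens file lands in the -o directory
    # joined with the grammar's relative directory.
    g = PurePosixPath(grammar_path)
    return PurePosixPath(out_dir) / g.parent / (g.stem + ".tokens")

def tokens_looked_up_at(out_dir, vocab, lib_dir=None):
    # Assumption from the report: the lookup uses the -lib directory if one
    # is given, otherwise the -o directory itself.
    return PurePosixPath(lib_dir or out_dir) / (vocab + ".tokens")

written = tokens_written_to("build", "src/Lexer.g4")
looked_up = tokens_looked_up_at("build", "Lexer")
print(written)    # build/src/Lexer.tokens
print(looked_up)  # build/Lexer.tokens

# The two workarounds mentioned above make both paths agree:
same_dir = tokens_written_to("../build", "Lexer.g4") == tokens_looked_up_at("../build", "Lexer")
with_lib = written == tokens_looked_up_at("build", "Lexer", lib_dir="build/src/")
print(same_dir, with_lib)  # True True
```

The point of the sketch is that the write path depends on the grammar's relative directory while the lookup path does not, which is why the two only agree when that relative directory is empty or when -lib is supplied by hand.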

UPDATE:

Here are some additional details about the proposed feature:

For the following, assume this directory layout:

-project
    -src
        -foo
            -lexer.g4
            -metarParser.g4
            -tafParser.g4
        -bar
            -notamParser.g4
    -build

With the following dependency between the grammars:

lexer.g4:
metarParser.g4: lexer.g4
notamParser.g4: lexer.g4
tafParser.g4: notamParser.g4
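
The cases below list the grammars in an order where each .tokens file exists before it is needed. As a rough illustration (how ANTLR itself orders grammars is not shown in this report), such an order can be derived from the dependency list above with a topological sort; the `deps` dictionary simply restates that list.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each grammar maps to the grammars whose .tokens file it needs.
deps = {
    "lexer.g4": [],
    "metarParser.g4": ["lexer.g4"],
    "notamParser.g4": ["lexer.g4"],
    "tafParser.g4": ["notamParser.g4"],
}

# static_order() yields every grammar after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Any order produced this way puts lexer.g4 first and tafParser.g4 after notamParser.g4; the relative position of independent grammars is not fixed.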

In all the following cases it is assumed that all of the grammar files are compiled in a single invocation of antlr, and that we want the generated files to be in the build directory (optionally in subfolders).

1st case: All grammars in working directory.

  • We wish to compile metarParser.g4

  • Our working directory is project/src/foo/

  • Run command: antlr *.g4 -o ../../build/

    Observed behavior:

    • Antlr compiles lexer.g4
    • lexer.g4 has a relative path of ., so the generated tokens file is in build/
    • Antlr adds the ../../build/ directory to its “lib” search path
    • Antlr compiles metarParser.g4
    • Antlr searches the “lib” directories for the required lexer.tokens
    • Antlr correctly finds the lexer.tokens file in ../../build/

    Desired behavior:

    • Same.

2nd case: All grammars are in a given subdirectory.

  • We wish to compile metarParser.g4

  • Our working directory is project/

  • Run command: antlr src/foo/*.g4 -o build/

    Observed behavior:

    • Antlr compiles lexer.g4
    • lexer.g4 has a relative path of src/foo/, so the generated tokens file is in build/src/foo/
    • Antlr adds the build/ directory to its “lib” search path
    • Antlr compiles metarParser.g4
    • Antlr searches the “lib” directories for the required lexer.tokens
    • Antlr fails to find lexer.tokens because build/src/foo/ is not on its search path

    Desired behavior:

    • Antlr compiles lexer.g4
    • lexer.g4 has a relative path of src/foo/, so the generated tokens file is in build/src/foo/
    • Antlr adds the build/src/foo/ directory to its “lib” search path
    • Antlr compiles metarParser.g4
    • Antlr searches the “lib” directories for the required lexer.tokens
    • Antlr correctly finds the lexer.tokens file in build/src/foo/

3rd case: All grammars are in (potentially) different subdirectories.

  • We wish to compile tafParser.g4

  • Our working directory is project/

  • Run command: antlr src/foo/*.g4 src/bar/notamParser.g4 -o build/

    Observed behavior:

    • Antlr compiles lexer.g4
    • lexer.g4 has a relative path of src/foo/, so the generated tokens file is in build/src/foo/
    • Antlr adds the build/ directory to its “lib” search path
    • Antlr compiles notamParser.g4
    • Antlr searches the “lib” directories for the required lexer.tokens
    • Antlr fails to find lexer.tokens because build/src/foo/ is not on its search path

    Desired behavior:

    • Antlr compiles lexer.g4
    • lexer.g4 has a relative path of src/foo/, so the generated tokens file is in build/src/foo/
    • Antlr adds the build/src/foo/ directory to its “lib” search path
    • Antlr compiles notamParser.g4
    • notamParser.g4 has a relative path of src/bar/, so the generated tokens file is in build/src/bar/
    • Antlr adds the build/src/bar/ directory to its “lib” search path
    • Antlr searches the “lib” directories for the required lexer.tokens
    • Antlr correctly finds the lexer.tokens file in build/src/foo/
    • Antlr compiles tafParser.g4
    • Antlr searches the “lib” directories for the required notamParser.tokens
    • Antlr correctly finds the notamParser.tokens file in build/src/bar/

For the last two cases, all that is required is to keep track of the actual output directory for every file (i.e. the directory specified with the -o switch, or the working directory, plus the correct subdirectory) and to search those directories.
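
A minimal sketch of that bookkeeping, simulated without touching the file system. It is an illustration of the proposal, not ANTLR's implementation; `actual_output_dir`, `compile_all` and the `vocab_of` mapping are hypothetical names, and the grammars are assumed to be passed in dependency order.

```python
from pathlib import PurePosixPath

def actual_output_dir(out_dir, grammar_path):
    # The -o directory plus the grammar's relative directory.
    return PurePosixPath(out_dir) / PurePosixPath(grammar_path).parent

def compile_all(out_dir, grammars, vocab_of):
    """Simulate one invocation.

    grammars: grammar paths in dependency order.
    vocab_of: grammar path -> name of the required token vocabulary.
    Returns grammar path -> tokens file that satisfied the lookup.
    """
    search_dirs = []    # every actual output directory used so far
    generated = set()   # tokens files "written" so far
    resolved = {}
    for grammar in grammars:
        vocab = vocab_of.get(grammar)
        if vocab is not None:
            hit = None
            for d in search_dirs:
                candidate = d / (vocab + ".tokens")
                if candidate in generated:
                    hit = candidate
                    break
            if hit is None:
                raise FileNotFoundError("cannot find tokens file for " + vocab)
            resolved[grammar] = hit
        # Record where this grammar's own output goes and remember the directory.
        d = actual_output_dir(out_dir, grammar)
        if d not in search_dirs:
            search_dirs.append(d)
        generated.add(d / (PurePosixPath(grammar).stem + ".tokens"))
    return resolved

# The third case: lexer and tafParser in src/foo, notamParser in src/bar.
resolved = compile_all(
    "build",
    ["src/foo/lexer.g4", "src/bar/notamParser.g4", "src/foo/tafParser.g4"],
    {"src/bar/notamParser.g4": "lexer", "src/foo/tafParser.g4": "notamParser"},
)
print(resolved["src/bar/notamParser.g4"])  # build/src/foo/lexer.tokens
print(resolved["src/foo/tafParser.g4"])    # build/src/bar/notamParser.tokens
```

With only build/ on the search path, as in the observed behavior, the first lookup would already raise, which corresponds to the reported error.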

About this issue

  • State: closed
  • Created 10 years ago
  • Comments: 21 (18 by maintainers)

Most upvoted comments

I’d like to increase priority for this problem, so that it finally gets solved. I’m also hit by it and I wonder why this is still open given that a simple

java -jar <path>xxx.jar test/TLexer.g4 test/TParser.g4

call fails already (error: error(160): TParser.g4:4:14: cannot find tokens file ./TLexer.tokens). It also fails when an output path is specified. To me it looks fundamentally broken! A working search strategy could be this:

  1. If no output path is specified, search for .tokens files where the grammar is. This is also the folder where the generated files end up in this case.
  2. If an output path is given, look there.
  3. If the file is not found, try the lib folder (as is done now).

Sounds simple, right? But this only works if ANTLR stops creating weird paths (e.g. by automatically combining output and grammar subpaths). Make the output path imperative: it is the ultimate place to look. If a package-like folder structure is required, one can easily construct the correct output path before invoking ANTLR.
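
The three-step search order could look roughly like this. It is a sketch of the suggestion in this comment, not of anything ANTLR currently does; `candidate_token_dirs` and `find_tokens` are hypothetical helpers, and the output path is taken literally (no grammar subpath is appended), as the comment asks for.

```python
import os
import posixpath

def candidate_token_dirs(grammar_path, out_dir=None, lib_dir=None):
    dirs = []
    if out_dir is None:
        # Step 1: no output path, so generated files sit next to the grammar.
        dirs.append(posixpath.dirname(grammar_path) or ".")
    else:
        # Step 2: an output path was given and is used as-is.
        dirs.append(out_dir)
    if lib_dir is not None:
        # Step 3: fall back to the lib folder.
        dirs.append(lib_dir)
    return dirs

def find_tokens(vocab, grammar_path, out_dir=None, lib_dir=None, exists=os.path.isfile):
    # Return the first existing candidate, or None if the vocabulary is missing.
    for d in candidate_token_dirs(grammar_path, out_dir, lib_dir):
        candidate = posixpath.join(d, vocab + ".tokens")
        if exists(candidate):
            return candidate
    return None

# The failing call from above: no -o, grammars in test/.
print(candidate_token_dirs("test/TParser.g4"))  # ['test']
```

The `exists` parameter is only there so the lookup can be tried against a fake file list; by default it checks the real file system.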