Posted to issues@spark.apache.org by "Zhen Li (Jira)" <ji...@apache.org> on 2022/03/01 20:03:00 UTC

[jira] [Created] (SPARK-38378) ANTLR grammar definition in separate Parser and Lexer files

Zhen Li created SPARK-38378:
-------------------------------

             Summary: ANTLR grammar definition in separate Parser and Lexer files
                 Key: SPARK-38378
                 URL: https://issues.apache.org/jira/browse/SPARK-38378
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.0, 3.2.2
            Reporter: Zhen Li


This issue proposes splitting the ANTLR grammar defined in `SqlBase.g4` into a separate parser grammar, `SqlBaseParser.g4`, and lexer grammar, `SqlBaseLexer.g4`.

Benefits:

*Gain more flexibility when implementing new SQL features*

The current ANTLR grammar is defined as a combined (mixed lexer and parser) grammar in the `SqlBase.g4` file.

Separating the lexer and parser lets us use the full power of ANTLR parser and lexer grammars, for example lexer modes, which ANTLR only allows in a standalone lexer grammar. This gives us more flexibility when implementing new SQL features.
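
As a rough illustration of what the split could look like, here is a minimal sketch of the two files. The rules, token names, and the `HINT_MODE` mode are hypothetical and are not taken from Spark's actual grammar; only the file and grammar names come from this proposal. The lexer file uses a lexer mode to tokenize the inside of a hint comment differently from the rest of the input:

```antlr
// SqlBaseLexer.g4 (hypothetical excerpt, not Spark's real rules)
lexer grammar SqlBaseLexer;

SELECT     : 'SELECT';
IDENTIFIER : [a-zA-Z_] [a-zA-Z_0-9]*;
WS         : [ \t\r\n]+ -> skip;

// Seeing '/*+' switches the lexer into a dedicated mode.
HINT_START : '/*+' -> pushMode(HINT_MODE);

// Rules below this line are only active while in HINT_MODE.
mode HINT_MODE;
HINT_END   : '*/' -> popMode;
HINT_NAME  : [a-zA-Z_]+;
HINT_WS    : [ \t\r\n]+ -> skip;
```

The parser file then imports the token vocabulary from the lexer and refers to tokens by name only:

```antlr
// SqlBaseParser.g4 (hypothetical excerpt, not Spark's real rules)
parser grammar SqlBaseParser;

// Use the token types generated from SqlBaseLexer.g4.
options { tokenVocab = SqlBaseLexer; }

query : SELECT hint? IDENTIFIER EOF;
hint  : HINT_START HINT_NAME+ HINT_END;
```

The `mode` section in this sketch is the part that would not be accepted inside a combined grammar.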

*The code is cleaner.*

Keeping the parser and lexer in different files also makes it explicit which rules belong to the parser and which belong to the lexer.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org