Posted to issues@spark.apache.org by "Xiao Li (Jira)" <ji...@apache.org> on 2022/04/05 15:40:00 UTC

[jira] [Resolved] (SPARK-38384) Improve error messages of ParseException from ANTLR

     [ https://issues.apache.org/jira/browse/SPARK-38384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li resolved SPARK-38384.
-----------------------------
    Fix Version/s: 3.3.0
         Assignee: Xinyi Yu
       Resolution: Fixed

> Improve error messages of ParseException from ANTLR
> ---------------------------------------------------
>
>                 Key: SPARK-38384
>                 URL: https://issues.apache.org/jira/browse/SPARK-38384
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Xinyi Yu
>            Assignee: Xinyi Yu
>            Priority: Major
>             Fix For: 3.3.0
>
>
> This task is intended to improve the error messages of ParseException that come directly from ANTLR.
> h2. Bad Error Messages
> Many error messages defined in ANTLR are not user-friendly. For example,
> {code:java}
> spark.sql("sel 1")
>  
> ParseException: 
> mismatched input 'sel' expecting {'(', 'APPLY', 'CONVERT', 'COPY', 'OPTIMIZE', 'RESTORE', 'ADD', 'ALTER', 'ANALYZE', 'CACHE', 'CLEAR', 'COMMENT', 'COMMIT', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DFS', 'DROP', 'EXPLAIN', 'EXPORT', 'FROM', 'GRANT', 'IMPORT', 'INSERT', 'LIST', 'LOAD', 'LOCK', 'MAP', 'MERGE', 'MSCK', 'REDUCE', 'REFRESH', 'REPLACE', 'RESET', 'REVOKE', 'ROLLBACK', 'SELECT', 'SET', 'SHOW', 'START', 'SYNC', 'TABLE', 'TRUNCATE', 'UNCACHE', 'UNLOCK', 'UPDATE', 'USE', 'VALUES', 'WITH'}(line 1, pos 0)
>  
> == SQL ==
> sel 1
> ^^^ {code}
> Judged against the [Spark Error Message Guidelines|https://spark.apache.org/error-message-guidelines.html], this message is vague and hard to follow: it states the ‘What’, but is unclear on the ‘Why’ and the ‘How’.
> Or,
> {code:java}
> spark.sql("") // empty query
> ParseException: 
> mismatched input '<EOF>' expecting {'(', 'CONVERT', 'COPY', 'OPTIMIZE', 'RESTORE', 'ADD', 'ALTER', 'ANALYZE', 'CACHE', 'CLEAR', 'COMMENT', 'COMMIT', 'CREATE', 'DELETE', 'DESC', 'DESCRIBE', 'DFS', 'DROP', 'EXPLAIN', 'EXPORT', 'FROM', 'GRANT', 'IMPORT', 'INSERT', 'LIST', 'LOAD', 'LOCK', 'MAP', 'MERGE', 'MSCK', 'REDUCE', 'REFRESH', 'REPLACE', 'RESET', 'REVOKE', 'ROLLBACK', 'SELECT', 'SET', 'SHOW', 'START', 'TABLE', 'TRUNCATE', 'UNCACHE', 'UNLOCK', 'UPDATE', 'USE', 'VALUES', 'WITH'}(line 1, pos 0)
> == SQL ==
> ^^^ {code}
> Instead of simply telling users that the query is empty, it outputs a long message that even surfaces the jargon '<EOF>'.
> h2. Where do these error messages come from?
> There has been much work on improving ParseException in general (see [https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] for example), but many of the above error messages are defined in ANTLR and pass through Spark unmodified.
> When such an error is encountered, ANTLR notifies the exception listener with a message like ‘mismatched input {} expecting {}’. The Spark exception listener _appends_ the line and position to that message, along with the problematic SQL text and a ‘^^^’ marker under the error position, and then throws a ParseException with the combined message. Spark does not modify the error message it receives from ANTLR.
> This task focuses on those error messages from ANTLR.
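> For illustration, a minimal, self-contained sketch of that flow is shown below (the helper name {{formatParseError}} is hypothetical, not actual Spark code): ANTLR supplies the raw message, and the listener appends the position, the SQL text, and the '^^^' marker.
> {code:scala}
> // Illustrative sketch only: mimics how the final ParseException message is
> // assembled from ANTLR's raw message, the error position, and the SQL text.
> def formatParseError(antlrMsg: String, line: Int, pos: Int, sqlText: String): String = {
>   val marker = " " * pos + "^^^" // '^^^' placed under the offending position
>   s"$antlrMsg(line $line, pos $pos)\n\n== SQL ==\n$sqlText\n$marker"
> }
>
> formatParseError("mismatched input 'sel' expecting {...}", 1, 0, "sel 1")
> // mismatched input 'sel' expecting {...}(line 1, pos 0)
> //
> // == SQL ==
> // sel 1
> // ^^^
> {code}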
> h2. Goals
>  # Improve the error messages of ParseException that come from ANTLR, and modify all affected test cases accordingly.
>  # Make sure the new error message framework is applied in this change (see the rough sketch after this list).
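> As a rough, hypothetical illustration of goal 2 (the error-class names and message templates below are invented for this sketch and are not the final ones), raw ANTLR messages could be routed through named error classes with parameterized, user-facing templates:
> {code:scala}
> // Hypothetical sketch only: error-class names and templates are illustrative.
> sealed trait ParseErrorClass { def template: String }
> case object ParseEmptyStatement extends ParseErrorClass {
>   val template = "Syntax error: the statement is empty."
> }
> case object ParseSyntaxError extends ParseErrorClass {
>   val template = "Syntax error at or near %s."
> }
>
> // Pick a clearer, classified message based on the offending token.
> def toUserMessage(offendingToken: String): String =
>   if (offendingToken == "<EOF>") ParseEmptyStatement.template
>   else ParseSyntaxError.template.format(s"'$offendingToken'")
>
> toUserMessage("sel")   // "Syntax error at or near 'sel'."
> toUserMessage("<EOF>") // "Syntax error: the statement is empty."
> {code}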
> h2. Proposed Error Message Changes
> The proposed changes are listed in the sub-tasks, each with concrete before-and-after cases. See the description of each sub-task for more details.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org