Posted to issues@spark.apache.org by "Dilip Biswal (JIRA)" <ji...@apache.org> on 2019/03/16 02:29:00 UTC

[jira] [Commented] (SPARK-27170) Better error message for syntax error with extraneous comma in the SQL parser

    [ https://issues.apache.org/jira/browse/SPARK-27170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794096#comment-16794096 ] 

Dilip Biswal commented on SPARK-27170:
--------------------------------------

[~wyukawa]

Unfortunately, Spark turns off ANSI-mode parsing by default, which means we are very lenient about the list of acceptable identifiers. Today it is possible for us to create a table with the column name 'distinct', i.e.
{code:sql}
create table foo (distinct int, c2 int, c3 int)
{code}
So in this mode, the query you are trying is valid, i.e.
{code:sql}
 select distinct, c2, c3 from foo limit 3
{code}
and the error you are seeing is probably justified: it tells us that the table we are
 querying does not have `distinct` as a column.
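
To make the distinction concrete, here is a toy Python sketch (not Spark's actual parser; all names are hypothetical) of why a lenient identifier rule accepts the keyword 'distinct' as a column name while an ANSI-style mode with reserved keywords rejects it:

```python
# Toy illustration of lenient vs. ANSI-style identifier acceptance.
# A small, hypothetical reserved-word set; Spark's real list is much larger.
RESERVED = {"select", "distinct", "from", "limit", "create", "table"}

def parse_column_name(token: str, ansi_mode: bool) -> str:
    """Accept `token` as a column identifier, or raise a parse error."""
    if ansi_mode and token.lower() in RESERVED:
        # Mimics the style of the parser error, not its exact wording.
        raise SyntaxError(f"no viable alternative at input '{token}'")
    return token

# Lenient (default) mode: 'distinct' is a legal column name.
print(parse_column_name("distinct", ansi_mode=False))  # distinct

# ANSI mode: the same token is a reserved keyword and is rejected.
try:
    parse_column_name("distinct", ansi_mode=True)
except SyntaxError as e:
    print(e)
```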

Recently [~maropu] added support for ANSI-mode parsing. If we turn this on, we get a
 parser error similar (though not identical) to what you would expect to see.
{code:sql}
spark-sql> create table foobar(distinct int, c2 int);
Error in query: 
no viable alternative at input 'distinct'(line 1, pos 20)

== SQL ==
create table foobar(distinct int, c2 int)
--------------------^^^
{code}
{code:sql}
spark-sql> select distinct, c1 from foo;
Error in query: 
no viable alternative at input 'distinct'(line 1, pos 7)

== SQL ==
select distinct, c1 from foo
-------^^^
{code}
So I am not sure whether we can do anything about error reporting in the default
 parsing mode (i.e., when ANSI parsing mode is off).
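
For context, producing a "line 2:5" style message like Presto's mostly comes down to mapping the offending token's character offset back to a line and column. A minimal sketch of that mapping (a hypothetical helper, not Spark code):

```python
def line_and_col(sql: str, offset: int) -> tuple:
    """Convert a 0-based character offset in `sql` into (line, column), 1-based."""
    line = sql.count("\n", 0, offset) + 1
    # Start of the current line: one past the previous newline (or 0 if none).
    line_start = sql.rfind("\n", 0, offset) + 1
    col = offset - line_start + 1
    return line, col

sql = "SELECT distinct\n    ,a\n    ,b FROM t"
# Offset of the first extraneous comma (on line 2 of the statement):
offset = sql.index(",")
print(line_and_col(sql, offset))  # (2, 5)
```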

cc'ing [~smilegator] [~maropu] [~cloud_fan] in case they have any further insights into this.

> Better error message for syntax error with extraneous comma in the SQL parser
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-27170
>                 URL: https://issues.apache.org/jira/browse/SPARK-27170
>             Project: Spark
>          Issue Type: Wish
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Wataru Yukawa
>            Priority: Minor
>
> [~maropu], [~smilegator]
> It was great to talk with you at Hadoop / Spark Conference Japan 2019.
> Thanks in advance!
> I filed this issue, which we discussed at that time.
> We sometimes write SQL with a syntax error, such as an extraneous comma, by mistake.
> For example, here is SQL with an extraneous comma in line 2.
> {code}
> SELECT distinct
> ,a
> ,b
> ,c
> FROM ... LIMIT 100
> {code}
> Spark 2.4.0 produces an error message, but I find it a little hard to understand because the line number is wrong.
> {code}
> cannot resolve '`distinct`' given input columns: [...]; line 1 pos 7;
> 'GlobalLimit 100
> +- 'LocalLimit 100
>    +- 'Project ['distinct, ...]
>       +- Filter (...)
>          +- SubqueryAlias ...
>             +- HiveTableRelation ...
> {code}
> By the way, here is the error message from prestosql 305 for the same SQL.
> The line number is correct, and I think the error message is better than Spark SQL's.
> {code}
> line 2:5: mismatched input ','. Expecting: '*', <expression>, <identifier>
> {code}
> It would be great if the Spark SQL error message improved.
> Thanks.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
