Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/10/04 17:22:20 UTC

[jira] [Assigned] (SPARK-17741) Grammar to parse top level and nested data fields separately

     [ https://issues.apache.org/jira/browse/SPARK-17741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-17741:
------------------------------------

    Assignee: Apache Spark

> Grammar to parse top level and nested data fields separately
> ------------------------------------------------------------
>
>                 Key: SPARK-17741
>                 URL: https://issues.apache.org/jira/browse/SPARK-17741
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Tejas Patil
>            Assignee: Apache Spark
>            Priority: Trivial
>
> Based on discussion over the dev list:
> {noformat}
> Is there any reason why Spark SQL supports "<column name>" ":" "<data type>" when specifying columns?
> e.g. sql("CREATE TABLE t1 (column1:INT)") works fine.
> Here is the relevant snippet in the grammar [0]:
> ```
> colType
>     : identifier ':'? dataType (COMMENT STRING)?
>     ;
> ```
> I do not see MySQL[1], Hive[2], Presto[3] or PostgreSQL[4] supporting ":" when specifying columns.
> They all use a space as the delimiter.
> [0] : https://github.com/apache/spark/blob/master/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4#L596
> [1] : http://dev.mysql.com/doc/refman/5.7/en/create-table.html
> [2] : https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
> [3] : https://prestodb.io/docs/current/sql/create-table.html
> [4] : https://www.postgresql.org/docs/9.1/static/sql-createtable.html
> {noformat}
> Herman's response:
> {noformat}
> This is because we use the same rule to parse top level and nested data fields. For example:
> create table tbl_x(
>   id bigint,
>   nested struct<col1:string,col2:string>
> )
> This shows both syntaxes. We should split this rule into a top-level rule and a nested rule.
> {noformat}
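
To illustrate the proposed split, here is a minimal sketch in Python of a hand-written parser with two separate rules: one for top-level columns (space-delimited, no ":") and one for nested struct fields (":" required). It is not Spark's ANTLR grammar or its actual fix. The names `Parser`, `top_level_column`, `nested_field` and `parse_columns` are hypothetical, and the sketch assumes simple type names only (no COMMENT clause, no parameterized types).

```python
import re

# Hypothetical tokenizer: identifiers/keywords plus the punctuation used here.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[<>:,()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, got {tok!r}")
        self.pos += 1
        return tok

    def identifier(self):
        tok = self.take()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        return tok

    def data_type(self):
        name = self.identifier().lower()
        if name == "struct":
            # struct '<' nested_field (',' nested_field)* '>'
            self.take("<")
            fields = [self.nested_field()]
            while self.peek() == ",":
                self.take(",")
                fields.append(self.nested_field())
            self.take(">")
            return ("struct", fields)
        return name

    def top_level_column(self):
        # Top-level rule: identifier dataType (a ':' is rejected here).
        return (self.identifier(), self.data_type())

    def nested_field(self):
        # Nested rule: identifier ':' dataType (the ':' is required here).
        name = self.identifier()
        self.take(":")
        return (name, self.data_type())

    def column_list(self):
        self.take("(")
        cols = [self.top_level_column()]
        while self.peek() == ",":
            self.take(",")
            cols.append(self.top_level_column())
        self.take(")")
        if self.peek() is not None:
            raise SyntaxError(f"unexpected trailing token {self.peek()!r}")
        return cols


def parse_columns(text):
    return Parser(text).column_list()
```

With the rules separated this way, a column list such as `(id bigint, nested struct<col1:string,col2:string>)` parses, while `(column1:INT)` is rejected with a `SyntaxError` because the top-level rule no longer accepts the optional ":".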



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
