You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2017/03/06 04:36:42 UTC

[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/17171

    [SPARK-19830] [SQL] Add parseTableSchema API to ParserInterface

    ### What changes were proposed in this pull request?
    
    Specifying the table schema in DDL formats is needed for different scenarios. For example, 
    - [specifying the schema in SQL function `from_json` using DDL formats](https://issues.apache.org/jira/browse/SPARK-19637), which is suggested by @marmbrus , 
    - [specifying the customized JDBC data types](https://github.com/apache/spark/pull/16209). 
    
    These two PRs need users to use the JSON format to specify the table schema. This is not user friendly. 
    
    This PR is to provide a `parseTableSchema` API in `ParserInterface`. 
    
    ### How was this patch tested?
    Added a test suite `TableSchemaParserSuite`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark parseDDLStmt

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17171.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17171
    
----
commit 50f74d26ec1bc37c5f5bea054da60a1910778e46
Author: Xiao Li <ga...@gmail.com>
Date:   2017-03-06T04:18:20Z

    add parseTableSchema API

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74390 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74390/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73965/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74635 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74635/testReport)** for PR 17171 at commit [`b18ae84`](https://github.com/apache/spark/commit/b18ae84c1f0485d929e58d217c1881d037721881).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    cc @hvanhovell @cloud-fan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74052 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74052/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r104367747
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1476,7 +1476,7 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging {
       /**
        * Create top level table schema.
        */
    -  protected def createSchema(ctx: ColTypeListContext): StructType = {
    +  def createSchema(ctx: ColTypeListContext): StructType = {
    --- End diff --
    
    We need to be a little bit careful here. I think we may need to add a rule similar to `singleDataType` to the grammar and expose its implementation. This to prevent us from silently dropping trailing input, or having a weird error on trailing input; it seems we get a somewhat vague error ATM...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r106155001
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableSchemaParserSuite.scala ---
    @@ -0,0 +1,88 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to You under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +*    http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.spark.sql.catalyst.parser
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.types._
    +
    +class TableSchemaParserSuite extends SparkFunSuite {
    +
    +  def parse(sql: String): StructType = CatalystSqlParser.parseTableSchema(sql)
    +
    +  def checkTableSchema(tableSchemaString: String, expectedDataType: DataType): Unit = {
    +    test(s"parse ${tableSchemaString.replace("\n", "")}") {
    +      assert(parse(tableSchemaString) === expectedDataType)
    +    }
    +  }
    +
    +  def intercept(sql: String): Unit =
    --- End diff --
    
    I'd like to name it `assertError`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74635/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r104556453
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParseDriver.scala ---
    @@ -49,6 +49,14 @@ abstract class AbstractSqlParser extends ParserInterface with Logging {
         astBuilder.visitSingleTableIdentifier(parser.singleTableIdentifier())
       }
     
    +  /**
    +   * Creates StructType for a given SQL string, which is a comma separated list of field
    +   * definitions which will preserve the correct Hive metadata.
    +   */
    +  override def parseTableSchema(sqlText: String): StructType = parse(sqlText) { parser =>
    +    StructType(astBuilder.visitColTypeList(parser.colTypeList()))
    --- End diff --
    
    @hvanhovell How about this? I am not sure whether I address your concern. Feel free to let me know if you have any concern


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74048 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74048/testReport)** for PR 17171 at commit [`3ec8483`](https://github.com/apache/spark/commit/3ec8483d8d7e8ba4be88041ebba73a2bda922186).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74048/testReport)** for PR 17171 at commit [`3ec8483`](https://github.com/apache/spark/commit/3ec8483d8d7e8ba4be88041ebba73a2bda922186).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74390 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74390/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r104366919
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/ParserInterface.scala ---
    @@ -33,4 +34,7 @@ trait ParserInterface {
     
       /** Creates TableIdentifier for a given SQL string. */
       def parseTableIdentifier(sqlText: String): TableIdentifier
    +
    +  /** Creates StructType for a given SQL string. */
    +  def parseTableSchema(sqlText: String): StructType
    --- End diff --
    
    So the difference between the `parseDataType` and the `parseTableSchema` functions is that the latter allows you to parse a comma separated list of field definitions which will preserve the correct Hive metadata? Could you document this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74052 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74052/testReport)** for PR 17171 at commit [`22b7db8`](https://github.com/apache/spark/commit/22b7db8bc013d5dcd23c3ef0f45483c47ea66b98).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74052/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #74635 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/74635/testReport)** for PR 17171 at commit [`b18ae84`](https://github.com/apache/spark/commit/b18ae84c1f0485d929e58d217c1881d037721881).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    thanks, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    ping @hvanhovell @cloud-fan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r104356142
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableSchemaParserSuite.scala ---
    @@ -0,0 +1,85 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to You under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +*    http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.spark.sql.catalyst.parser
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.types._
    +
    +class TableSchemaParserSuite extends SparkFunSuite {
    +
    +  def parse(sql: String): DataType = CatalystSqlParser.parseTableSchema(sql)
    +
    +  def checkTableSchema(tableSchemaString: String, expectedDataType: DataType): Unit = {
    +    test(s"parse ${tableSchemaString.replace("\n", "")}") {
    +      assert(parse(tableSchemaString) === expectedDataType)
    +    }
    +  }
    +
    +  checkTableSchema("a int", (new StructType).add("a", "int"))
    --- End diff --
    
    nit: `new StructType().add...`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #73965 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73965/testReport)** for PR 17171 at commit [`50f74d2`](https://github.com/apache/spark/commit/50f74d26ec1bc37c5f5bea054da60a1910778e46).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74390/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/74048/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17171: [SPARK-19830] [SQL] Add parseTableSchema API to ParserIn...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17171
  
    **[Test build #73965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73965/testReport)** for PR 17171 at commit [`50f74d2`](https://github.com/apache/spark/commit/50f74d26ec1bc37c5f5bea054da60a1910778e46).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r106154837
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableSchemaParserSuite.scala ---
    @@ -0,0 +1,88 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to You under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +*    http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.spark.sql.catalyst.parser
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.types._
    +
    +class TableSchemaParserSuite extends SparkFunSuite {
    +
    +  def parse(sql: String): StructType = CatalystSqlParser.parseTableSchema(sql)
    +
    +  def checkTableSchema(tableSchemaString: String, expectedDataType: DataType): Unit = {
    +    test(s"parse ${tableSchemaString.replace("\n", "")}") {
    --- End diff --
    
    seems `tableSchemaString` will never contains \n?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17171


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #17171: [SPARK-19830] [SQL] Add parseTableSchema API to P...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17171#discussion_r104365901
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableSchemaParserSuite.scala ---
    @@ -0,0 +1,85 @@
    +/*
    +* Licensed to the Apache Software Foundation (ASF) under one or more
    +* contributor license agreements.  See the NOTICE file distributed with
    +* this work for additional information regarding copyright ownership.
    +* The ASF licenses this file to You under the Apache License, Version 2.0
    +* (the "License"); you may not use this file except in compliance with
    +* the License.  You may obtain a copy of the License at
    +*
    +*    http://www.apache.org/licenses/LICENSE-2.0
    +*
    +* Unless required by applicable law or agreed to in writing, software
    +* distributed under the License is distributed on an "AS IS" BASIS,
    +* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +* See the License for the specific language governing permissions and
    +* limitations under the License.
    +*/
    +
    +package org.apache.spark.sql.catalyst.parser
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.types._
    +
    +class TableSchemaParserSuite extends SparkFunSuite {
    +
    +  def parse(sql: String): DataType = CatalystSqlParser.parseTableSchema(sql)
    +
    +  def checkTableSchema(tableSchemaString: String, expectedDataType: DataType): Unit = {
    +    test(s"parse ${tableSchemaString.replace("\n", "")}") {
    +      assert(parse(tableSchemaString) === expectedDataType)
    +    }
    +  }
    +
    +  checkTableSchema("a int", (new StructType).add("a", "int"))
    +  checkTableSchema("A int", (new StructType).add("A", "int"))
    +  checkTableSchema("a INT", (new StructType).add("a", "int"))
    +  checkTableSchema("`!@#$%.^&*()` string", (new StructType).add("!@#$%.^&*()", "string"))
    +  checkTableSchema("a int, b long", (new StructType).add("a", "int").add("b", "long"))
    +  checkTableSchema("a STRUCT<intType: int, ts:timestamp>",
    +    StructType(
    +      StructField("a", StructType(
    +        StructField("intType", IntegerType) ::
    +        StructField("ts", TimestampType) :: Nil)) :: Nil))
    +
    +  checkTableSchema(
    +    "a int comment 'test'",
    +    (new StructType).add("a", "int", nullable = true, "test"))
    +
    +  test("complex hive type") {
    +    val tableSchemaString =
    +      """
    +        |complexStructCol struct<
    +        |struct:struct<deciMal:DECimal, anotherDecimal:decimAL(5,2)>,
    +        |MAP:Map<timestamp, varchar(10)>,
    +        |arrAy:Array<double>,
    +        |anotherArray:Array<char(9)>>
    +      """.stripMargin.replace("\n", "")
    +
    +    val builder = new MetadataBuilder
    +    builder.putString(HIVE_TYPE_STRING,
    +      "struct<struct:struct<deciMal:decimal(10,0),anotherDecimal:decimal(5,2)>," +
    +        "MAP:map<timestamp,varchar(10)>,arrAy:array<double>,anotherArray:array<char(9)>>")
    +
    +    val expectedDataType =
    +      StructType(
    +        StructField("complexStructCol", StructType(
    +          StructField("struct",
    +            StructType(
    +              StructField("deciMal", DecimalType.USER_DEFAULT) ::
    +                StructField("anotherDecimal", DecimalType(5, 2)) :: Nil)) ::
    +            StructField("MAP", MapType(TimestampType, StringType)) ::
    +            StructField("arrAy", ArrayType(DoubleType)) ::
    +            StructField("anotherArray", ArrayType(StringType)) :: Nil),
    +          nullable = true,
    +          builder.build()) :: Nil)
    +
    +    assert(parse(tableSchemaString) === expectedDataType)
    +  }
    +
    +  test("illegal col types") {
    +    val e = intercept[ParseException] {
    +      CatalystSqlParser.parseTableSchema("a INT b long")
    +    }.getMessage
    +    assert(e.contains("mismatched input 'b' expecting {<EOF>, '(', ',', 'COMMENT'}"))
    --- End diff --
    
    I am not sure we should check this. This is an ANTLR error message.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org