Posted to reviews@spark.apache.org by hvanhovell <gi...@git.apache.org> on 2017/02/22 23:13:15 UTC

[GitHub] spark pull request #17030: [SPARK-19459] Support for nested char/varchar fie...

GitHub user hvanhovell opened a pull request:

    https://github.com/apache/spark/pull/17030

    [SPARK-19459] Support for nested char/varchar fields in ORC

    ## What changes were proposed in this pull request?
    This PR is a small follow-up on https://github.com/apache/spark/pull/16804. This PR also adds support for nested char/varchar fields.
    
    ## How was this patch tested?
    I have added a regression test to the OrcSourceSuite.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark SPARK-19459-follow-up

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17030.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17030
    
----
commit e832ce68e1717cd5b8f2f8e25cf7b5e181abedaf
Author: Herman van Hovell <hv...@databricks.com>
Date:   2017-02-22T23:10:28Z

    Allow for nested char/varchar fields in ORC

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request #17030: [SPARK-19459] Support for nested char/varchar fie...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17030




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73301/
    Test FAILed.




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    **[Test build #73301 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73301/testReport)** for PR 17030 at commit [`e832ce6`](https://github.com/apache/spark/commit/e832ce68e1717cd5b8f2f8e25cf7b5e181abedaf).




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    Thanks, merging to master!
    
    It conflicts with branch 2.1; can you submit a new PR?




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    **[Test build #73346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73346/testReport)** for PR 17030 at commit [`fa2e0cc`](https://github.com/apache/spark/commit/fa2e0cccee3143e082c6c1caeb9436c8cd6feea4).




[GitHub] spark pull request #17030: [SPARK-19459] Support for nested char/varchar fie...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17030#discussion_r102617893
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1481,19 +1489,12 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging {
           builder.putString("comment", string(STRING))
         }
         // Add Hive type string to metadata.
    -    dataType match {
    -      case p: PrimitiveDataTypeContext =>
    -        p.identifier.getText.toLowerCase match {
    -          case "varchar" | "char" =>
    -            builder.putString(HIVE_TYPE_STRING, dataType.getText.toLowerCase)
    -          case _ =>
    -        }
    -      case _ =>
    -    }
    +    val rawDataType = typedVisit[DataType](ctx.dataType)
    +    builder.putString(HIVE_TYPE_STRING, rawDataType.catalogString)
    --- End diff --
    
    How about we rename `HIVE_TYPE_STRING` to `ORIGINAL_TYPE_STRING`?




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    **[Test build #73346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73346/testReport)** for PR 17030 at commit [`fa2e0cc`](https://github.com/apache/spark/commit/fa2e0cccee3143e082c6c1caeb9436c8cd6feea4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #17030: [SPARK-19459] Support for nested char/varchar fie...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17030#discussion_r102618263
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ---
    @@ -1481,19 +1489,12 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with Logging {
           builder.putString("comment", string(STRING))
         }
         // Add Hive type string to metadata.
    -    dataType match {
    -      case p: PrimitiveDataTypeContext =>
    -        p.identifier.getText.toLowerCase match {
    -          case "varchar" | "char" =>
    -            builder.putString(HIVE_TYPE_STRING, dataType.getText.toLowerCase)
    -          case _ =>
    -        }
    -      case _ =>
    -    }
    +    val rawDataType = typedVisit[DataType](ctx.dataType)
    +    builder.putString(HIVE_TYPE_STRING, rawDataType.catalogString)
    --- End diff --
    
    The tests failed because we always put the Hive type string in the metadata. We should only do that when the type contains a char/varchar type.
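
As a rough illustration of the suggestion above, the check for "contains a char/varchar type" has to recurse into arrays, maps and structs for nested fields to be covered. The sketch below is not the code from this PR and does not use Spark's classes: `SqlType`, `CharT`, `VarcharT`, `containsCharOrVarchar`, `typeString` and `hiveTypeMetadata` are hypothetical names on a simplified stand-in type tree, and the string format is only loosely modelled on a catalog type string.

```scala
// Simplified stand-in for a SQL type tree; these are NOT Spark's own classes.
sealed trait SqlType
case object StringT extends SqlType
case object IntT extends SqlType
final case class CharT(length: Int) extends SqlType
final case class VarcharT(length: Int) extends SqlType
final case class ArrayT(element: SqlType) extends SqlType
final case class MapT(key: SqlType, value: SqlType) extends SqlType
final case class StructT(fields: List[(String, SqlType)]) extends SqlType

object HiveTypeStringSketch {
  // True if the type is char/varchar itself or contains one at any nesting depth.
  def containsCharOrVarchar(t: SqlType): Boolean = t match {
    case CharT(_) | VarcharT(_) => true
    case ArrayT(e)              => containsCharOrVarchar(e)
    case MapT(k, v)             => containsCharOrVarchar(k) || containsCharOrVarchar(v)
    case StructT(fs)            => fs.exists { case (_, ft) => containsCharOrVarchar(ft) }
    case _                      => false
  }

  // Render the original type as a string (hypothetical format for this sketch).
  def typeString(t: SqlType): String = t match {
    case StringT     => "string"
    case IntT        => "int"
    case CharT(n)    => s"char($n)"
    case VarcharT(n) => s"varchar($n)"
    case ArrayT(e)   => s"array<${typeString(e)}>"
    case MapT(k, v)  => s"map<${typeString(k)},${typeString(v)}>"
    case StructT(fs) =>
      fs.map { case (name, ft) => s"$name:${typeString(ft)}" }.mkString("struct<", ",", ">")
  }

  // Only produce the metadata value when a char/varchar is present somewhere in the type.
  def hiveTypeMetadata(t: SqlType): Option[String] =
    if (containsCharOrVarchar(t)) Some(typeString(t)) else None
}
```

With this sketch, `hiveTypeMetadata(IntT)` yields `None`, while `hiveTypeMetadata(StructT(List("a" -> ArrayT(VarcharT(10)))))` yields `Some("struct<a:array<varchar(10)>>")`, so plain columns carry no extra metadata and only columns with a (possibly nested) char/varchar keep their original type string.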




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    cc @cloud-fan 




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73346/
    Test PASSed.




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17030: [SPARK-19459] Support for nested char/varchar fields in ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17030
  
    **[Test build #73301 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73301/testReport)** for PR 17030 at commit [`e832ce6`](https://github.com/apache/spark/commit/e832ce68e1717cd5b8f2f8e25cf7b5e181abedaf).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `sealed abstract class HiveStringType extends AtomicType `
      * `case class CharType(length: Int) extends HiveStringType `
      * `case class VarcharType(length: Int) extends HiveStringType `

