Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2017/03/30 17:23:28 UTC

[GitHub] spark pull request #17484: [SPARK-20160] [SQL] Move ParquetConversions and O...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/17484

    [SPARK-20160] [SQL] Move ParquetConversions and OrcConversions Out Of HiveSessionCatalog

    ### What changes were proposed in this pull request?
    `ParquetConversions` and `OrcConversions` should be treated as regular `Analyzer` rules. They should not be part of `HiveSessionCatalog`.
    
    After moving these two rules out of `HiveSessionCatalog`, the next step is to clean up, rename, and move `HiveMetastoreCatalog`, because it is no longer related to the hive package.
    
    ### How was this patch tested?
    Existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark cleanup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17484.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17484
    
----
commit 32edccce852d18e1fcfa122e39040d002e2a994b
Author: Xiao Li <ga...@gmail.com>
Date:   2017-03-30T17:10:18Z

    cleanup

commit 17b4be48172f78cd6dc908f1ab751cf2697aa55c
Author: Xiao Li <ga...@gmail.com>
Date:   2017-03-30T17:12:02Z

    Merge remote-tracking branch 'upstream/master' into cleanup

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75401/




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    **[Test build #75401 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75401/testReport)** for PR 17484 at commit [`a9a388e`](https://github.com/apache/spark/commit/a9a388e42afc4965229b0dea5d8af4c0eba852b1).




[GitHub] spark pull request #17484: [SPARK-20160] [SQL] Move ParquetConversions and O...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17484#discussion_r109009831
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -170,6 +171,79 @@ object HiveAnalysis extends Rule[LogicalPlan] {
       }
     }
     
    +/**
    + * When scanning or writing to non-partitioned Metastore Parquet tables, convert them to Parquet
    + * data source relations for better performance.
    + */
    +case class ParquetConversions(sparkSession: SparkSession) extends Rule[LogicalPlan] {
    --- End diff --
    
    It might also be worth our while to create a common ancestor for these two classes.
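
As a rough illustration of the common-ancestor idea, here is a minimal, self-contained sketch. It does not use Spark's real classes: `Plan`, `MetastoreTable`, `DataSourceTable`, `Rule`, `FormatConversion`, and the two `...Sketch` subclasses are all hypothetical stand-ins, and the `enabled` flag is an assumed simplification of whatever configuration the real rules consult.

```scala
// Hypothetical, simplified stand-ins; these are NOT Spark's real classes.
sealed trait Plan
case class MetastoreTable(name: String, format: String) extends Plan
case class DataSourceTable(name: String, format: String) extends Plan

trait Rule {
  def apply(plan: Plan): Plan
}

// Shared ancestor: the conversion logic lives here once, and each
// subclass only states which storage format it handles.
abstract class FormatConversion(format: String, enabled: Boolean) extends Rule {
  override def apply(plan: Plan): Plan = plan match {
    case MetastoreTable(name, f) if enabled && f == format =>
      DataSourceTable(name, f)
    case other => other // leave everything else untouched
  }
}

class ParquetConversionSketch(enabled: Boolean)
  extends FormatConversion("parquet", enabled)

class OrcConversionSketch(enabled: Boolean)
  extends FormatConversion("orc", enabled)

object CommonAncestorDemo extends App {
  val rules = Seq(new ParquetConversionSketch(true), new OrcConversionSketch(false))
  val plan: Plan = MetastoreTable("t", "parquet")
  // Apply each rule in turn, as an analyzer batch would.
  println(rules.foldLeft(plan)((p, r) => r(p)))
}
```

With the shared logic in one place, the two subclasses differ only in a format name, which is also why folding them into a single parameterized rule is a plausible alternative.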




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/75397/




[GitHub] spark pull request #17484: [SPARK-20160] [SQL] Move ParquetConversions and O...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/17484




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    cc @cloud-fan @hvanhovell 




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    **[Test build #75397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75397/testReport)** for PR 17484 at commit [`17b4be4`](https://github.com/apache/spark/commit/17b4be48172f78cd6dc908f1ab751cf2697aa55c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #17484: [SPARK-20160] [SQL] Move ParquetConversions and O...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17484#discussion_r109009445
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -170,6 +171,79 @@ object HiveAnalysis extends Rule[LogicalPlan] {
       }
     }
     
    +/**
    + * When scanning or writing to non-partitioned Metastore Parquet tables, convert them to Parquet
    + * data source relations for better performance.
    + */
    +case class ParquetConversions(sparkSession: SparkSession) extends Rule[LogicalPlan] {
    --- End diff --
    
    How about we pass the `conf` and the `HiveSessionCatalog` as parameters? This should be doable with the builder, and it saves you from casting later on.
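
To make the trade-off concrete, here is a minimal, self-contained sketch contrasting the two constructor shapes. All types here (`Conf`, `SessionCatalog`, `HiveSessionCatalogSketch`, `Session`, and both rule classes) are hypothetical stand-ins rather than Spark's real API, and the `convertParquet` flag is an assumed placeholder for the relevant configuration.

```scala
// Hypothetical, simplified stand-ins; these are NOT Spark's real classes.
class Conf(val convertParquet: Boolean)

trait SessionCatalog

class HiveSessionCatalogSketch extends SessionCatalog {
  def convert(table: String): String = s"converted($table)"
}

class Session(val conf: Conf, val catalog: SessionCatalog)

// Shape 1: the rule receives the whole session, so it has to downcast
// the generic catalog to reach Hive-specific functionality.
case class ConversionFromSession(session: Session) {
  def apply(table: String): String = {
    val hiveCatalog = session.catalog.asInstanceOf[HiveSessionCatalogSketch]
    if (session.conf.convertParquet) hiveCatalog.convert(table) else table
  }
}

// Shape 2: whoever builds the rule passes precisely typed dependencies,
// so no cast is needed and a wrong catalog type fails at compile time.
case class ConversionWithDeps(conf: Conf, catalog: HiveSessionCatalogSketch) {
  def apply(table: String): String =
    if (conf.convertParquet) catalog.convert(table) else table
}

object ConstructorShapesDemo extends App {
  val conf = new Conf(convertParquet = true)
  val catalog = new HiveSessionCatalogSketch
  println(ConversionFromSession(new Session(conf, catalog)).apply("t"))
  println(ConversionWithDeps(conf, catalog).apply("t"))
}
```

Both shapes behave the same when the catalog really is the Hive one; the second simply moves the type check from a runtime cast to the call site that constructs the rule.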




[GitHub] spark pull request #17484: [SPARK-20160] [SQL] Move ParquetConversions and O...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17484#discussion_r109017923
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -170,6 +171,79 @@ object HiveAnalysis extends Rule[LogicalPlan] {
       }
     }
     
    +/**
    + * When scanning or writing to non-partitioned Metastore Parquet tables, convert them to Parquet
    + * data source relations for better performance.
    + */
    +case class ParquetConversions(sparkSession: SparkSession) extends Rule[LogicalPlan] {
    --- End diff --
    
    Great! Let me combine them. :)




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    **[Test build #75401 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75401/testReport)** for PR 17484 at commit [`a9a388e`](https://github.com/apache/spark/commit/a9a388e42afc4965229b0dea5d8af4c0eba852b1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class RelationConversions(`




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    Merged build finished. Test PASSed.




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    **[Test build #75397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/75397/testReport)** for PR 17484 at commit [`17b4be4`](https://github.com/apache/spark/commit/17b4be48172f78cd6dc908f1ab751cf2697aa55c).




[GitHub] spark issue #17484: [SPARK-20160] [SQL] Move ParquetConversions and OrcConve...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/17484
  
    LGTM, merging to master!

