You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2016/07/05 07:21:39 UTC

[GitHub] spark pull request #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRe...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/14053

    [SPARK-16374] [SQL] Remove Alias from MetastoreRelation and SimpleCatalogRelation

    #### What changes were proposed in this pull request?
    Different from the other leaf nodes, `MetastoreRelation` and `SimpleCatalogRelation` have a pre-defined `alias`, which is used to change the qualifier of the node. However, based on the existing alias handling, alias should be put in `SubqueryAlias`. 
    
    This PR is to separate alias handling from `MetastoreRelation` and `SimpleCatalogRelation` to make it consistent with the other nodes. It simplifies the signature and conversion to a `BaseRelation`.
    
    For example, below is an example query for `MetastoreRelation`,  which is converted to a `LogicalRelation`:
    ```SQL
    SELECT tmp.a + 1 FROM test_parquet_ctas tmp WHERE tmp.a > 2
    ```
    
    Before changes, the analyzed plan is
    ```
    == Analyzed Logical Plan ==
    (a + 1): int
    Project [(a#951 + 1) AS (a + 1)#952]
    +- Filter (a#951 > 2)
       +- SubqueryAlias tmp
          +- Relation[a#951] parquet
    ```
    After changes, the analyzed plan becomes
    ```
    == Analyzed Logical Plan ==
    (a + 1): int
    Project [(a#951 + 1) AS (a + 1)#952]
    +- Filter (a#951 > 2)
       +- SubqueryAlias tmp
          +- SubqueryAlias test_parquet_ctas
             +- Relation[a#951] parquet
    ```
    
    **Note: the optimized plans are the same.**
    
    For `SimpleCatalogRelation`, the existing code always generates two Subqueries. Thus, no change is needed.
    
    #### How was this patch tested?
    Added test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark removeAliasFromMetastoreRelation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14053.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14053
    
----
commit f753a6314d234fba9b313de37b1c3708e3a0f784
Author: gatorsmile <ga...@gmail.com>
Date:   2016-07-04T22:06:56Z

    Merge remote-tracking branch 'upstream/master' into removeAliasFromMetastoreRelation

commit fcd4940ce4c7b3d3ea59ada0c0cc0931a8c0ac23
Author: gatorsmile <ga...@gmail.com>
Date:   2016-07-05T00:34:11Z

    remove alias

commit 003966e9318a003e2fc57f9f5f42513607fdf7d6
Author: gatorsmile <ga...@gmail.com>
Date:   2016-07-05T00:46:02Z

    added a test case

commit f09056c093f847b85fe1194c71a0f9f5f9bf3685
Author: gatorsmile <ga...@gmail.com>
Date:   2016-07-05T07:16:25Z

    remove alias from SimpleCatalogRelation

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    cc @cloud-fan Do you think this removal is OK? Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    **[Test build #61747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61747/consoleFull)** for PR 14053 at commit [`f09056c`](https://github.com/apache/spark/commit/f09056c093f847b85fe1194c71a0f9f5f9bf3685).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    **[Test build #61824 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61824/consoleFull)** for PR 14053 at commit [`d54b605`](https://github.com/apache/spark/commit/d54b605bc7ef6ba4d1ed14bfcc65fef075bb5e63).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61824/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14053#discussion_r69673940
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
    @@ -716,6 +717,39 @@ class ParquetSourceSuite extends ParquetPartitioningTest {
         }
       }
     
    +  test("Alias used in converted data source tables") {
    --- End diff --
    
    Yeah, I can remove it. 
    
    You are right. That example is just for showing the outcome of the code changes. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRe...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14053#discussion_r69516779
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
    @@ -375,7 +377,7 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
             // Read path
             case relation: MetastoreRelation if shouldConvertMetastoreParquet(relation) =>
               val parquetRelation = convertToParquetRelation(relation)
    -          SubqueryAlias(relation.alias.getOrElse(relation.tableName), parquetRelation)
    +          SubqueryAlias(relation.tableName, parquetRelation)
    --- End diff --
    
    The added test case is for verifying the code changes here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    **[Test build #61820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61820/consoleFull)** for PR 14053 at commit [`d54b605`](https://github.com/apache/spark/commit/d54b605bc7ef6ba4d1ed14bfcc65fef075bb5e63).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61820/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61747/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    thanks, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRe...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/14053


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    **[Test build #61820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61820/consoleFull)** for PR 14053 at commit [`d54b605`](https://github.com/apache/spark/commit/d54b605bc7ef6ba4d1ed14bfcc65fef075bb5e63).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRe...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14053#discussion_r69672781
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala ---
    @@ -716,6 +717,39 @@ class ParquetSourceSuite extends ParquetPartitioningTest {
         }
       }
     
    +  test("Alias used in converted data source tables") {
    --- End diff --
    
    Do we really need a test for it? This PR looks like a cleanup to me, and making one `Subquery` into two is not our target, making the code cleaner is.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    **[Test build #61747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61747/consoleFull)** for PR 14053 at commit [`f09056c`](https://github.com/apache/spark/commit/f09056c093f847b85fe1194c71a0f9f5f9bf3685).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #14053: [SPARK-16374] [SQL] Remove Alias from MetastoreRelation ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/14053
  
    **[Test build #61824 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61824/consoleFull)** for PR 14053 at commit [`d54b605`](https://github.com/apache/spark/commit/d54b605bc7ef6ba4d1ed14bfcc65fef075bb5e63).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org