Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2016/03/09 05:24:37 UTC

[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/11599

    [SPARK-13763] [SQL] Remove Project when its projectList is Empty

    #### What changes were proposed in this pull request?
    
    As shown in another PR (https://github.com/apache/spark/pull/11596), we use `SELECT 1` as a dummy table in SQL statements that require a table reference but do not depend on the table's contents. For example,
    
    ```SQL
    SELECT value FROM (SELECT 1) dummyTable LATERAL VIEW explode(array(1,2,3)) adTable AS value
    ```
    Before this PR, the optimized plan contains a useless `Project` after the Optimizer executes the `ColumnPruning` rule, as shown below:
    
    ```
    == Analyzed Logical Plan ==
    value: int
    Project [value#22]
    +- Generate explode(array(1, 2, 3)), true, false, Some(adtable), [value#22]
       +- SubqueryAlias dummyTable
          +- Project [1 AS 1#21]
             +- OneRowRelation$
    
    == Optimized Logical Plan ==
    Generate explode([1,2,3]), false, false, Some(adtable), [value#22]
    +- Project
       +- OneRowRelation$
    ```
    
    After the fix, the optimized plan no longer contains the useless `Project`, as shown below:
    ```
    == Optimized Logical Plan ==
    Generate explode([1,2,3]), false, false, Some(adtable), [value#22]
    +- OneRowRelation$
    ```
    
    #### How was this patch tested?
    
    Added a new unit test case to the suite `ColumnPruningSuite.scala`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark projectOneRowRelation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11599.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11599
    
----
commit 0fa21acd2614fbdf128561b34c72088008d62a05
Author: gatorsmile <ga...@gmail.com>
Date:   2016-03-09T04:06:39Z

    remove Project with an empty projectList

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55480958
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    But a `Project` with an empty projectList also has no output, right?




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55485741
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    How about we just move that case ahead? It seems it is always safe to apply `case p @ Project(projectList, child) if sameOutput(child.output, p.output) => child`.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194115203
  
    **[Test build #52724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52724/consoleFull)** for PR 11599 at commit [`a31b1b5`](https://github.com/apache/spark/commit/a31b1b588949f2f92981f7d1a7d04d6e1806ccd1).




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55486995
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    I thought we did it this way intentionally. I am not 100% sure whether we might hit any issues because of it. Let me try it and check whether any test cases fail.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55482347
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Let me respond to the original question by @cloud-fan.
    We will not see an empty Project if the child has more than one column. The empty Project only appears after column pruning. I am fine with adding a separate rule that only eliminates such Projects.





[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55471810
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ColumnPruningSuite.scala ---
    @@ -157,6 +157,14 @@ class ColumnPruningSuite extends PlanTest {
         comparePlans(Optimize.execute(query), expected)
       }
     
    +  test("Eliminate the Project with an empty projectList") {
    +    val input = OneRowRelation
    +    val query =
    +      Project(Literal(1).as("1") :: Nil, Project(Literal(1).as("1") :: Nil, input)).analyze
    --- End diff --
    
    Where do you test an empty projectList?




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its Ch...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194333608
  
    Done. The title and PR description are corrected. Thanks!




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194140806
  
    **[Test build #52724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52724/consoleFull)** for PR 11599 at commit [`a31b1b5`](https://github.com/apache/spark/commit/a31b1b588949f2f92981f7d1a7d04d6e1806ccd1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194141046
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55481341
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    How about this?
    ```scala
    case p @ Project(_, l: LeafNode) if ! l.isInstanceOf[OneRowRelation] => p
    ```




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its Ch...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194437578
  
    Thanks, merging to master.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55481375
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Actually, I just tried running this test without this patch, and it passes. @gatorsmile Can you verify it?




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194126222
  
    **[Test build #52721 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52721/consoleFull)** for PR 11599 at commit [`0fa21ac`](https://github.com/apache/spark/commit/0fa21acd2614fbdf128561b34c72088008d62a05).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194126377
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194229038
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194108308
  
    **[Test build #52721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52721/consoleFull)** for PR 11599 at commit [`0fa21ac`](https://github.com/apache/spark/commit/0fa21acd2614fbdf128561b34c72088008d62a05).




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194227986
  
    **[Test build #52740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52740/consoleFull)** for PR 11599 at commit [`68decd1`](https://github.com/apache/spark/commit/68decd1729eb7023dc1c24efa2e8fbef7011f698).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194302585
  
    nit: we need to update the title and description. Technically, we can't remove every `Project` with an empty projectList, only those whose child also outputs `Nil`.
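
The point above can be illustrated with a minimal sketch. The types below (`Plan`, `OneRow`, `Table`, `Project`) and both helper functions are hypothetical toy stand-ins written for this illustration, not Spark's Catalyst classes, and column names stand in for Catalyst attributes. The sketch only shows why a guard on "same output" is safer than a guard on "empty projectList".

```scala
// Toy stand-ins for plan nodes; these are NOT Spark's real classes.
object SameOutputSketch {
  sealed trait Plan { def output: Seq[String] }

  // A leaf that produces no columns, loosely modelled on OneRowRelation.
  case object OneRow extends Plan { val output: Seq[String] = Nil }

  // A leaf that produces the given columns.
  final case class Table(output: Seq[String]) extends Plan

  // A projection; its output is exactly its project list.
  final case class Project(projectList: Seq[String], child: Plan) extends Plan {
    def output: Seq[String] = projectList
  }

  // Variant discussed first: drop any Project whose list is empty.
  // This can widen the output when the child still has columns.
  def removeIfEmpty(plan: Plan): Plan = plan match {
    case Project(projectList, child) if projectList.isEmpty => child
    case other => other
  }

  // Variant suggested in review: drop a Project only when doing so
  // leaves the output unchanged.
  def removeIfSameOutput(plan: Plan): Plan = plan match {
    case p @ Project(_, child) if p.output == child.output => child
    case other => other
  }
}
```

In this toy model, `removeIfEmpty(Project(Nil, Table(Seq("a", "b"))))` returns the two-column table, so the output grows, while `removeIfSameOutput` leaves that plan alone and still collapses `Project(Nil, OneRow)` to `OneRow`.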




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55472012
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ColumnPruningSuite.scala ---
    @@ -157,6 +157,14 @@ class ColumnPruningSuite extends PlanTest {
         comparePlans(Optimize.execute(query), expected)
       }
     
    +  test("Eliminate the Project with an empty projectList") {
    +    val input = OneRowRelation
    +    val query =
    +      Project(Literal(1).as("1") :: Nil, Project(Literal(1).as("1") :: Nil, input)).analyze
    --- End diff --
    
    When running `Optimize.execute(query)`, the second Project's `projectList` is first pruned to empty. Then the second Project is removed.
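
The two steps described above can be sketched on a toy plan tree. Everything here (`Named`, `pruneColumns`, `removeNoOp`, and the node types) is a hypothetical simplification made up for this illustration; in particular it identifies columns by name, whereas Catalyst tracks attributes by expression ID, and it is not the actual `ColumnPruning` implementation.

```scala
// Toy model of "prune, then remove"; NOT Spark's real classes or rules.
object PruneThenRemoveSketch {
  sealed trait Plan { def output: Set[String] }

  // A leaf with no columns, loosely modelled on OneRowRelation.
  case object OneRow extends Plan { val output: Set[String] = Set.empty }

  // A named expression plus the child columns it reads; a literal reads none.
  final case class Named(name: String, references: Set[String])

  final case class Project(projectList: Seq[Named], child: Plan) extends Plan {
    def output: Set[String] = projectList.map(_.name).toSet
  }

  // Step 1: drop inner columns that the outer Project never reads.
  def pruneColumns(plan: Plan): Plan = plan match {
    case Project(outer, Project(inner, grandChild)) =>
      val needed = outer.flatMap(_.references).toSet
      Project(outer, Project(inner.filter(n => needed.contains(n.name)), grandChild))
    case other => other
  }

  // Step 2: bottom-up, remove any Project whose output equals its child's.
  def removeNoOp(plan: Plan): Plan = plan match {
    case Project(list, child) =>
      val newChild = removeNoOp(child)
      val rebuilt = Project(list, newChild)
      if (rebuilt.output == newChild.output) newChild else rebuilt
    case other => other
  }
}
```

With a literal `Named("1", Set.empty)` in both project lists, step 1 empties the inner list because the outer literal reads nothing from it, and step 2 then drops the inner `Project`, leaving a single `Project` over `OneRow`.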




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194229047
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52740/




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55480611
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Because `OneRowRelation` has no output, its output differs from that of its parent `Project`.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194190169
  
    **[Test build #52740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52740/consoleFull)** for PR 11599 at commit [`68decd1`](https://github.com/apache/spark/commit/68decd1729eb7023dc1c24efa2e8fbef7011f698).




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55480282
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    I'm thinking about the correctness of this rule. Actually, this is not column pruning; it adds more columns, as `child` may have more than one column.
    
    And why can't this rule work: `case p @ Project(projectList, child) if sameOutput(child.output, p.output) => child`?




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its Ch...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/11599




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55481539
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Never mind. It is because of another rule I added.





[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194143635
  
    @cloud-fan Added another two cases. Feel free to let me know if you want me to add more cases. Thanks!




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55481255
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    ```scala
        case p @ Project(_, l: LeafNode) => p
    ```
    
    There is another case above it. Thus, it will stop here.
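
The ordering effect mentioned above follows from how Scala pattern matching works: cases are tried top to bottom and the first match wins. A minimal sketch, using hypothetical toy types (`Plan`, `LeafNode`, `OneRow`, `Project`) rather than Spark's real classes, shows how an earlier leaf case can keep a later empty-projectList case from ever firing:

```scala
// Toy stand-ins for plan nodes; these are NOT Spark's real classes.
object CaseOrderSketch {
  sealed trait Plan
  sealed trait LeafNode extends Plan
  case object OneRow extends LeafNode
  final case class Project(projectList: Seq[String], child: Plan) extends Plan

  // Leaf case first: a Project over a leaf is returned unchanged, so the
  // empty-projectList case below it is never reached for such a plan.
  def leafCaseFirst(plan: Plan): Plan = plan match {
    case p @ Project(_, _: LeafNode) => p
    case Project(projectList, child) if projectList.isEmpty => child
    case other => other
  }

  // Empty-projectList case first: the same input is now collapsed.
  def emptyCaseFirst(plan: Plan): Plan = plan match {
    case Project(projectList, child) if projectList.isEmpty => child
    case p @ Project(_, _: LeafNode) => p
    case other => other
  }
}
```

For the input `Project(Nil, OneRow)`, `leafCaseFirst` returns the plan untouched while `emptyCaseFirst` returns `OneRow`, which is why the position of the new case within the rule matters.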
    





[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55472267
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ColumnPruningSuite.scala ---
    @@ -157,6 +157,14 @@ class ColumnPruningSuite extends PlanTest {
         comparePlans(Optimize.execute(query), expected)
       }
     
    +  test("Eliminate the Project with an empty projectList") {
    +    val input = OneRowRelation
    +    val query =
    +      Project(Literal(1).as("1") :: Nil, Project(Literal(1).as("1") :: Nil, input)).analyze
    --- End diff --
    
    Let me add a case with an empty projectList too.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55481794
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Yea. As I posted before.




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194126378
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52721/




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194141048
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52724/




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55481593
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Without the fix, I hit the error:
    
    ```
    == FAIL: Plans do not match ===
     Project [1 AS 1#0]      Project [1 AS 1#0]
    !+- Project              +- OneRowRelation$
    !   +- OneRowRelation$   
             
    ScalaTestFailureLocation: org.apache.spark.sql.catalyst.plans.PlanTest at (PlanTest.scala:59)
    org.scalatest.exceptions.TestFailedException: 
    == FAIL: Plans do not match ===
     Project [1 AS 1#0]      Project [1 AS 1#0]
    !+- Project              +- OneRowRelation$
    !   +- OneRowRelation$   
    ```




[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11599#discussion_r55482082
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -380,6 +380,9 @@ object ColumnPruning extends Rule[LogicalPlan] {
             p
           }
     
    +    // Eliminate the Projects with empty projectList
    +    case p @ Project(projectList, child) if projectList.isEmpty => child
    --- End diff --
    
    Thanks @viirya @cloud-fan !
    
    I am not sure which approach is better. The alternative would be:
    ```scala
    case p @ Project(_, l: LeafNode) if !l.isInstanceOf[OneRowRelation] => p
    ```
    My concern is that this line looks hackier than the fix in the current PR.





[GitHub] spark pull request: [SPARK-13763] [SQL] Remove Project when its pr...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/11599#issuecomment-194108085
  
    cc @marmbrus @cloud-fan @dilipbiswal 

