Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2016/06/25 21:07:17 UTC

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/13906

    [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

    ## What changes were proposed in this pull request?
    
    This PR adds a new logical optimizer, `CollapseEmptyPlan`, that collapses logical plans consisting only of empty LocalRelations. The only exception is aggregation: for aggregation plans, only simple cases are considered for this optimization.
    
    **Before**
    ```scala
    scala> sql("select a from values (1,2) T(a,b) where 1=0 group by a,b having a>1 order by a,b").explain
    == Physical Plan ==
    *Project [a#11]
    +- *Sort [a#11 ASC, b#12 ASC], true, 0
       +- Exchange rangepartitioning(a#11 ASC, b#12 ASC, 200)
          +- *HashAggregate(keys=[a#11, b#12], functions=[])
             +- Exchange hashpartitioning(a#11, b#12, 200)
                +- *HashAggregate(keys=[a#11, b#12], functions=[])
                   +- LocalTableScan <empty>, [a#11, b#12]
    ```
    
    **After**
    ```scala
    scala> sql("select a from values (1,2) T(a,b) where 1=0 group by a,b having a>1 order by a,b").explain
    == Physical Plan ==
    LocalTableScan <empty>, [a#0]
    ```
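    
    For reference, a minimal standalone sketch of the rule's core idea (simplified; the actual rule in this PR also handles `Union`, `Join`, and the aggregate-expression check):
    
    ```scala
    import org.apache.spark.sql.catalyst.plans.logical._
    import org.apache.spark.sql.catalyst.rules.Rule
    
    object CollapseEmptyPlanSketch extends Rule[LogicalPlan] {
      // An empty LocalRelation produces no rows.
      private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
        case p: LocalRelation => p.data.isEmpty
        case _ => false
      }
    
      def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
        // Row-preserving unary nodes over an empty relation can only produce empty output.
        case p @ (_: Project | _: Filter | _: Sort) if p.children.forall(isEmptyLocalRelation) =>
          LocalRelation(p.output, data = Seq.empty)
      }
    }
    ```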
    
    ## How was this patch tested?
    
    Pass the Jenkins tests (including a new test suite).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-16208

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13906.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13906
    
----
commit 7ddf449d39f22090bc8aa157fae12c79ba00928e
Author: Dongjoon Hyun <do...@apache.org>
Date:   2016-06-25T09:18:44Z

    [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` o...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69225940
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelationSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class PropagateEmptyRelationSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("PropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        PropagateEmptyRelation) :: Nil
    +  }
    +
    +  object OptimizeWithoutPropagateEmptyRelation extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("OptimizeWithoutPropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    --- End diff --
    
    can you also update the test names? like `propagate empty relation through Union`




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69135425
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,72 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    --- End diff --
    
    `DeclarativeAggregate` -> aggregate function?




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61242/
    Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` o...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69225991
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelationSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class PropagateEmptyRelationSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("PropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        PropagateEmptyRelation) :: Nil
    +  }
    +
    +  object OptimizeWithoutPropagateEmptyRelation extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("OptimizeWithoutPropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    +    val query = testRelation1
    +      .where(false)
    +      .union(testRelation2.where(false))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Binary Logical Plans - Collapse joins") {
    +    // Testcases are tuples of (left predicate, right predicate, joinType, correct answer)
    +    // Note that `None` is used to compare with OptimizeWithoutPropagateEmptyRelation.
    +    val testcases = Seq(
    +      (true, true, Inner, None),
    +      (true, true, LeftOuter, None),
    +      (true, true, RightOuter, None),
    +      (true, true, FullOuter, None),
    +      (true, true, LeftAnti, None),
    +      (true, true, LeftSemi, None),
    +
    +      (true, false, Inner, Some(LocalRelation('a.int, 'b.int))),
    +      (true, false, LeftOuter, None),
    +      (true, false, RightOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (true, false, FullOuter, None),
    +      (true, false, LeftAnti, None),
    +      (true, false, LeftSemi, None),
    +
    +      (false, true, Inner, Some(LocalRelation('a.int, 'b.int))),
    +      (false, true, LeftOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, true, RightOuter, None),
    +      (false, true, FullOuter, None),
    +      (false, true, LeftAnti, Some(LocalRelation('a.int))),
    +      (false, true, LeftSemi, Some(LocalRelation('a.int))),
    +
    +      (false, false, Inner, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, LeftOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, RightOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, FullOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, LeftAnti, Some(LocalRelation('a.int))),
    +      (false, false, LeftSemi, Some(LocalRelation('a.int)))
    +    )
    +
    +    testcases.foreach { case (left, right, jt, answer) =>
    +      val query = testRelation1
    +        .where(left)
    +        .join(testRelation2.where(right), joinType = jt, condition = Some('a.attr == 'b.attr))
    +      val optimized = Optimize.execute(query.analyze)
    +      val correctAnswer =
    +        answer.getOrElse(OptimizeWithoutPropagateEmptyRelation.execute(query.analyze))
    +      comparePlans(optimized, correctAnswer)
    +    }
    +  }
    +
    +  test("Unary Logical Plans - Collapse one empty local relation") {
    +    val query = testRelation1
    +      .where(false)
    +      .select('a)
    +      .groupBy('a)('a)
    +      .where('a > 1)
    +      .orderBy('a.asc)
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Unary Logical Plans - Don't collapse one non-empty local relation") {
    +    val query = testRelation1
    +      .where(true)
    +      .groupBy('a)('a)
    +      .where('a > 1)
    +      .orderBy('a.asc)
    +      .select('a)
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = testRelation1
    +      .where('a > 1)
    +      .groupBy('a)('a)
    +      .orderBy('a.asc)
    +      .select('a)
    +
    +    comparePlans(optimized, correctAnswer.analyze)
    +  }
    +
    +  test("Unary Logical Plans - Collapse non-aggregating expressions on empty plan") {
    +    val query = testRelation1
    +      .where(false)
    +      .groupBy('a)('a)
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int).analyze
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Unary Logical Plans - Collapse non-aggregating complex expressions on empty plan") {
    +    val query = testRelation1
    +      .where(false)
    +      .groupBy('a)('a, ('a + 1).as('x))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int, 'x.int).analyze
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Unary Logical Plans - Don't collapse aggregating expressions on empty plan") {
    --- End diff --
    
    how about `Don't propagate empty relation through Aggregate with aggregate function`




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Yep. It's done!




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69141347
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    +    val query = testRelation1
    +      .where(false)
    +      .union(testRelation2.where(false))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Binary Logical Plans - Collapse joins") {
    --- End diff --
    
    ok then it's fine




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68837197
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,49 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    can you actually move this into a separate file? the optimizer is becoming too large and I want to break it apart soon. No point adding new things in this file.





[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69111733
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    +
    +  test("Binary(or Higher) Logical Plans - Collapse empty union") {
    --- End diff --
    
    Yep. I removed it.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69109340
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    --- End diff --
    
    Why `DeclarativeAggregate` rather than `AggregateFunction`? `AggregateFunction` also covers `ImperativeAggregate` like `ScalaUDAF`, which should also be covered here.
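    
    For example, a check like the following would cover both (a sketch; the helper name matches the one used later in this PR):
    
    ```scala
    import org.apache.spark.sql.catalyst.expressions.Expression
    import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    
    object AggregateCheckSketch {
      // True if the expression tree contains any aggregate function,
      // whether declarative (e.g. Sum) or imperative (e.g. ScalaUDAF).
      def containsAggregateExpression(e: Expression): Boolean =
        e.collectFirst { case _: AggregateFunction => () }.isDefined
    }
    ```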




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69111287
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    +
    +  test("Binary(or Higher) Logical Plans - Collapse empty union") {
    --- End diff --
    
    Just here or for all? Actually, I added that for `Union`.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68495048
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,34 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    --- End diff --
    
    this kind of blacklisting approach is too risky -- if we were to introduce a new logical node in the future, most likely we will forget to update this rule.





[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69065025
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,49 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. InnerJoin with one or two empty children.
    + * 2. Project/Generate/Filter/Sample/Join/Limit/Union/Repartition with all empty children.
    + * 3. Aggregate with all empty children and grpExprs containing all aggExprs.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) =>
    --- End diff --
    
    I think this rule is very useful; we can avoid scanning one join side if the other side is empty.
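    
    A rough illustration (hypothetical table names): once `PruneFilters` turns the `WHERE 1=0` side into an empty `LocalRelation`, this rule can replace the whole inner join with an empty relation, so the other side never needs to be scanned.
    
    ```scala
    // Hypothetical example; with this rule the inner join should collapse to an
    // empty LocalTableScan instead of planning a scan of t1.
    sql("SELECT * FROM t1 JOIN (SELECT * FROM t2 WHERE 1 = 0) s ON t1.a = s.b").explain
    ```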




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69064054
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,49 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. InnerJoin with one or two empty children.
    + * 2. Project/Generate/Filter/Sample/Join/Limit/Union/Repartition with all empty children.
    + * 3. Aggregate with all empty children and grpExprs containing all aggExprs.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) =>
    +      LocalRelation(p.output, data = Seq.empty)
    +
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    --- End diff --
    
    Actually `Generate` can't be included here. Our `Generate` also supports Hive-style UDTFs, which have weird semantics: for a UDTF `f`, after all rows have been processed, `f.close()` is called, and *more rows can be generated* within `f.close()`. This means a UDTF may generate one or more rows even if the underlying input is empty.
    
    See [here][1] and PR #5338 for more details.
    
    [1]: https://github.com/apache/spark/pull/5383/files
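    
    A simplified, hypothetical model of that lifecycle (not the real `GenericUDTF` API), showing why an empty input does not imply an empty output:
    
    ```scala
    // Hypothetical stand-in for the Hive UDTF contract: process() is called per
    // input row; close() is called once at the end and may emit additional rows.
    trait SimpleUdtf {
      def process(row: Seq[Any]): Seq[Seq[Any]]
      def close(): Seq[Seq[Any]]
    }
    
    class EmitOnClose extends SimpleUdtf {
      def process(row: Seq[Any]): Seq[Seq[Any]] = Seq.empty
      def close(): Seq[Seq[Any]] = Seq(Seq("row emitted from close()"))
    }
    
    object Demo extends App {
      val udtf = new EmitOnClose
      val input: Seq[Seq[Any]] = Seq.empty                      // empty relation
      val output = input.flatMap(udtf.process) ++ udtf.close()  // still yields one row
      println(output)
    }
    ```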




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61409/consoleFull)** for PR 13906 at commit [`bd79a14`](https://github.com/apache/spark/commit/bd79a14b26f46cda7fdc4db66a7ee5bfdeeddc88).




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Thank you again! The following has been updated:
    - Rename `CollapseEmptyPlan` to `PropagateEmptyRelation`.
      - CollapseEmptyPlan.scala -> PropagateEmptyRelation.scala
      - CollapseEmptyPlanSuite.scala -> PropagateEmptyRelationSuite.scala
    - Remove redundant predicate `children.length > 1` in `Union` case.
    - Remove unused `testRelation3`.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61363 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61363/consoleFull)** for PR 13906 at commit [`c06ae60`](https://github.com/apache/spark/commit/c06ae6011985d3c839bea372333b0d5a6491f55d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61462/consoleFull)** for PR 13906 at commit [`4bc2452`](https://github.com/apache/spark/commit/4bc24521b32df7da5758a0ecd5620e78d311c526).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69131609
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,72 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def containsAggregateExpression(e: Expression): Boolean = {
    +    e.collectFirst { case _: AggregateFunction => () }.isDefined
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    --- End diff --
    
    `Union.resolved` already guarantees `children.length > 1`




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` o...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69304216
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala ---
    @@ -0,0 +1,78 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without AggregateFunction expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object PropagateEmptyRelation extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def containsAggregateExpression(e: Expression): Boolean = {
    +    e.collectFirst { case _: AggregateFunction => () }.isDefined
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.forall(isEmptyLocalRelation) =>
    +      empty(p)
    +
    +    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
    --- End diff --
    
    sorry to be so nit-picky, but it seems better to separate the join cases, e.g.
    ```
    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) => empty(p)
    case p @ Join(_, _, LeftOuter | LeftSemi | LeftAnti, _) if isEmptyLocalRelation(p.left) => empty(p)
    ...
    ```
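    
    For reference, a sketch of the fully separated cases, following the per-join-type behaviour encoded in the test suite (assumes the `empty(...)` and `isEmptyLocalRelation(...)` helpers from the diff above):
    
    ```scala
    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) => empty(p)
    case p @ Join(_, _, LeftOuter | LeftSemi | LeftAnti, _) if isEmptyLocalRelation(p.left) => empty(p)
    case p @ Join(_, _, RightOuter, _) if isEmptyLocalRelation(p.right) => empty(p)
    case p @ Join(_, _, FullOuter, _) if p.children.forall(isEmptyLocalRelation) => empty(p)
    ```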




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    @cloud-fan Yea, that's a good point.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61581/
    Test PASSed.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61535 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61535/consoleFull)** for PR 13906 at commit [`9ae965a`](https://github.com/apache/spark/commit/9ae965ae30a08a0f88850de94318640c0bda11f4).




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68701978
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,41 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    +      LocalRelation(a.output, data = Seq.empty)
    +
    +    // Case 2: General aggregations can generate non-empty results.
    +    case a: Aggregate => a
    +
    +    // Case 3: The following non-leaf plans having only empty relations return empty results.
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    --- End diff --
    
    Yep, right! I'll add them explicitly.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61535 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61535/consoleFull)** for PR 13906 at commit [`9ae965a`](https://github.com/apache/spark/commit/9ae965ae30a08a0f88850de94318640c0bda11f4).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61409/
    Test PASSed.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    LGTM except for some naming comments. Thanks for working on it!




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68495422
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,34 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    --- End diff --
    
    Thank you for the review, @rxin.
    I see. I will update this PR to use a whitelist approach.
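
For readers following the thread, here is a rough sketch of the whitelist direction being discussed: only operators known to produce no rows over empty input are collapsed, and everything else is left untouched. The object name and the exact operator list are illustrative, not the code that was eventually merged.

```scala
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.rules.Rule

// Illustrative whitelist-style sketch; not the merged rule.
object CollapseEmptyUnarySketch extends Rule[LogicalPlan] {
  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    case p: LocalRelation => p.data.isEmpty
    case _ => false
  }

  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    case p: UnaryNode if isEmptyLocalRelation(p.child) => p match {
      // Whitelisted operators: empty input always yields empty output.
      case _: Project | _: Filter | _: Sort | _: GlobalLimit | _: LocalLimit =>
        LocalRelation(p.output, data = Seq.empty)
      // Anything not on the whitelist is kept as-is.
      case _ => p
    }
  }
}
```

Operators that can produce rows from empty input, such as an `Aggregate` computing `COUNT(*)`, simply fall through to the default case.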




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68701768
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,41 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    +      LocalRelation(a.output, data = Seq.empty)
    +
    +    // Case 2: General aggregations can generate non-empty results.
    +    case a: Aggregate => a
    +
    +    // Case 3: The following non-leaf plans having only empty relations return empty results.
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    --- End diff --
    
    Actually, for Intersect you only need one child to be empty.
    
    For a join, if it is an inner join, you just need one child to be empty too.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68838412
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,49 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    Sure. What name would be suitable for this optimizer?




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68701620
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,133 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    --- End diff --
    
    You should also test something that shouldn't have been converted.
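
As an illustration of the kind of negative test being requested, a minimal sketch (the suite name and the query are made up; the tests actually added to the PR appear later in this thread):

```scala
package org.apache.spark.sql.catalyst.optimizer

import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.PlanTest
import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.RuleExecutor

// Hypothetical negative test: the rule must leave plans over non-empty data untouched.
class CollapseEmptyPlanNegativeSuite extends PlanTest {
  object Optimize extends RuleExecutor[LogicalPlan] {
    val batches = Batch("CollapseEmptyPlan", Once, CollapseEmptyPlan) :: Nil
  }

  test("do not collapse a plan over a non-empty relation") {
    val relation = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    val query = relation.where('a > 1).select('a).analyze

    // The optimized plan should be identical to the analyzed input.
    comparePlans(Optimize.execute(query), query)
  }
}
```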




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61449/consoleFull)** for PR 13906 at commit [`4d937dc`](https://github.com/apache/spark/commit/4d937dc83da661b24d2af1dd513687f4a63b29b0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` o...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13906




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69089371
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,49 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. InnerJoin with one or two empty children.
    + * 2. Project/Generate/Filter/Sample/Join/Limit/Union/Repartition with all empty children.
    + * 3. Aggregate with all empty children and grpExprs containing all aggExprs.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) =>
    +      LocalRelation(p.output, data = Seq.empty)
    +
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    +             _: Sort | _: GlobalLimit | _: LocalLimit | _: Union | _: Repartition =>
    +          LocalRelation(p.output, data = Seq.empty)
    +        case Aggregate(ge, ae, _) if ae.forall(ge.contains(_)) =>
    --- End diff --
    
    Yea. The following predicate should work:
    
    ```scala
    ae.forall(_.collectFirst { case _: AggregateExpression => () }.isEmpty)
    ```
    
    (But it would probably be better to put it into a separate method.)
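
A possible shape for that helper, shown as a standalone sketch (the enclosing object is hypothetical; in the rule itself this would be a private method):

```scala
import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression

// Hypothetical home for the suggested helper.
object AggregateChecks {
  // True when the expression tree contains any aggregate expression,
  // e.g. COUNT(*) or SUM(a).
  def containsAggregate(e: Expression): Boolean =
    e.collectFirst { case _: AggregateExpression => () }.isDefined
}
```

The revised rule later in this thread uses an equivalent check, `containsAggregateExpression`, which matches on `AggregateFunction` rather than `AggregateExpression`.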




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    LGTM except for those comments @cloud-fan brought up. Thanks for working on this!




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    @liancheng, I think we still need to keep some simple rules for unary nodes, which also help the binary cases, since the empty relation is propagated up.
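
To illustrate the propagation argument with the Catalyst test DSL (a sketch only; the relations mirror the ones defined in the test suite):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
import org.apache.spark.sql.catalyst.plans.Inner
import org.apache.spark.sql.catalyst.plans.logical.LocalRelation

object EmptyPropagationExample {
  val left = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
  val right = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))

  // PruneFilters turns the `where(false)` branch into an empty LocalRelation,
  // the unary Project case then collapses the `select` over it, and because the
  // rule runs bottom-up the join node finally sees an empty child, so the
  // binary (inner join) case can fire.
  val query = left.where(false).select('a)
    .join(right, joinType = Inner, condition = Some('a.attr === 'b.attr))
}
```

The unary rules collapse the filtered branch step by step; the join case is then what reduces the whole query to a single empty `LocalRelation`.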




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    It sounds promising. Maybe Spark 2.1?




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69172328
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    --- End diff --
    
    Yep. I'll remove this.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` o...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69227494
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelationSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class PropagateEmptyRelationSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("PropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        PropagateEmptyRelation) :: Nil
    +  }
    +
    +  object OptimizeWithoutPropagateEmptyRelation extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("OptimizeWithoutPropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    +    val query = testRelation1
    +      .where(false)
    +      .union(testRelation2.where(false))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Binary Logical Plans - Collapse joins") {
    +    // Testcases are tuples of (left predicate, right predicate, joinType, correct answer)
    +    // Note that `None` is used to compare with OptimizeWithoutPropagateEmptyRelation.
    +    val testcases = Seq(
    +      (true, true, Inner, None),
    +      (true, true, LeftOuter, None),
    +      (true, true, RightOuter, None),
    +      (true, true, FullOuter, None),
    +      (true, true, LeftAnti, None),
    +      (true, true, LeftSemi, None),
    +
    +      (true, false, Inner, Some(LocalRelation('a.int, 'b.int))),
    +      (true, false, LeftOuter, None),
    +      (true, false, RightOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (true, false, FullOuter, None),
    +      (true, false, LeftAnti, None),
    +      (true, false, LeftSemi, None),
    +
    +      (false, true, Inner, Some(LocalRelation('a.int, 'b.int))),
    +      (false, true, LeftOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, true, RightOuter, None),
    +      (false, true, FullOuter, None),
    +      (false, true, LeftAnti, Some(LocalRelation('a.int))),
    +      (false, true, LeftSemi, Some(LocalRelation('a.int))),
    +
    +      (false, false, Inner, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, LeftOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, RightOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, FullOuter, Some(LocalRelation('a.int, 'b.int))),
    +      (false, false, LeftAnti, Some(LocalRelation('a.int))),
    +      (false, false, LeftSemi, Some(LocalRelation('a.int)))
    +    )
    +
    +    testcases.foreach { case (left, right, jt, answer) =>
    +      val query = testRelation1
    +        .where(left)
    +        .join(testRelation2.where(right), joinType = jt, condition = Some('a.attr == 'b.attr))
    +      val optimized = Optimize.execute(query.analyze)
    +      val correctAnswer =
    +        answer.getOrElse(OptimizeWithoutPropagateEmptyRelation.execute(query.analyze))
    +      comparePlans(optimized, correctAnswer)
    +    }
    +  }
    +
    +  test("Unary Logical Plans - Collapse one empty local relation") {
    +    val query = testRelation1
    +      .where(false)
    +      .select('a)
    +      .groupBy('a)('a)
    +      .where('a > 1)
    +      .orderBy('a.asc)
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Unary Logical Plans - Don't collapse one non-empty local relation") {
    +    val query = testRelation1
    +      .where(true)
    +      .groupBy('a)('a)
    +      .where('a > 1)
    +      .orderBy('a.asc)
    +      .select('a)
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = testRelation1
    +      .where('a > 1)
    +      .groupBy('a)('a)
    +      .orderBy('a.asc)
    +      .select('a)
    +
    +    comparePlans(optimized, correctAnswer.analyze)
    +  }
    +
    +  test("Unary Logical Plans - Collapse non-aggregating expressions on empty plan") {
    +    val query = testRelation1
    +      .where(false)
    +      .groupBy('a)('a)
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int).analyze
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Unary Logical Plans - Collapse non-aggregating complex expressions on empty plan") {
    +    val query = testRelation1
    +      .where(false)
    +      .groupBy('a)('a, ('a + 1).as('x))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int, 'x.int).analyze
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Unary Logical Plans - Don't collapse aggregating expressions on empty plan") {
    --- End diff --
    
    Sounds good!




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61598/
    Test PASSed.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Thank you for merging, @liancheng, @cloud-fan, and @rxin.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61260 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61260/consoleFull)** for PR 13906 at commit [`bd39437`](https://github.com/apache/spark/commit/bd3943701a37dc6e98193b3531a85e0ae4756f6f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61598 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61598/consoleFull)** for PR 13906 at commit [`984854b`](https://github.com/apache/spark/commit/984854b43444be928c2f22f5d16a9531e46292c3).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69111430
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      empty(p)
    +
    +    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
    +      case Inner => empty(p)
    +      case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
    +      case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
    +      case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
    +      case _ => p
    +    }
    +
    +    case p: UnaryNode if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) => p match {
    +      case _: Project | _: Filter | _: Sample | _: Sort | _: GlobalLimit | _: LocalLimit |
    +           _: Repartition | _: RepartitionByExpression => empty(p)
    +      case Aggregate(_, ae, _) if !ae.exists(isDeclarativeAggregate) => empty(p)
    +      case Generate(_ : Explode, _, _, _, _, _) => empty(p)
    --- End diff --
    
    Sure!




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69109823
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      empty(p)
    +
    +    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
    +      case Inner => empty(p)
    +      case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
    +      case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
    +      case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
    +      case _ => p
    +    }
    --- End diff --
    
    Could you please add a comment noting that `Intersect` is also covered here? At first I didn't realize that we already translate `Intersect` into joins.
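
One way the requested comment could read, shown against the join cases copied from the diff above (a sketch of wording only, not the exact comment that was committed):

```scala
case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
  // Intersect and Except do not need their own cases here: the earlier
  // ReplaceIntersectWithSemiJoin and ReplaceExceptWithAntiJoin rules rewrite
  // them into LeftSemi / LeftAnti joins, so the cases below cover them too.
  case Inner => empty(p)
  case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
  case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
  case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
  case _ => p
}
```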




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61449/consoleFull)** for PR 13906 at commit [`4d937dc`](https://github.com/apache/spark/commit/4d937dc83da661b24d2af1dd513687f4a63b29b0).




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69141440
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    --- End diff --
    
    seems this is never used?




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61535/
    Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68838170
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,49 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    +      LocalRelation(a.output, data = Seq.empty)
    +
    +    // Case 2: General aggregations can generate non-empty results.
    +    case a: Aggregate => a
    +
    +    // Case 3: The following plans having only empty relations return empty results.
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    +             _: Sort | _: GlobalLimit | _: LocalLimit |
    +             _: Distinct | _: Except | _: Union |
    +             _: Repartition =>
    +          LocalRelation(p.output, data = Seq.empty)
    +        case _ => p
    +      }
    +
    +    // Case 4: The following plans having at least one empty relation return empty results.
    +    case p: LogicalPlan if p.children.exists(isEmptyLocalRelation) =>
    +      p match {
    +        case Join(_, _, Inner, _) | _: Intersect =>
    --- End diff --
    
    Sure!




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69171076
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,72 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def containsAggregateExpression(e: Expression): Boolean = {
    +    e.collectFirst { case _: AggregateFunction => () }.isDefined
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    --- End diff --
    
    Oh, thank you!




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61538/consoleFull)** for PR 13906 at commit [`9744c9f`](https://github.com/apache/spark/commit/9744c9f85de5ab1e722c686922da788bf62d442c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Sounds interesting. You mean a `LocalNode` that computes all the local node operators on a `LocalRelation`, right?




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69135765
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    +    val query = testRelation1
    +      .where(false)
    +      .union(testRelation2.where(false))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Binary Logical Plans - Collapse joins") {
    --- End diff --
    
    This test is too verbose... Can you try to simplify it? We don't need to cover all the cases, just some typical ones.
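    For example, something along these lines would keep one typical case per node kind (just a sketch against the suite's relations, not a final test):
    
    ```scala
    test("propagate emptiness through typical binary nodes") {
      // An inner join with one empty side collapses to an empty relation.
      val innerJoin = testRelation1.where(false).join(testRelation2, Inner).analyze
      comparePlans(Optimize.execute(innerJoin), LocalRelation('a.int, 'b.int))
    
      // A union of two empty children collapses as well.
      val union = testRelation1.where(false).union(testRelation2.where(false)).analyze
      comparePlans(Optimize.execute(union), LocalRelation('a.int))
    }
    ```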




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69110423
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      empty(p)
    +
    +    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
    +      case Inner => empty(p)
    +      case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
    +      case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
    +      case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
    +      case _ => p
    +    }
    --- End diff --
    
    Yep. For now I added line 30: https://github.com/apache/spark/pull/13906/files#diff-315910d950ea08479990c40570fbd216R30
    
    But I will add a comment, too. Maybe beyond `LeftSemi` at line 56?
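    Something like the following could work as that comment (wording is only a suggestion), restating the quoted join cases with the reasoning spelled out:
    
    ```scala
    // An empty child propagates through a join only when the join type allows it:
    //  - Inner: empty if either side is empty.
    //  - LeftOuter / LeftSemi / LeftAnti: empty only if the left side is empty.
    //  - RightOuter: empty only if the right side is empty.
    //  - FullOuter: empty only if both sides are empty.
    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
      case Inner => empty(p)
      case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
      case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
      case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
      case _ => p
    }
    ```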




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61409/consoleFull)** for PR 13906 at commit [`bd79a14`](https://github.com/apache/spark/commit/bd79a14b26f46cda7fdc4db66a7ee5bfdeeddc88).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    It's exactly the same `Scala 2.10 match reachability build error` as in the previous commit: https://github.com/apache/spark/commit/fda51f171450d03769242fa7829cbb8b6e0e627b
    
    The solution is the same: I split the `case` statements.
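    Concretely, the change has roughly this shape (illustrative; see the updated file for the exact code):
    
    ```scala
    case p: UnaryNode if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) => p match {
      // Before: one combined alternative pattern, which the Scala 2.10 pattern matcher rejects.
      //   case _: Project | _: Filter | _: Sample | _: Sort | _: GlobalLimit | _: LocalLimit |
      //        _: Repartition | _: RepartitionByExpression => empty(p)
      // After: one case per node type.
      case _: Project => empty(p)
      case _: Filter => empty(p)
      case _: Sample => empty(p)
      case _: Sort => empty(p)
      case _: GlobalLimit => empty(p)
      case _: LocalLimit => empty(p)
      case _: Repartition => empty(p)
      case _: RepartitionByExpression => empty(p)
      case _ => p
    }
    ```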




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68836963
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,49 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    +      LocalRelation(a.output, data = Seq.empty)
    +
    +    // Case 2: General aggregations can generate non-empty results.
    +    case a: Aggregate => a
    +
    +    // Case 3: The following plans having only empty relations return empty results.
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    +             _: Sort | _: GlobalLimit | _: LocalLimit |
    +             _: Distinct | _: Except | _: Union |
    +             _: Repartition =>
    +          LocalRelation(p.output, data = Seq.empty)
    +        case _ => p
    +      }
    +
    +    // Case 4: The following plans having at least one empty relation return empty results.
    +    case p: LogicalPlan if p.children.exists(isEmptyLocalRelation) =>
    +      p match {
    +        case Join(_, _, Inner, _) | _: Intersect =>
    --- End diff --
    
    can we move this out rather than doing a two level nesting




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69110054
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    --- End diff --
    
    Thanks, again!




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69109977
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    --- End diff --
    
    Oh, I forgot your comment about this. I'll fix it like that.
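    For reference, the updated helper in a later revision of this PR (quoted earlier in this thread) checks for any `AggregateFunction` in the expression tree instead:
    
    ```scala
    import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    
    // True if the expression tree contains any aggregate function, e.g. COUNT or SUM.
    private def containsAggregateExpression(e: Expression): Boolean = {
      e.collectFirst { case _: AggregateFunction => () }.isDefined
    }
    ```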




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68703430
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,133 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    --- End diff --
    
    I'll update it to cover more scenarios.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Some more thoughts on this: currently we only execute `Project` locally if the underlying relation is a `LocalRelation`. Is it possible that we could also execute `Aggregate` locally? If that's too hard, I'm OK with handling empty relations first.
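    For context, a rough sketch of what executing `Project` locally over a `LocalRelation` amounts to; the rule name and details below are illustrative, not necessarily the existing rule. Extending the same idea to `Aggregate` would mean evaluating the aggregate buffers at optimization time as well:
    
    ```scala
    import org.apache.spark.sql.catalyst.expressions.InterpretedProjection
    import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan, Project}
    import org.apache.spark.sql.catalyst.rules.Rule
    
    // Illustrative only: fold a deterministic Project into the LocalRelation below it
    // by evaluating the projection eagerly and materializing a new LocalRelation.
    object EvalProjectOnLocalRelation extends Rule[LogicalPlan] {
      def apply(plan: LogicalPlan): LogicalPlan = plan transform {
        case Project(projectList, LocalRelation(output, data))
            if projectList.forall(_.deterministic) =>
          val projection = new InterpretedProjection(projectList, output)
          LocalRelation(projectList.map(_.toAttribute), data.map(projection))
      }
    }
    ```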




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61552 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61552/consoleFull)** for PR 13906 at commit [`80760db`](https://github.com/apache/spark/commit/80760db48e62648bf71e39ce196517b5afcd00df).




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69140767
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    +    val query = testRelation1
    +      .where(false)
    +      .union(testRelation2.where(false))
    +
    +    val optimized = Optimize.execute(query.analyze)
    +    val correctAnswer = LocalRelation('a.int)
    +
    +    comparePlans(optimized, correctAnswer)
    +  }
    +
    +  test("Binary Logical Plans - Collapse joins") {
    --- End diff --
    
    Well, I guess the verbosity is necessary, since it guarantees that we don't propagate empty relations incorrectly for certain types of joins (e.g. an outer join with only one empty child).
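    For instance, the kind of case that coverage protects (a sketch, not the suite's exact code): a left outer join whose only empty child is on the right must not be collapsed, since every left row survives with NULLs on the right:
    
    ```scala
    test("do not propagate emptiness through a left outer join with an empty right side") {
      val query = testRelation1.join(testRelation2.where(false), LeftOuter).analyze
      // The rule must leave the join alone, so the plan should match the one produced
      // without CollapseEmptyPlan in the batch.
      comparePlans(Optimize.execute(query), OptimizeWithoutCollapseEmptyPlan.execute(query))
    }
    ```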




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61449/
    Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69135496
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,72 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    How about `PropagateEmptyRelation`?




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68839609
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,49 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    yup




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68495438
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,34 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case x if x.isInstanceOf[ObjectProducer] || x.isInstanceOf[ObjectConsumer] => x
    +
    +    // Case 1: If groupingExpressions contains all aggregation expressions, the result is empty.
    +    case a @ Aggregate(ge, ae, child) if isEmptyLocalRelation(child) && ae.forall(ge.contains(_)) =>
    --- End diff --
    
    Ur, at first, I thought you meant line 1080 for `case p: LogicalPlan`:
    
    https://github.com/apache/spark/pull/13906/files#diff-a636a87d8843eeccca90140be91d4fafR1080 .
    
    Did I understand your advice correctly?




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Anyway, thank you for the review again, @rxin!




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    By the way, regarding complexity, it's a 23-line optimizer excluding blank lines and comments. In fact, it's shorter than `NullPropagation`.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    For 3, I respect your opinion. I just made another commit for 2.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    I'm not sure this optimization is useful:
    
    1. An empty `LocalRelation` is a corner case and seems not worth optimizing.
    2. The optimization rule in this PR is kind of complex.
    3. If we get better handling for `LocalRelation` in the future (like the `LocalNode`), this rule will become useless.
    
    cc @marmbrus @yhuai




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61552 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61552/consoleFull)** for PR 13906 at commit [`80760db`](https://github.com/apache/spark/commit/80760db48e62648bf71e39ce196517b5afcd00df).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69175695
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala ---
    @@ -0,0 +1,72 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    --- End diff --
    
    Since the optimizer's name, `PropagateEmptyRelation`, is intuitive and nice, I tried to write something like `Propagate ...` for this heading line. But I feel the current description captures the behavior of this class better.




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    My feeling is that this optimization rule is mostly useful for binary plan nodes like inner join and intersection, where we can avoid scanning the output of the non-empty side.
    
    On the other hand, for unary plan nodes, firstly it doesn't bring much performance benefit, especially when whole-stage codegen is enabled; secondly, there are non-obvious and tricky corner cases, like `Aggregate` and `Generate`.
    
    That said, although this patch is not a big one, it does introduce non-trivial complexity. For example, I didn't immediately realize why `Aggregate` must be special-cased (`COUNT(x)` may return 0 for empty input). The `Generate` case is even trickier.
    
    So my suggestion is to implement this rule only for inner join and intersection, which are much simpler to handle. What do you think?
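    To spell out the `Aggregate` corner case (a semantics sketch only, not code from the patch; `spark` below is assumed to be the shell's SparkSession):
    
    ```scala
    // An ungrouped aggregate over empty input still returns one row, so it cannot
    // be replaced by an empty relation:
    spark.sql("SELECT COUNT(*) AS c FROM VALUES (1, 2) t(a, b) WHERE 1 = 0").show()
    // +---+
    // |  c|
    // +---+
    // |  0|
    // +---+
    
    // With grouping keys the same predicate yields no groups and therefore no rows,
    // which is why the grouping-only case can still be collapsed safely.
    spark.sql("SELECT b, COUNT(*) FROM VALUES (1, 2) t(a, b) WHERE 1 = 0 GROUP BY b").show()
    // (no rows)
    ```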




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69110745
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      empty(p)
    +
    +    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
    +      case Inner => empty(p)
    +      case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
    +      case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
    +      case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
    +      case _ => p
    +    }
    +
    +    case p: UnaryNode if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) => p match {
    +      case _: Project | _: Filter | _: Sample | _: Sort | _: GlobalLimit | _: LocalLimit |
    +           _: Repartition | _: RepartitionByExpression => empty(p)
    +      case Aggregate(_, ae, _) if !ae.exists(isDeclarativeAggregate) => empty(p)
    +      case Generate(_ : Explode, _, _, _, _, _) => empty(p)
    --- End diff --
    
    Let's add a comment here to explain why we must special-case `Aggregate` and `Generate`.
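    A possible wording for those comments (just a suggestion, matching the cases quoted above):
    
    ```scala
    // Aggregate: we conservatively collapse only aggregates whose expressions contain no
    // aggregate functions, because e.g. COUNT(*) without grouping keys still returns a
    // row (0) for empty input.
    case Aggregate(_, ae, _) if !ae.exists(isDeclarativeAggregate) => empty(p)
    // Generate: Explode over empty input produces no rows, but other generators such as
    // Hive UDTFs may still return records, so only Explode is safe to collapse.
    case Generate(_: Explode, _, _, _, _, _) => empty(p)
    ```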




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68838472
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -1053,6 +1055,49 @@ object PruneFilters extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    + * Collapse plans consisting all empty local relations generated by [[PruneFilters]].
    + * Note that the ObjectProducer/Consumer and direct aggregations are the exceptions.
    + * {{{
    + *   SELECT a, b FROM t WHERE 1=0 GROUP BY a, b ORDER BY a, b ==> empty result
    + *   SELECT SUM(a) FROM t WHERE 1=0 GROUP BY a HAVING COUNT(*)>1 ORDER BY a (Not optimized)
    + * }}}
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    Is just `CollapseEmptyPlan.scala` okay?




[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Oh, please hold off on merging this PR. Scala 2.10 reports some errors. I'll follow up once it's resolved.




[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r68701793
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,133 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans.{LeftAnti, PlanTest}
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    --- End diff --
    
    Hmm, are there any other scenarios besides the existing ones below? (One possible additional case is sketched after this list.)
    - test("one non-empty local relation")
    - test("one non-empty and one empty local relations")
    - test("aggregating expressions on empty plan")


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged to master.
    
    @cloud-fan Sorry that I didn't notice your comment while merging. We can address it in a follow-up PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    The PR and JIRA titles and descriptions have been updated to be consistent.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61552/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Thank you, @cloud-fan .
    Handling operators on LocalRelations seems like a good idea (one possible reading of that idea is sketched below).
    But, if possible, may I dig into that in another PR? :)
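    
    A rough, hedged sketch of what "handling operators on LocalRelations" could mean — eagerly evaluating a deterministic `Project` over a (possibly non-empty) `LocalRelation` — loosely modeled on Catalyst's existing `ConvertToLocalRelation` rule; the rule name below is hypothetical and this is not part of this PR:
    
    ```scala
    import org.apache.spark.sql.catalyst.expressions.InterpretedProjection
    import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan, Project}
    import org.apache.spark.sql.catalyst.rules.Rule
    
    // Hypothetical sketch: fold a deterministic Project over a LocalRelation
    // into a new LocalRelation by evaluating the projection at optimization time.
    object EvaluateProjectOnLocalRelation extends Rule[LogicalPlan] {
      def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
        case Project(projectList, LocalRelation(output, data))
            if projectList.forall(_.deterministic) =>
          val projection = new InterpretedProjection(projectList, output)
          LocalRelation(projectList.map(_.toAttribute), data.map(projection))
      }
    }
    ```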


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61538/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Thank YOU, @cloud-fan !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69109611
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    --- End diff --
    
    I'd replace this method with:
    
    ```scala
    def containsAggregateExpression(e: Expression): Boolean = {
      e.collectFirst { case _: AggregateFunction => () }.isDefined
    }
    ```
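    
    And, for completeness, a hedged sketch of how that helper could be wired into the rule's `Aggregate` case (a fragment of the rule quoted elsewhere in this thread, not standalone code):
    
    ```scala
    case p: UnaryNode if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) => p match {
      // An Aggregate is safe to collapse only if no aggregate function appears
      // anywhere in its aggregate expressions.
      case Aggregate(_, aggExprs, _) if !aggExprs.exists(containsAggregateExpression) => empty(p)
      // ... the other unary cases stay unchanged ...
      case _ => p
    }
    ```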


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61260 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61260/consoleFull)** for PR 13906 at commit [`bd39437`](https://github.com/apache/spark/commit/bd3943701a37dc6e98193b3531a85e0ae4756f6f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61363 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61363/consoleFull)** for PR 13906 at commit [`c06ae60`](https://github.com/apache/spark/commit/c06ae6011985d3c839bea372333b0d5a6491f55d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61462/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61538/consoleFull)** for PR 13906 at commit [`9744c9f`](https://github.com/apache/spark/commit/9744c9f85de5ab1e722c686922da788bf62d442c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Thank you so much, @liancheng and @cloud-fan !
    I had indeed focused only on `UnaryNode` plans before.
    Having taken your viewpoints into account, I reconsidered `BinaryNode` logical plans, mainly Join.
    
    I have updated the code and PR description according to the comments (the `GROUP BY` case is illustrated after this list):
    - Add more Join optimizations. (@liancheng)
      (Intersect is handled as `LeftSemi` by `ReplaceIntersectWithSemiJoin`.)
    - Exclude all `Generator`s except `Explode`. (@liancheng)
    - Handle `SELECT col + 1 … GROUP BY col`. (@cloud-fan)
    - Change the `isEmptyLocalRelation` function to use a pattern match. (@liancheng)
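    
    A hedged spark-shell illustration of the newly handled `GROUP BY` case (the inline table is an assumption, and the exact attribute name in the output will differ):
    
    ```scala
    // No aggregate function appears in the select list, so an empty input can
    // only produce an empty result and the whole plan collapses.
    scala> sql("SELECT a + 1 FROM VALUES (1) t(a) WHERE 1 = 0 GROUP BY a").explain
    == Physical Plan ==
    LocalTableScan <empty>, [(a + 1)#...]
    ```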


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69065541
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,49 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. InnerJoin with one or two empty children.
    + * 2. Project/Generate/Filter/Sample/Join/Limit/Union/Repartition with all empty children.
    + * 3. Aggregate with all empty children and grpExprs containing all aggExprs.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) =>
    --- End diff --
    
    Yea, we can also add `Intersect`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` o...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69227072
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelationSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class PropagateEmptyRelationSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("PropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        PropagateEmptyRelation) :: Nil
    +  }
    +
    +  object OptimizeWithoutPropagateEmptyRelation extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("OptimizeWithoutPropagateEmptyRelation", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +
    +  test("Binary Logical Plans - Collapse empty union") {
    --- End diff --
    
    No problem!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69065425
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,49 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. InnerJoin with one or two empty children.
    + * 2. Project/Generate/Filter/Sample/Join/Limit/Union/Repartition with all empty children.
    + * 3. Aggregate with all empty children and grpExprs containing all aggExprs.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    --- End diff --
    
    ```scala
    plan match {
      case p: LocalRelation => p.data.isEmpty
      case _ => false
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61363/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69111075
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala ---
    @@ -0,0 +1,173 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.Row
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.dsl.plans._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
    +import org.apache.spark.sql.catalyst.rules.RuleExecutor
    +
    +class CollapseEmptyPlanSuite extends PlanTest {
    +  object Optimize extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters,
    +        CollapseEmptyPlan) :: Nil
    +  }
    +
    +  object OptimizeWithoutCollapseEmptyPlan extends RuleExecutor[LogicalPlan] {
    +    val batches =
    +      Batch("CollapseEmptyPlan", Once,
    +        CombineUnions,
    +        ReplaceDistinctWithAggregate,
    +        ReplaceExceptWithAntiJoin,
    +        ReplaceIntersectWithSemiJoin,
    +        PushDownPredicate,
    +        PruneFilters) :: Nil
    +  }
    +
    +  val testRelation1 = LocalRelation.fromExternalRows(Seq('a.int), data = Seq(Row(1)))
    +  val testRelation2 = LocalRelation.fromExternalRows(Seq('b.int), data = Seq(Row(1)))
    +  val testRelation3 = LocalRelation.fromExternalRows(Seq('c.int), data = Seq(Row(1)))
    +
    +  test("Binary(or Higher) Logical Plans - Collapse empty union") {
    --- End diff --
    
    Remove "(or Higher)".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Hi, @rxin , @cloud-fan , @liancheng . It's ready for merging again.
    Thank you so much for staying with this PR!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69140206
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,70 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.DeclarativeAggregate
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    + *    - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean = plan match {
    +    case p: LocalRelation => p.data.isEmpty
    +    case _ => false
    +  }
    +
    +  private def isDeclarativeAggregate(e: Expression): Boolean = e match {
    +    case _: DeclarativeAggregate => true
    +    case _: LeafExpression => false
    +    case other => other.children.forall(isDeclarativeAggregate)
    +  }
    +
    +  private def empty(plan: LogicalPlan) = LocalRelation(plan.output, data = Seq.empty)
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p: Union if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      empty(p)
    +
    +    case p @ Join(_, _, joinType, _) if p.children.exists(isEmptyLocalRelation) => joinType match {
    +      case Inner => empty(p)
    +      case LeftOuter | LeftSemi | LeftAnti if isEmptyLocalRelation(p.left) => empty(p)
    +      case RightOuter if isEmptyLocalRelation(p.right) => empty(p)
    +      case FullOuter if p.children.forall(isEmptyLocalRelation) => empty(p)
    +      case _ => p
    +    }
    --- End diff --
    
    Sorry that I didn't notice the comment above. I think that should be enough.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61581 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61581/consoleFull)** for PR 13906 at commit [`7ab1725`](https://github.com/apache/spark/commit/7ab17258590178de98656c836adefe0b1d3c0acb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    @liancheng . The PR is now updated again (how Intersect/Except are covered is sketched after this list):
    - Use `AggregateFunction` instead of `DeclarativeAggregate`.
    - Add explicit comments about INTERSECT/EXCEPT/AGGREGATE/GENERATOR.
    - Fix the test case names.
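    
    For reference, a simplified sketch of why Intersect and Except are covered by the Join cases — earlier optimizer rules rewrite them into joins, roughly as follows (a sketch of those existing rewrites, not code from this PR):
    
    ```scala
    // ReplaceIntersectWithSemiJoin (simplified):
    //   Intersect(left, right)  =>  Distinct(Join(left, right, LeftSemi, joinCond))
    // ReplaceExceptWithAntiJoin (simplified):
    //   Except(left, right)     =>  Distinct(Join(left, right, LeftAnti, joinCond))
    // where joinCond null-safely equates the corresponding output columns.
    // Once rewritten, an empty side is handled by this rule's LeftSemi / LeftAnti cases.
    ```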


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61260/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61581 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61581/consoleFull)** for PR 13906 at commit [`7ab1725`](https://github.com/apache/spark/commit/7ab17258590178de98656c836adefe0b1d3c0acb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    cc @cloud-fan to take a look too.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    I found a bug in the part of this rule which deals with `Aggregate`; see #17929 for the fix.
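    
    For context, a hedged illustration of why the `Aggregate` case is subtle (this may or may not be the exact issue fixed there; the DataFrame below is only an assumed example):
    
    ```scala
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.lit
    
    val spark = SparkSession.builder().master("local[*]").appName("demo").getOrCreate()
    import spark.implicits._
    
    // A global aggregate (empty GROUP BY list) returns exactly one row even over
    // empty input, so such an Aggregate must not be collapsed to an empty relation
    // even when it contains no aggregate function.
    Seq.empty[Int].toDF("a").groupBy().agg(lit(1)).show()
    // -> one row containing 1
    ```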


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Hi, @rxin .
    I just remembered this PR while looking at your whitelist PR. :)
    Any advice for this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61462/consoleFull)** for PR 13906 at commit [`4bc2452`](https://github.com/apache/spark/commit/4bc24521b32df7da5758a0ecd5620e78d311c526).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61242 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61242/consoleFull)** for PR 13906 at commit [`7ddf449`](https://github.com/apache/spark/commit/7ddf449d39f22090bc8aa157fae12c79ba00928e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    LGTM, pending jenkins


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61242 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61242/consoleFull)** for PR 13906 at commit [`7ddf449`](https://github.com/apache/spark/commit/7ddf449d39f22090bc8aa157fae12c79ba00928e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69064885
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,49 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. InnerJoin with one or two empty children.
    + * 2. Project/Generate/Filter/Sample/Join/Limit/Union/Repartition with all empty children.
    + * 3. Aggregate with all empty children and grpExprs containing all aggExprs.
    + */
    +object CollapseEmptyPlan extends Rule[LogicalPlan] with PredicateHelper {
    +  private def isEmptyLocalRelation(plan: LogicalPlan): Boolean =
    +    plan.isInstanceOf[LocalRelation] && plan.asInstanceOf[LocalRelation].data.isEmpty
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    +    case p @ Join(_, _, Inner, _) if p.children.exists(isEmptyLocalRelation) =>
    +      LocalRelation(p.output, data = Seq.empty)
    +
    +    case p: LogicalPlan if p.children.nonEmpty && p.children.forall(isEmptyLocalRelation) =>
    +      p match {
    +        case _: Project | _: Generate | _: Filter | _: Sample | _: Join |
    +             _: Sort | _: GlobalLimit | _: LocalLimit | _: Union | _: Repartition =>
    +          LocalRelation(p.output, data = Seq.empty)
    +        case Aggregate(ge, ae, _) if ae.forall(ge.contains(_)) =>
    --- End diff --
    
    What exactly are we checking here? It looks to me that we can propagate the empty relation whenever the aggregate list contains no aggregate function, e.g. `select col + 1 from tbl group by col` should also work. (A DSL-level sketch of this case follows below.)
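    
    A hedged, DSL-level sketch of that case in the style of this PR's test suite (relation names and the expected plan are assumptions):
    
    ```scala
    // `'a + 1` contains no aggregate function, so over an empty input the
    // Aggregate can safely collapse to an empty LocalRelation.
    val query = testRelation1
      .where(false)
      .groupBy('a)(('a + 1) as 'a1)
    val optimized = Optimize.execute(query.analyze)
    comparePlans(optimized, LocalRelation('a1.int))
    ```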


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13906#discussion_r69171100
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala ---
    @@ -0,0 +1,72 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.optimizer
    +
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateFunction
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.rules._
    +
    +/**
    + * Collapse plans consisting empty local relations generated by [[PruneFilters]].
    + * 1. Binary(or Higher)-node Logical Plans
    + *    - Union with all empty children.
    + *    - Join with one or two empty children (including Intersect/Except).
    + * 2. Unary-node Logical Plans
    + *    - Project/Filter/Sample/Join/Limit/Repartition with all empty children.
    + *    - Aggregate with all empty children and without DeclarativeAggregate expressions like COUNT.
    --- End diff --
    
    Ooops. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13906
  
    **[Test build #61598 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61598/consoleFull)** for PR 13906 at commit [`984854b`](https://github.com/apache/spark/commit/984854b43444be928c2f22f5d16a9531e46292c3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org