Posted to commits@spark.apache.org by vi...@apache.org on 2021/11/12 22:31:59 UTC

[spark] branch master updated: [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side

This is an automated email from the ASF dual-hosted git repository.

viirya pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new aaf0e5e  [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side
aaf0e5e is described below

commit aaf0e5e71509a2324e110e45366b753c7926c64b
Author: Yuming Wang <yu...@ebay.com>
AuthorDate: Fri Nov 12 14:30:47 2021 -0800

    [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side
    
    ### What changes were proposed in this pull request?
    
    Simplify the condition used to remove an outer join when the query only has DISTINCT on the streamed side with an alias. See: https://github.com/apache/spark/pull/34557#discussion_r748005299.
    
    ### Why are the changes needed?
    
    Simplify the code.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Existing unit tests.
    
    Closes #34573 from wangyum/SPARK-37292-2.
    
    Authored-by: Yuming Wang <yu...@ebay.com>
    Signed-off-by: Liang-Chi Hsieh <vi...@gmail.com>
---
 .../main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala  | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
index f9f2e83..e03360d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala
@@ -179,12 +179,10 @@ object EliminateOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
         if a.groupOnly && a.references.subsetOf(right.outputSet) =>
       a.copy(child = right)
     case a @ Aggregate(_, _, p @ Project(_, Join(left, _, LeftOuter, _, _)))
-        if a.groupOnly && p.outputSet.subsetOf(a.references) &&
-          AttributeSet(p.projectList.flatMap(_.references)).subsetOf(left.outputSet) =>
+        if a.groupOnly && p.references.subsetOf(left.outputSet) =>
       a.copy(child = p.copy(child = left))
     case a @ Aggregate(_, _, p @ Project(_, Join(_, right, RightOuter, _, _)))
-        if a.groupOnly && p.outputSet.subsetOf(a.references) &&
-          AttributeSet(p.projectList.flatMap(_.references)).subsetOf(right.outputSet) =>
+        if a.groupOnly && p.references.subsetOf(right.outputSet) =>
       a.copy(child = p.copy(child = right))
   }
 }
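
The two rewritten guards above replace a check spelled out over `p.projectList` with a single call to `p.references`, and drop the `p.outputSet.subsetOf(a.references)` conjunct. The sketch below is a toy model, not Spark code: `Attr`, `Alias`, `Add`, `Project`, `oldCheck` and `newCheck` are hypothetical stand-ins, and it assumes that a plan's `references` is the union of the references of its expressions. Under that assumption, the long-hand and short-hand forms of the remaining check agree.

```scala
// Toy model only: these types mimic, but are not, Spark's Catalyst classes.
object ReferencesSketch {
  sealed trait Expr { def references: Set[String] }

  // A plain column reference.
  final case class Attr(name: String) extends Expr {
    def references: Set[String] = Set(name)
  }

  // An aliased expression refers to whatever its child refers to.
  final case class Alias(child: Expr, name: String) extends Expr {
    def references: Set[String] = child.references
  }

  final case class Add(left: Expr, right: Expr) extends Expr {
    def references: Set[String] = left.references ++ right.references
  }

  final case class Project(projectList: Seq[Expr]) {
    // Assumption: plan-level references are the union over all expressions.
    def references: Set[String] = projectList.flatMap(_.references).toSet
  }

  // Long-hand form, modelled on the removed line of the diff.
  def oldCheck(p: Project, sideOutput: Set[String]): Boolean =
    p.projectList.flatMap(_.references).toSet.subsetOf(sideOutput)

  // Short-hand form, modelled on the added line of the diff.
  def newCheck(p: Project, sideOutput: Set[String]): Boolean =
    p.references.subsetOf(sideOutput)

  def main(args: Array[String]): Unit = {
    val leftOutput = Set("a", "b")
    // Projects only columns from the left side, through an alias.
    val onlyLeft = Project(Seq(Alias(Add(Attr("a"), Attr("b")), "ab")))
    // Also touches column "c", which the left side does not produce.
    val touchesRight = Project(Seq(Alias(Attr("a"), "x"), Attr("c")))

    for (p <- Seq(onlyLeft, touchesRight)) {
      println(s"old=${oldCheck(p, leftOutput)} new=${newCheck(p, leftOutput)}")
    }
  }
}
```

In this model the join could be dropped for `onlyLeft` but not for `touchesRight`, since the latter needs a column that only the other side of the join provides. Whether the dropped `outputSet` conjunct was redundant in the real rule is discussed in the review thread linked in the commit message; the sketch does not model it.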
