Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2015/12/23 20:54:58 UTC

[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/10454

    [SPARK-12505] [SQL] Pushdown a Limit on top of an Outer-Join

    "Rule that applies to a Limit on top of an OUTER Join. The original Limit won't go away after applying this rule, but additional Limit node(s) will be created on top of the outer-side child (or children if it's a FULL OUTER Join). "
    – from https://issues.apache.org/jira/browse/CALCITE-832
    
    Also, the same topic in Hive: https://issues.apache.org/jira/browse/HIVE-11684
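
    To illustrate the idea, here is a minimal, self-contained Scala sketch. The names (`Plan`, `Relation`, `Limit`, `Join`, `pushLimitThroughOuterJoin`) are hypothetical stand-ins, not the actual Catalyst classes: on the outer side of an outer join every row survives the join, so at most `n` rows of that side are needed to produce `LIMIT n` output, and an extra `Limit` can be added there while the original stays on top.

```scala
// Toy logical-plan model (hypothetical, not Catalyst's LogicalPlan hierarchy).
sealed trait Plan
case class Relation(name: String) extends Plan
case class Limit(n: Int, child: Plan) extends Plan
case class Join(left: Plan, right: Plan, joinType: String) extends Plan

// Push the Limit onto the outer-side child (both children for FULL OUTER).
// The original Limit on top is kept, matching the rule described above.
def pushLimitThroughOuterJoin(plan: Plan): Plan = plan match {
  case Limit(n, Join(l, r, "leftouter"))  => Limit(n, Join(Limit(n, l), r, "leftouter"))
  case Limit(n, Join(l, r, "rightouter")) => Limit(n, Join(l, Limit(n, r), "rightouter"))
  case Limit(n, Join(l, r, "fullouter"))  => Limit(n, Join(Limit(n, l), Limit(n, r), "fullouter"))
  case other => other // all other join types are left untouched
}

val before = Limit(10, Join(Relation("A"), Relation("B"), "leftouter"))
val after  = pushLimitThroughOuterJoin(before)
// after: Limit(10, Join(Limit(10, Relation("A")), Relation("B"), "leftouter"))
```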

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark outerjoinLimit

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10454.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10454
    
----
commit 8df3e0dfe99a8428b0569ba9d50afbda1c3d1591
Author: gatorsmile <ga...@gmail.com>
Date:   2015-12-23T19:48:07Z

    outerjoin limit pushdown.

commit 3779874f4e2add9f27b4bf64bd6f82668d7475ce
Author: gatorsmile <ga...@gmail.com>
Date:   2015-12-23T19:52:47Z

    revert.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-170757756
  
    Sure, let me close it.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48386499
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    Hmm. Actually, for a left/right outer join, what will happen if we have `A left outer join B on (A.key = B.key) sort by A.key limit 10`?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48388764
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    Let me update the code and fix the bug. Thanks!




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48376815
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,26 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, Limit(expr, right), joinType, joinCondition))
    +        case LeftOuter =>
    +          Limit(expr, Join(Limit(expr, left), right, joinType, joinCondition))
    +        case FullOuter =>
    +          Limit(expr, Join(Limit(expr, left), Limit(expr, right), joinType, joinCondition))
    +        case _ => f // DO Nothing for the other join types
    +      }
    +  }
    +}
    --- End diff --
    
    Is it possible that we add an extra limit in every iteration?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167005833
  
    **[Test build #48263 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48263/consoleFull)** for PR 10454 at commit [`72f73fb`](https://github.com/apache/spark/commit/72f73fb8ebad3f48cf77515abdad4be866efafbb).




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-166981787
  
    **[Test build #48252 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48252/consoleFull)** for PR 10454 at commit [`3779874`](https://github.com/apache/spark/commit/3779874f4e2add9f27b4bf64bd6f82668d7475ce).




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile closed the pull request at:

    https://github.com/apache/spark/pull/10454




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-166993610
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48392176
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, CombineLimits(Limit(expr, right)), joinType, joinCondition))
    --- End diff --
    
    Since we call `CombineLimits` here to combine the two consecutive `Limit`s, we will not add an extra `Limit` in subsequent iterations. No change will be made to the plan, so `RuleExecutor` will stop automatically. Am I right?
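
A toy Scala model (hypothetical names, not the actual Catalyst rule or `RuleExecutor`) can illustrate why combining the consecutive `Limit`s makes the rewrite idempotent: the second application collapses the nested limit it would otherwise stack, so the plan stops changing and a fixpoint-driven executor halts.

```scala
// Toy plan model; Join stands in here for a RIGHT OUTER join.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Limit(n: Int, child: Plan) extends Plan
case class Join(left: Plan, right: Plan) extends Plan

// Collapse two consecutive Limits into one with the smaller bound.
def combineLimits(plan: Plan): Plan = plan match {
  case Limit(outer, Limit(inner, child)) => Limit(math.min(outer, inner), child)
  case other => other
}

// Pushdown that combines inline, mirroring the snippet under discussion.
def pushLimit(plan: Plan): Plan = plan match {
  case Limit(n, Join(l, r)) => Limit(n, Join(l, combineLimits(Limit(n, r))))
  case other => other
}

val once  = pushLimit(Limit(5, Join(Relation("A"), Relation("B"))))
val twice = pushLimit(once)
// once == twice: after the first application the plan no longer changes.
```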




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48386235
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    For left outer joins, we only push the limit down to the left table, so I think it is safe?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48395965
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, CombineLimits(Limit(expr, right)), joinType, joinCondition))
    --- End diff --
    
    Thank you! 




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167014161
  
    **[Test build #48263 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48263/consoleFull)** for PR 10454 at commit [`72f73fb`](https://github.com/apache/spark/commit/72f73fb8ebad3f48cf77515abdad4be866efafbb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-166993285
  
    **[Test build #48252 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48252/consoleFull)** for PR 10454 at commit [`3779874`](https://github.com/apache/spark/commit/3779874f4e2add9f27b4bf64bd6f82668d7475ce).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48391896
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, CombineLimits(Limit(expr, right)), joinType, joinCondition))
    --- End diff --
    
    need a stop condition to stop pushing `Limit` if it's already pushed.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48387290
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    If there is a sort, the plan is different: there will be a `Sort` below the `Limit`, so this rule does not apply.
    
    For the full-outer join, I think the result is still correct.
    
    If the nodes below the `Limit` are deterministic, the result should be deterministic, right? Sorry, this part is unclear to me.
    
    Based on my understanding, when users issue a `limit` without an `order by`, they do not expect deterministic results.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167066005
  
    **[Test build #48298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48298/consoleFull)** for PR 10454 at commit [`ee91303`](https://github.com/apache/spark/commit/ee913037ca87f16448642c0ed5e23d6b8b07c341).




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48378249
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,26 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, Limit(expr, right), joinType, joinCondition))
    +        case LeftOuter =>
    +          Limit(expr, Join(Limit(expr, left), right, joinType, joinCondition))
    +        case FullOuter =>
    +          Limit(expr, Join(Limit(expr, left), Limit(expr, right), joinType, joinCondition))
    +        case _ => f // DO Nothing for the other join types
    +      }
    +  }
    +}
    --- End diff --
    
    You are right, but after this rule there is another rule, `CombineLimits`, which can combine the extra limits. I think we should still fix it here anyway. Thanks!
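
The interplay can be sketched in a toy Scala model (hypothetical names, not the real rules): a naive pushdown does stack one more `Limit` each time it fires, and a separate combine step collapses the consecutive `Limit`s afterwards, restoring the plan from the previous pass.

```scala
// Toy plan model; Join stands in here for a LEFT OUTER join.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Limit(n: Int, child: Plan) extends Plan
case class Join(left: Plan, right: Plan) extends Plan

// Naive pushdown: fires every time it sees Limit over Join, stacking Limits.
def naivePush(plan: Plan): Plan = plan match {
  case Limit(n, Join(l, r)) => Limit(n, Join(Limit(n, l), r))
  case other => other
}

// Cleanup: merge consecutive Limits (keeping the smaller bound) and recurse
// into join children so nested stacks are collapsed too.
def combineLimits(plan: Plan): Plan = plan match {
  case Limit(a, Limit(b, child)) => combineLimits(Limit(math.min(a, b), child))
  case Limit(n, Join(l, r))      => Limit(n, Join(combineLimits(l), combineLimits(r)))
  case other                     => other
}

val p0 = Limit(10, Join(Relation("A"), Relation("B")))
val p1 = naivePush(p0) // Limit(10, Join(Limit(10, A), B))
val p2 = naivePush(p1) // one more Limit stacked on the left child
// combineLimits(p2) == p1: the cleanup undoes the redundant extra Limit.
```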




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167077728
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167014215
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167077669
  
    **[Test build #48298 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48298/consoleFull)** for PR 10454 at commit [`ee91303`](https://github.com/apache/spark/commit/ee913037ca87f16448642c0ed5e23d6b8b07c341).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48388871
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    For the `full outer` join, my idea is to push the limit down only to the side that has larger `statistics`. Does that sound good?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-166988730
  
    Can you add an example in the description to illustrate this optimization?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48387575
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is outer joins.
    +  * Adding extra [[Limit]] operators on top of the outer-side child/children.
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    Can you explain why the full outer join is good?
    
    Yes, the returned results of a limit will not be deterministic unless there is a sort.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-170757373
  
    @gatorsmile let's close the limit push down pull requests. We will need to design this more properly because it is expensive to push down large limits.





[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167014216
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48263/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48377203
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,26 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, Limit(expr, right), joinType, joinCondition))
    +        case LeftOuter =>
    +          Limit(expr, Join(Limit(expr, left), right, joinType, joinCondition))
    +        case FullOuter =>
    +          Limit(expr, Join(Limit(expr, left), Limit(expr, right), joinType, joinCondition))
    +        case _ => f // Do nothing for the other join types
    +      }
    +  }
    +}
    --- End diff --
    
    True... Will fix it soon. 
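As a rough illustration, the rewrite quoted in the diff above can be modeled in plain Python. This is a hedged sketch: the node classes below are toy stand-ins (frozen dataclasses), not the real Catalyst API, and the `Join` here omits the join condition for brevity.

```python
from dataclasses import dataclass

# Toy stand-ins for Catalyst logical plan nodes (not the real classes).
@dataclass(frozen=True)
class Leaf:
    name: str

@dataclass(frozen=True)
class Limit:
    n: int
    child: object

@dataclass(frozen=True)
class Join:
    left: object
    right: object
    join_type: str  # "leftouter", "rightouter", "fullouter", "inner", ...

def push_limit_through_outer_join(plan):
    """Mirror the diff above: keep the original Limit on top and add an
    extra Limit on the outer-side child (or both children for full outer)."""
    if isinstance(plan, Limit) and isinstance(plan.child, Join):
        n, j = plan.n, plan.child
        if j.join_type == "rightouter":
            return Limit(n, Join(j.left, Limit(n, j.right), j.join_type))
        if j.join_type == "leftouter":
            return Limit(n, Join(Limit(n, j.left), j.right, j.join_type))
        if j.join_type == "fullouter":
            return Limit(n, Join(Limit(n, j.left), Limit(n, j.right), j.join_type))
    return plan  # do nothing for the other join types

plan = Limit(5, Join(Leaf("A"), Leaf("B"), "leftouter"))
rewritten = push_limit_through_outer_join(plan)
```

For a left outer join this produces `Limit(5, Join(Limit(5, A), B, leftouter))`, i.e. the extra limit lands on the preserved side only.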




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48393687
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, CombineLimits(Limit(expr, right)), joinType, joinCondition))
    --- End diff --
    
    ah, it's right, nvm




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48386195
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    I feel we will generate wrong results. It is not safe to push down the limit if we are not joining a foreign key to a primary key, right? For example, for a left outer join, we push down the limit to the right table. It is possible that all rows returned by the right side have the same join column value.
    
    Am I missing anything?
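The concern above can be made concrete with a plain-Python sketch of left-outer-join semantics (toy tuples standing in for rows, `None` standing in for NULL; the data is made up for illustration). Limiting the non-preserved (right) side before the join can manufacture a null-extended row that the true result never contains:

```python
def left_outer_join(left, right):
    """LEFT OUTER JOIN on the first field; None stands in for NULL."""
    out = []
    for l in left:
        matches = [r for r in right if l[0] == r[0]]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))  # null-extend unmatched preserved rows
    return out

L = [(2, "x")]
R = [(1, "a"), (2, "c")]

# Correct plan: join first, then LIMIT 1.
true_result = left_outer_join(L, R)[:1]        # [((2, 'x'), (2, 'c'))]

# Unsafe plan: LIMIT 1 pushed to the non-preserved (right) side,
# which happens to drop the matching row for key 2.
pushed_result = left_outer_join(L, R[:1])[:1]  # [((2, 'x'), None)]
```

The pushed-down plan returns `((2, 'x'), None)`, a row that does not appear anywhere in the unrestricted join result. (Note that the rule in the diff limits the preserved side, not the right side; this sketch shows why the other direction would be wrong.)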




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48389162
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    Need to refine the idea. 
    
    For a `full outer` join: 
    1) If both sides already have a `limit`, do nothing. 
    2) If only one side has a `limit`, add an extra `limit` to that side. 
    3) If neither side has a `limit`, add an extra `limit` to the side with the larger size estimate (`statistics`).
    
    Is that better?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48388692
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    For example, suppose we have a table A containing 1, 2, 3, 4, 5 (say k is the column name). If we run `A x FULL OUTER JOIN A y ON (x.k = y.k) LIMIT 2` and push the limit to both sides, it is possible that we get `1, 2` from the left side and `3, 4` from the right side, right?
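This counterexample can be sketched in plain Python (toy full-outer-join semantics on lists, `None` for NULL; slicing simulates a nondeterministic limit picking particular rows):

```python
def full_outer_join(left, right):
    """FULL OUTER JOIN on value equality; None stands in for NULL."""
    out = []
    for l in left:
        matches = [r for r in right if l == r]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))
    for r in right:
        if not any(l == r for l in left):
            out.append((None, r))
    return out

A = [1, 2, 3, 4, 5]

# Correct plan: self full outer join on k, then LIMIT 2.
true_rows = full_outer_join(A, A)  # every row is a matched pair (k, k)

# Pushed-down plan where the nondeterministic limits happen to keep
# [1, 2] on the left and [3, 4] on the right: no keys overlap.
pushed_rows = full_outer_join(A[:2], A[2:4])[:2]  # [(1, None), (2, None)]
```

The pushed-down plan returns null-extended rows such as `(1, None)`, which cannot appear in the true result, where every row is `(k, k)`.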




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167138258
  
    The new fix contains two parts:
    
    1) Since the full outer join will remove the duplicates, we are unable to add extra `limit` operators to both sides. As long as we can ensure the completeness of one child, the generated results will still be correct, as with the left/right outer join. 
    
    2) There is often a `Projection` on top of the `Join` operator. We can also push the limit through this kind of combination. 
    
    Thank you for your reviews and comments! @yhuai @cloud-fan @viirya 
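Part 2 above (pushing through a projection over a join) can be sketched with a toy plan model in Python. This is a hedged illustration only: the node classes, the `leftouter`-only case, and the simplified `Join` without a join condition are assumptions for the sketch, not the Catalyst implementation.

```python
from dataclasses import dataclass

# Toy stand-ins for Catalyst logical plan nodes (not the real classes).
@dataclass(frozen=True)
class Leaf:
    name: str

@dataclass(frozen=True)
class Project:
    columns: tuple
    child: object

@dataclass(frozen=True)
class Join:
    left: object
    right: object
    join_type: str

@dataclass(frozen=True)
class Limit:
    n: int
    child: object

def push_limit_through_project_over_join(plan):
    """Handle Limit -> Project -> Join by limiting the preserved side
    of a left outer join underneath the Project."""
    if (isinstance(plan, Limit)
            and isinstance(plan.child, Project)
            and isinstance(plan.child.child, Join)
            and plan.child.child.join_type == "leftouter"):
        n, proj, j = plan.n, plan.child, plan.child.child
        return Limit(n, Project(proj.columns,
                                Join(Limit(n, j.left), j.right, j.join_type)))
    return plan

plan = Limit(3, Project(("a",), Join(Leaf("A"), Leaf("B"), "leftouter")))
rewritten = push_limit_through_project_over_join(plan)
```

The `Project` stays in place; only an extra `Limit` is inserted on the preserved side below it.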




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48388339
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    I am not sure of the best way to prove it. If we can add an extra `limit` below `union all`, `left outer`, and `right outer`, can we also add an extra `limit` below `full outer`? In a traditional RDBMS, ```full outer = union all(left outer, right outer)```. I am not sure whether Spark SQL has the same semantics. 
    
    `(A full outer join B) limit 5` 
    = `(A left outer join B) limit 5` `union all` `(A right outer join B) limit 5`
    = `((A limit 5) left outer join (B limit 5))` `union all` `((A limit 5) right outer join (B limit 5))` 
    = `((A limit 5) full outer join (B limit 5))`
    
    
    However, an inner join has a big issue if we try to add an extra `limit`. I am not sure my answer is clear. 




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48385881
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    --- End diff --
    
    We also need to push it down through the `Projection` on top of the `Join`. 




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-167077729
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48298/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48388578
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    sorry, I found a hole. 




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48386444
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    --- End diff --
    
    oh, sorry. I was looking at the wrong line. For a full outer join, is it safe? Also, with the limit, the result of a left/right outer join may not be deterministic, right?




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10454#discussion_r48400022
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
    @@ -858,6 +859,30 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
     }
     
     /**
    +  * Push [[Limit]] operators through [[Join]] operators, iff the join type is an outer join.
    +  * Adds extra [[Limit]] operators on top of the outer-side child (or children).
    +  */
    +object PushLimitThroughOuterJoin extends Rule[LogicalPlan] with PredicateHelper {
    +
    +  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    +    case f @ Limit(expr, Join(left, right, joinType, joinCondition)) =>
    +      joinType match {
    +        case RightOuter =>
    +          Limit(expr, Join(left, CombineLimits(Limit(expr, right)), joinType, joinCondition))
    --- End diff --
    
    That is right. However, I think checking whether the limit has already been pushed down would avoid applying this rule and `CombineLimits` repeatedly. The result should be the same either way, so it is minor.
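For reference, the effect of combining the nested limits created by repeated pushdown can be sketched as follows. This is a simplified model of the idea, not the actual `CombineLimits` rule (which works on Catalyst expressions rather than plain integers):

```python
from dataclasses import dataclass

# Toy stand-ins for logical plan nodes (not the real Catalyst classes).
@dataclass(frozen=True)
class Leaf:
    name: str

@dataclass(frozen=True)
class Limit:
    n: int
    child: object

def combine_limits(plan):
    """Collapse Limit(a, Limit(b, child)) into Limit(min(a, b), child),
    so repeated pushdown does not stack redundant Limit nodes."""
    if isinstance(plan, Limit) and isinstance(plan.child, Limit):
        return combine_limits(Limit(min(plan.n, plan.child.n),
                                    plan.child.child))
    return plan

# A Limit(5, ...) pushed onto a side that already had Limit(5, ...):
collapsed = combine_limits(Limit(5, Limit(5, Leaf("B"))))
```

Applying the pushdown rule more than once therefore converges: the stacked limits collapse back to a single node with the smallest row count.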




[GitHub] spark pull request: [SPARK-12505] [SQL] Pushdown a Limit on top of...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10454#issuecomment-166993615
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48252/
    Test PASSed.

