You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Baike Xia (Code Review)" <ge...@cloudera.org> on 2022/08/18 11:35:50 UTC

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION: Transforms: - Limit - Union - relation1 - relation2 .. Into: - Limit - Union - Limit - relation1 - Limit - relation2

Baike Xia has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18862


Change subject: IMPALA-11485: Pushdown LIMIT through UNION: Transforms:    - Limit       - Union          - relation1          - relation2          .. Into:    - Limit       - Union          - Limit             - relation1          - Limit             - relation2        
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
A testdata/workloads/tpch/queries/limit-pushdown-outer-join.test
A testdata/workloads/tpch/queries/limit-pushdown-union.test
10 files changed, 246 insertions(+), 27 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/1
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 1
Gerrit-Owner: Baike Xia <xi...@163.com>

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Pushdown LIMIT through UNION:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
A testdata/workloads/functional-query/queries/limit-pushdown-outer-join.test
A testdata/workloads/functional-query/queries/limit-pushdown-union.test
A tests/query_test/test_limit_pushdown.py
M tests/query_test/test_observability.py
13 files changed, 281 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/3
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 3
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Pushdown LIMIT through UNION ALL:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
A testdata/workloads/functional-query/queries/limit-pushdown-outer-join.test
A testdata/workloads/functional-query/queries/limit-pushdown-union.test
A tests/query_test/test_limit_pushdown.py
M tests/query_test/test_observability.py
13 files changed, 452 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/6
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 6
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Jian Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18862/4/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18862/4/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@396
PS4, Line 396:   private static PlanNode findNodeCanLimitPushdown(PlanNode root) {
This function can only find the first `union all` or `join` node in the plan tree and push the `limit` down.

However, I think we should find all the descendant `union all` or `join` nodes and push the `limit` down. For example:

limit 10
  -> left outer join
    -> left outer join
      -> table scan t1
      -> table scan t2
    -> table scan t3

can be optimized to the following plan after applying limit pushdown:

limit 10
  -> left outer join
    -> limit 10
      -> left outer join
        -> limit 10
          -> table scan t1
        -> table scan t2
    -> table scan t3


http://gerrit.cloudera.org:8080/#/c/18862/4/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@397
PS4, Line 397:     if (root == null || root instanceof UnionNode || root instanceof HashJoinNode) {
Should we check whether it's `JoinNode` instead of `HashJoinNode` since this happens in the single-node planning phase, in which the physical plan is not determined?

I noticed there is another join implementation called `NestedLoopJoin`, it should also be covered by this optimization



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 4
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 13 Sep 2022 02:33:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8682/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 7
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 11 Oct 2022 04:05:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION: Transforms: - Limit - Union - relation1 - relation2 .. Into: - Limit - Union - Limit - relation1 - Limit - relation2

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION: Transforms:    - Limit       - Union          - relation1          - relation2          .. Into:    - Limit       - Union          - Limit             - relation1          - Limit             - relation2        
......................................................................


Patch Set 1:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11180/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 1
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Aug 2022 11:57:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 7: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 7
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 11 Oct 2022 09:16:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION: Transforms: - Limit - Union - relation1 - relation2 .. Into: - Limit - Union - Limit - relation1 - Limit - relation2

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION: Transforms:    - Limit       - Union          - relation1          - relation2          .. Into:    - Limit       - Union          - Limit             - relation1          - Limit             - relation2        
......................................................................


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8463/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 1
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Aug 2022 12:00:20 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8674/


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 16:44:01 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Pushdown LIMIT through UNION:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
A testdata/workloads/functional-query/queries/limit-pushdown-outer-join.test
A testdata/workloads/functional-query/queries/limit-pushdown-union.test
A tests/query_test/test_limit_pushdown.py
M tests/query_test/test_observability.py
13 files changed, 340 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/5
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11976/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 8
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 06 Dec 2022 11:26:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Jian Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18862/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18862/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@412
PS5, Line 412:     if (root.getChildCount() > 0) {
             :       findNodeCanLimitPushdown(root.getChild(0), pushdownNodes);
             :     }
             :     if (root.getChildCount() > 1) {
             :       findNodeCanLimitPushdown(root.getChild(1), pushdownNodes);
             :     }
1. for the union node, it's better to go through all its children
2. for the join node, it's incorrect to go through the inner side. for example, `s right join t`, it's incorrect to find the node that can push down the limit in `s`. You can add the test to verify the result:

select *
from (
    select * from s1
    union all
    select * from s2
) tmp1
right join (
    select * from t1
    union all
    select * from t2
) tmp2
on tmp1.id=tmp2.id
limit 1;



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 11:31:53 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 5:

(2 comments)

Started looking at the change, first from the aspect of correctness.

Thanks for working on this, great optimization!

http://gerrit.cloudera.org:8080/#/c/18862/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18862/5/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@368
PS5, Line 368: UNION
Shouldn't we push down LIMIT for UNION ALL only, and not for UNION? UNION also filters duplicates for children, so a child that returns 1000 rows may lead to only 1 row in the output.


http://gerrit.cloudera.org:8080/#/c/18862/5/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
File testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test:

http://gerrit.cloudera.org:8080/#/c/18862/5/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test@8
PS5, Line 8: limit 5;
Can you also add a test with an OFFSET clause?

I think that we should push down LIMIT + OFFSET as limit in that case, as simply pushing down limit can lead to returning too few rows from child nodes.



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 18:58:02 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Pushdown LIMIT through UNION:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
A testdata/workloads/functional-query/queries/limit-pushdown-outer-join.test
A testdata/workloads/functional-query/queries/limit-pushdown-union.test
A tests/query_test/test_limit_pushdown.py
M tests/query_test/test_observability.py
13 files changed, 281 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/4
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 4
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11601/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 7
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Tue, 11 Oct 2022 04:24:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8677/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 6
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 08:36:36 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18862/3/tests/query_test/test_limit_pushdown.py
File tests/query_test/test_limit_pushdown.py:

http://gerrit.cloudera.org:8080/#/c/18862/3/tests/query_test/test_limit_pushdown.py@22
PS3, Line 22: class TestLimitPushdown(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 3
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Aug 2022 02:28:18 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11203/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 4
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 22 Aug 2022 16:39:54 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11197/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 3
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Aug 2022 02:48:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18862/5/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
File testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test:

http://gerrit.cloudera.org:8080/#/c/18862/5/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test@8
PS5, Line 8: limit 5;
> Of course, i will add a test with an OFFSET clause. But, because OFFSET req
Oops, you are right, I was confused because of the second paragraph of https://impala.apache.org/docs/build/html/topics/impala_offset.html , so I thought that Impala supports offset without order by (which doesn't really make sense, as Impala has no ordering guarantees without order by)



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 6
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 16:13:18 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Pushdown LIMIT through UNION ALL:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
A testdata/workloads/functional-query/queries/limit-pushdown-outer-join.test
A testdata/workloads/functional-query/queries/limit-pushdown-union.test
A tests/query_test/test_limit_pushdown.py
M tests/query_test/test_observability.py
14 files changed, 467 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/7
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 7
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 5:

> Patch Set 4:
> 
> (2 comments)

Yes, you are right. I changed it.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 11:05:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11580/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 11:21:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/8674/ DRY_RUN=true


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 5
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 11:41:51 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11587/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 6
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 08:56:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION: Transforms: - Limit - Union - relation1 - relation2 .. Into: - Limit - Union - Limit - relation1 - Limit - relation2

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION: Transforms:    - Limit       - Union          - relation1          - relation2          .. Into:    - Limit       - Union          - Limit             - relation1          - Limit             - relation2        
......................................................................


Patch Set 1: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8463/


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 1
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Aug 2022 16:54:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Jian Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18862/7/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18862/7/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@370
PS7, Line 370:   private static void checkAndApplyLimitPushdown(PlanNode root,
Maybe a better implementation can be this:

void applyLimitPushdownOptimization(PlanNode cur, long limit, ...) {
    if (cur instance of SortNode
        || cur instance of AggregationNode
        || cur instance of SubplanNode
        || cur instance of AnalyticEvalNode) {
        // stop finding nodes can pushdown limit when we meet pipeline breakers
        return
    } else if (cur instanceof UnionNode) {
        // pushdown limit to union all node
        cur.setLimit(limit);
        cur.computeStats(analyzer);

        // try to apply limit pushdown to every child
        for (child : cur.ChildNodes()) {
            applyLimitPushdownOptimization(child, limit, ...)
        }
    } else if (cur instance of JoinNode) {
        // pushdown limit to the outer side of join node
        outerNode = cur.GetOuterNode();
        outerNode.setLimit(limit);
        outerNode.computeStats(analyzer);

        // try to apply limit pushdown only to the outer side child
        applyLimitPushdownOptimization(outerNode, limit, ...)
    } else {
        // try to apply limit pushdown to every child of non pipeline breaker
        for (child : cur.ChildNodes()) {
            applyLimitPushdownOptimization(child, limit, ...)
        }
    }
}


http://gerrit.cloudera.org:8080/#/c/18862/7/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@409
PS7, Line 409: root instanceof JoinNode
if we stop finding nodes can be pushdown in JoinNode, the following scenario cannot be optimized:

select *
from (
    select * from t1
    unioin all
    select * from t2
) t
left join s
on t.id=s.id
limit 1;

the expected optimized equivalent is:

select *
from (
    select * from t1 limit 1
    unioin all
    select * from t2 limit 1
    limit 1
) t
left join s
on t.id=s.id
limit 1;



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 7
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 12 Oct 2022 09:38:25 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/8677/


-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 6
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 13:46:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
File testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test:

http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test@2
PS7, Line 2: Base case
Can you add an even simpler test, e.g. with simpler join condition and no where clause?


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test@5
PS7, Line 5: testtbl
I think that it would be better to use a non-empty table like in functional.alltypes. Empty tables may be optimized out later, leading these tests to fail.


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test@8
PS7, Line 8: t1.zip + t2.zip = 10 and t1.zip + t2.zip = 20
This seems to be always false to me - one day the planner may be able to deduce this, so I would prefer a condition that can be true.


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test@27
PS7, Line 27:    limit: 5
I am not sure whether it was correct to push down limit in this case - the issue is with the predicates in WHERE clause that use columns from both t1 and t2. 

Let's assume that only a single row is returned from Node 01.
If in node 00 the first 5 rows pass a1.id>0, but will fail t1.zip + t2.zip=10 with the row from the right side, then the query will return 0 rows. Meanwhile it is possible that there was another row in the table that would satisfy all predicates.


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test@133
PS7, Line 133: |  |--05:SCAN HDFS [functional.testtbl]
             : |  |     HDFS partitions=1/1 files=0 size=0B
             : |  |     row-size=24B cardinality=0
             : |  |
             : |  04:SCAN HDFS [functional.testtbl]
             : |     HDFS partitions=1/1 files=0 size=0B
             : |     row-size=24B cardinality=0
Why didn't we push down the limit to these scan nodes?


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
File testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test:

http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test@2
PS7, Line 2: Base case.
Can you add 1-2 more complex tests? Some ideas:
- multiple UNION ALL in a query
- UNION ALL with a constant subquery like SELECT 1 (a potential optimization would be to subtract the constant rows from the LIMIT)
- UNION ALL with both predicates in the outer query and in the inner query


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/topn.test
File testdata/workloads/functional-planner/queries/PlannerTest/topn.test:

http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-planner/queries/PlannerTest/topn.test@464
PS7, Line 464: # test that top-n is not placed below an unpartitioned exchange with a limit
The optimization led to removing the exchange node, so this test no longer checks what it should (according to this comment). Maybe increasing the limit could lead to keeping the distributed plan?


http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/18862/7/testdata/workloads/functional-query/queries/QueryTest/joins.test@483
PS7, Line 483: order by a.int_col 
Why is this "order by" needed here?


http://gerrit.cloudera.org:8080/#/c/18862/7/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/18862/7/tests/query_test/test_observability.py@251
PS7, Line 251:     # There are 3 exchange nodes and each appears in the profile 2 times (for 1 fragment
             :     # instance + the averaged fragment).
             :     assert results.runtime_profile.count("EXCHANGE_NODE") == 8
The comment and the assert no longer match. Also, I am curious why was the number of exchange nodes increased.



-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 7
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Comment-Date: Wed, 19 Oct 2022 15:27:58 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Posted by "Baike Xia (Code Review)" <ge...@cloudera.org>.
Baike Xia has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/18862 )

Change subject: IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN
......................................................................

IMPALA-11485: Pushdown LIMIT through UNION ALL and LEFT/RIGHT OUTER JOIN

Pushdown LIMIT through UNION ALL:
Transforms:
   - Limit
      - Union
         - relation1
         - relation2
         ..
Into:
   - Limit
      - Union
         - Limit
            - relation1
         - Limit
            - relation2
         ..

Pushdown LIMIT through LEFT/RIGHT OUTER JOIN:
Transforms:
   - Limit
      - Join
         - left source
         - right source
Into:
   - Limit
      - Join
         - Limit (present if Join is left outer)
            - left source
         - Limit (present if Join is right outer)
            - right source

Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
---
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-outer-join.test
A testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-union.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn.test
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
A testdata/workloads/functional-query/queries/limit-pushdown-outer-join.test
A testdata/workloads/functional-query/queries/limit-pushdown-union.test
A tests/query_test/test_limit_pushdown.py
M tests/query_test/test_observability.py
14 files changed, 469 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/18862/8
-- 
To view, visit http://gerrit.cloudera.org:8080/18862
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia5d040c0a98e60639d7ce4b25ecf07a859c8a32c
Gerrit-Change-Number: 18862
Gerrit-PatchSet: 8
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Aman Sinha <am...@cloudera.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>