Posted to reviews@impala.apache.org by "Jian Zhang (Code Review)" <ge...@cloudera.org> on 2022/07/07 02:51:48 UTC

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Jian Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18705 )


Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and only slots from the outer side are used by the
parent.
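
The reasoning can be checked with a small in-memory sketch. This is not
the planner code: the class and method names are hypothetical, and rows
are modelled as bare join-key values. It shows that a left join against
a unique inner key returns each outer row exactly once, so dropping the
inner side does not change the outer-only result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the Impala patch.
class OuterJoinEliminationDemo {
  /** Left outer join on sid = id that projects only the outer-side key. */
  static List<Long> leftJoinOuterOnly(List<Long> outerSids, List<Long> innerIds) {
    // Count how many inner rows carry each non-NULL id.
    Map<Long, Integer> matches = new HashMap<>();
    for (Long id : innerIds) {
      if (id != null) matches.merge(id, 1, Integer::sum);
    }
    List<Long> result = new ArrayList<>();
    for (Long sid : outerSids) {
      // A left join emits an unmatched (or NULL-keyed) outer row exactly once,
      // and a matched outer row once per matching inner row.
      int copies = (sid == null) ? 1 : Math.max(1, matches.getOrDefault(sid, 0));
      for (int i = 0; i < copies; i++) result.add(sid);
    }
    return result;
  }
}
```

If the inner key is not unique, a matched outer row is emitted more than
once, which is why the uniqueness guarantee is required.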

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
13 files changed, 101 insertions(+), 19 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/2
-- 
To view, visit http://gerrit.cloudera.org:8080/18705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 2
Gerrit-Owner: Jian Zhang <zj...@gmail.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2135
PS8, Line 2135:  !=
Shouldn't we allow <= here? If all primary key columns are included in the join condition, then we can assume that there can't be duplicates. Additional join conditions won't really make a difference.
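
A minimal sketch of the containment idea in this comment, not the code under review: the names joinCoversPrimaryKey, pkColumns and innerJoinColumns are hypothetical, and columns are modelled as plain strings. The check passes whenever every primary key column appears among the inner-side equi-join columns, even if the join uses extra columns.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of the Impala patch.
class PkCoverageCheck {
  /** Returns true if every primary key column is one of the inner-side join columns. */
  static boolean joinCoversPrimaryKey(List<String> pkColumns, List<String> innerJoinColumns) {
    // Without a declared key there is no uniqueness guarantee to rely on.
    if (pkColumns.isEmpty()) return false;
    Set<String> joinCols = new HashSet<>(innerJoinColumns);
    // Extra join columns can only filter matches further; they cannot add duplicates.
    return joinCols.containsAll(pkColumns);
  }
}
```

A strict size-equality test would reject a join on (id, value) against primary key (id), although that join still matches at most one inner row per outer row.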




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 8
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Sun, 09 Oct 2022 17:40:24 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18705/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/9/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2070
PS9, Line 2070:         && canEliminateInnerPlan(analyzer, inner, innerRef, eqJoinConjuncts, otherJoinConjuncts)) {
line too long (99 > 90)




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 9
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 08:31:07 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/10938/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 2
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Jul 2022 03:12:55 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9035/ DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 11
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 08 Feb 2023 11:05:11 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18705

to look at the new patch set (#5).

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
13 files changed, 103 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 5
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 12:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12346/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 12
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Thu, 09 Feb 2023 16:13:32 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 11:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12337/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 11
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 08 Feb 2023 11:15:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3178
PS6, Line 3178:   public Set<TupleDescriptor> materializeSlots(List<Expr> exprs, boolean areJoinOnConds) {
I think that it would be better to keep a materializeSlots(List<Expr> exprs) overload that calls this function with a false argument. This would reduce the size of this change.

Another solution would be to create another function that only deals with the join case.
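
The first option could look roughly like the sketch below. It is only an illustration under stated assumptions: SlotMaterializer is a hypothetical stand-in for the Analyzer, expressions are modelled as strings rather than Expr, and the method bodies are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Analyzer; not the real Impala class.
class SlotMaterializer {
  // Records what was materialized, so the delegation is observable.
  final List<String> log = new ArrayList<>();

  /** Existing call sites keep using this signature and stay untouched. */
  void materializeSlots(List<String> exprs) {
    materializeSlots(exprs, false);
  }

  /** Only join-related callers pass the extra flag explicitly. */
  void materializeSlots(List<String> exprs, boolean areJoinOnConds) {
    for (String e : exprs) {
      log.add(e + (areJoinOnConds ? ":join" : ":plain"));
    }
  }
}
```

With the delegating overload in place, files that never deal with join conditions would not need to change.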


http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
File testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test:

http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@1
PS6, Line 1: select catalog_sales.cs_item_sk, catalog_sales.cs_order_number
Can you add a few more tests? I was thinking about the following cases:
- the column in the join condition from the left table also occurs in the select list
- a column from the right side is also used in the where clause (we should not eliminate the join in this case)
- the join happens in an inline view or with clause
- see what happens in case of right join (even if it is unoptimized)
- see what happens in case of full join (even if it is unoptimized)
- see what happens if there are more than 2 tables joined


Also, these queries could also be tested in an EE test to ensure that the optimization does not change the results.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 6
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 28 Sep 2022 11:41:35 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, Xiang Yang, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18705

to look at the new patch set (#7).

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
9 files changed, 289 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 7
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/18705/12/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/12/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@682
PS12, Line 682:       if (root != null) root = createSubplan(root, subplanRefs, false, analyzer);
It seems root should never be null here now, and the setId call below would throw an NPE if it were. It would probably be better to check above, where root is initially assigned.
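
One way to read this suggestion, sketched with the standard library only: PlanRootCheck and buildPlan are hypothetical names, strings stand in for plan nodes, and the string concatenation stands in for createSubplan. The null check moves to the point of first assignment, so later steps need no conditional.

```java
import java.util.Objects;

// Hypothetical illustration only; not part of the Impala patch.
class PlanRootCheck {
  static String buildPlan(String initialRoot) {
    // Fail fast where the root is first assigned instead of guarding each later use.
    String root = Objects.requireNonNull(initialRoot, "plan root must not be null");
    // Later steps can now use root unconditionally.
    root = root + "+subplan";
    return root;
  }
}
```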




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 12
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 05 Apr 2023 14:25:18 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18705

to look at the new patch set (#3).

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
13 files changed, 104 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 3
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Xiang Yang (Code Review)" <ge...@cloudera.org>.
Xiang Yang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 6:

(2 comments)

Hi Jian, thanks for your contribution!

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2117
PS6, Line 2117:     while (conjunctIter.hasNext()) {
Since you don't remove any elements from eqJoinConjuncts, you can use 'for (BinaryPredicate conjunct : eqJoinConjuncts) {...}' to iterate over the list.


http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2123
PS6, Line 2123:       SlotRef innerSlotRef = eq.getChild(1).unwrapSlotRef(true);
Must the 2nd child belong to the right table?




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 6
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 28 Sep 2022 11:02:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 8:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11571/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 8
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Sat, 08 Oct 2022 12:36:31 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Xiang Yang (Code Review)" <ge...@cloudera.org>.
Xiang Yang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 11:

(2 comments)

> Patch Set 9:
> 
> Please provide more context on the actual queries and table sizes you are targeting along with some performance measurements with and without this optimization.

Hi Kurt, taking the 'tpcds.inventory' table as an example,
the performance measurement results are here:
https://issues.apache.org/jira/secure/attachment/13055252/performance_compare.txt

http://gerrit.cloudera.org:8080/#/c/18705/9/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/18705/9/common/thrift/Query.thrift@602
PS9, Line 602:   149: optional bool trust_pk_fk_constraints = false
> I think that this should be false by default, otherwise this is a potential
Done in patch 11


http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
File testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test:

http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@1
PS6, Line 1: # Test Case Summary:
> Can you add some simple EE tests too? e.g. on of the planner tests in https
Done in patch 11.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 11
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 08 Feb 2023 12:08:23 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 3:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11198/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 3
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 22 Aug 2022 06:48:25 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Jian Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3178
PS6, Line 3178:   public Set<TupleDescriptor> materializeSlots(List<Expr> exprs, boolean areJoinOnConds) {
> I think that it would be better to keep a  materializeSlots(List<Expr> expr
Done. I used the first method to reduce the change size.


http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2117
PS6, Line 2117:     while (conjunctIter.hasNext()) {
> since you didn't remove eqJoinConjuncts's element, you can use 'for (Binary
Done


http://gerrit.cloudera.org:8080/#/c/18705/6/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2123
PS6, Line 2123:       SlotRef innerSlotRef = eq.getChild(1).unwrapSlotRef(true);
> is the 2nd child must belong to the right table?
Right, only left joins are optimized at present.


http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
File testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test:

http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@1
PS6, Line 1: select catalog_sales.cs_item_sk, catalog_sales.cs_order_number
> Can you add a few more tests? I was thinking about the following cases:
Done. Some cases that are not optimized yet can be handled in follow-up changes, for example right outer joins, joins on inline views, and queries that use only distinct values of the outer table (regardless of whether the inner join key is unique).



-- 
To view, visit http://gerrit.cloudera.org:8080/18705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 6
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Sat, 08 Oct 2022 12:14:58 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 8:

(4 comments)

Thanks for the changes!

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@600
PS8, Line 600: false
Can you rewrite this to use the single argument version? (same for AnalyticInfo.java)


http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
File testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test:

http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@15
PS8, Line 15: 2.05M
Interesting. Why did the cardinality change?


http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@43
PS8, Line 43: not
According to the plan the right side was actually eliminated.


http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@59
PS8, Line 59:  Left outer join an inline view 
Can you add another one with an inline view where the whole JOIN is included in the view?

I think that this is an important use case for this optimization, as users are more likely to query a view this way than to write a join whose right side is never used.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 8
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Sat, 08 Oct 2022 14:24:37 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Xiang Yang (Code Review)" <ge...@cloudera.org>.
Xiang Yang has uploaded a new patch set (#12) to the change originally created by Jian Zhang. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and only slots from the outer side are used by the
parent.

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
A testdata/workloads/tpcds/queries/outer-join-elimination.test
M tests/query_test/test_join_queries.py
11 files changed, 536 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/12

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 12
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 6:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11398/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 6
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 21 Sep 2022 12:07:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 10:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/12336/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 10
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 08 Feb 2023 10:30:14 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18705/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/10/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2071
PS10, Line 2071:         && canEliminateInnerPlan(analyzer, inner, innerRef, eqJoinConjuncts, otherJoinConjuncts)) {
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/18705/10/tests/query_test/test_join_queries.py
File tests/query_test/test_join_queries.py:

http://gerrit.cloudera.org:8080/#/c/18705/10/tests/query_test/test_join_queries.py@89
PS10, Line 89: d
flake8: F811 redefinition of unused 'test_outer_joins' from line 83
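
For context on why F811 matters here, a minimal Python sketch (the class name and return values are hypothetical, not taken from test_join_queries.py): when two methods in one class share a name, the second definition silently replaces the first, so the first test body would never run.

```python
class TestJoinQueriesSketch:
    """Hypothetical test class, only meant to show name shadowing."""

    def test_outer_joins(self):
        return "first"

    # Same name again: this definition replaces the one above.
    def test_outer_joins(self):  # noqa: F811 (duplicate kept for the demo)
        return "second"


# Only one attribute with that name survives in the class namespace.
surviving = [n for n in vars(TestJoinQueriesSketch) if n.startswith("test_")]
```

Renaming one of the two methods is the usual fix, so that both test bodies are collected.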


http://gerrit.cloudera.org:8080/#/c/18705/10/tests/query_test/test_join_queries.py@121
PS10, Line 121: class TestJoinElimination(ImpalaTestSuite):
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/18705/10/tests/query_test/test_join_queries.py@131
PS10, Line 131: \
flake8: E502 the backslash is redundant between brackets




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 10
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 08 Feb 2023 10:08:50 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Jian Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2135
PS8, Line 2135:  !=
> Shouldn't we enable <= ? So if all primary key columns are included in the 
Sure. We should change it to >. The outer join should not be eliminated if the number of primary key columns is greater than the number of equality conditions in the join.

A test is also added for this case.
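
A rough sketch of the idea behind this check, written in Python for brevity rather than the Java of SingleNodePlanner. The function name and the representation of the conjuncts as a list of inner-side column names are assumptions for illustration, not the code in the patch. The inner side can match at most one row per outer row only if every primary key column is bound by an equality condition; extra equality conditions on non-key columns do no harm.

```python
def pk_covered_by_eq_conjuncts(pk_columns, inner_eq_columns):
    """Return True if every primary key column of the inner table
    appears on the inner side of some equi-join conjunct.

    pk_columns: primary key column names of the inner table.
    inner_eq_columns: inner-side column names used in equality conditions.
    Both inputs are hypothetical simplifications of the planner's data.
    """
    if not pk_columns:
        # No declared primary key: uniqueness cannot be assumed.
        return False
    return set(pk_columns) <= set(inner_eq_columns)
```

With a composite key (a, b) and a join on a alone, the function returns False, which corresponds to the case where there are more primary key columns than equality conditions.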


http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
File testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test:

http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@43
PS8, Line 43: not
> According to the plan the right side was actually eliminated.
This was a bug in how we validated whether slots are used only in join conjuncts. Slots referenced in the equality conjuncts and in the other join conjuncts should have their `isOnlyInJoinConds_` set to true before we check whether the outer join can be eliminated.

I've fixed it in the newest patch.


http://gerrit.cloudera.org:8080/#/c/18705/8/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@59
PS8, Line 59:  Left outer join an inline view 
> Can you add another one with an inline view where the whole JOIN is include
Done. But since there is no column pruning before this optimization, the outer join in the inline view is not eliminated yet. I'll file this as a follow-up optimization in Jira.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 8
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 08:38:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Csaba Ringhofer (Code Review)" <ge...@cloudera.org>.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18705/9/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/18705/9/common/thrift/Query.thrift@602
PS9, Line 602:   149: optional bool trust_pk_fk_constraints = true;
I think that this should be false by default. Otherwise this is a potentially breaking change, because FK/PK constraints are not enforced. Users who trust their constraints can turn it on by default using the default_query_options flag.
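
To illustrate the concern, a small Python sketch (not Impala code; the function name and sample rows are made up for illustration, and SQL NULL semantics are not modeled). It emulates `select t.* from t left join s on t.sid = s.id` and shows that the result equals a plain scan of t only when s.id is really unique.

```python
def left_outer_join_outer_rows(outer_rows, inner_keys, key_index=0):
    """Emulate a left outer join that projects only the outer columns.

    outer_rows: rows of t as tuples; inner_keys: the values of s.id.
    Each outer row is emitted once per matching inner key, or once
    if nothing matches, as a left outer join would do.
    """
    result = []
    for row in outer_rows:
        matches = sum(1 for k in inner_keys if k == row[key_index])
        result.extend([row] * max(matches, 1))
    return result


t_rows = [(1, 10), (2, 20), (3, 30)]

# s.id really is unique: the join returns exactly the rows of t.
unique_ids = [1, 2]
# s.id is declared as a primary key but contains a duplicate.
duplicate_ids = [1, 1, 2]

with_unique = left_outer_join_outer_rows(t_rows, unique_ids)
with_duplicates = left_outer_join_outer_rows(t_rows, duplicate_ids)
```

Here `with_unique` equals `t_rows`, while `with_duplicates` contains the row (1, 10) twice, so replacing the join with a scan of t would change the query result when the declared constraint does not hold.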


http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
File fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java:

http://gerrit.cloudera.org:8080/#/c/18705/8/fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java@600
PS8, Line 600: 
> Can you rewrite this to use the single argument version? (same for Analytic
Can you do this rewrite to reduce the footprint of the change?


http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
File testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test:

http://gerrit.cloudera.org:8080/#/c/18705/6/testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test@1
PS6, Line 1: # Test Case Summary:
> Done. Some unoptimized cases can be optimized in the following changes, for
Can you add some simple EE tests too? For example, one of the planner tests in https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/outer-joins.test, run with the query option both enabled and disabled.

Large results could be avoided by adding order by + limit or an aggregation on the result.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 9
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Tue, 08 Nov 2022 14:06:19 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 12: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 12
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Thu, 09 Feb 2023 21:01:58 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 5:

Build Failed 

https://jenkins.impala.io/job/gerrit-code-review-checks/11397/ : Initial code review checks failed. See linked job for details on the failure.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 5
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 21 Sep 2022 06:26:42 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Hello Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18705

to look at the new patch set (#6).

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and only slots from the outer side are used by the
parent.

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
16 files changed, 127 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/6

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 6
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/18705/2/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java:

http://gerrit.cloudera.org:8080/#/c/18705/2/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@145
PS2, Line 145:   public boolean isOnlyInJoinConds() { return isOnlyInJoinConds_ == null ? true : isOnlyInJoinConds_; }
line too long (103 > 90)


http://gerrit.cloudera.org:8080/#/c/18705/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

http://gerrit.cloudera.org:8080/#/c/18705/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2030
PS2, Line 2030:       boolean canEliminate = canEliminateInnerPlan(analyzer, inner, innerRef, eqJoinConjuncts);
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/18705/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2105
PS2, Line 2105:    private boolean canEliminateInnerPlan(Analyzer analyzer, PlanNode innerPlan, TableRef innerRef, List<BinaryPredicate> eqJoinConjuncts) {
line too long (139 > 90)


http://gerrit.cloudera.org:8080/#/c/18705/2/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java@2132
PS2, Line 2132:     List<SQLPrimaryKey> pks = innerRef.getResolvedPath().destTable().getSqlConstraints().getPrimaryKeys();
line too long (106 > 90)




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 2
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 07 Jul 2022 02:52:43 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Kurt Deschler (Code Review)" <ge...@cloudera.org>.
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 9:

Please provide more context on the actual queries and table sizes you are targeting along with some performance measurements with and without this optimization.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 9
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Tue, 08 Nov 2022 14:12:26 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, Xiang Yang, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18705

to look at the new patch set (#8).

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and only slots from the outer side are used by the
parent.

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
9 files changed, 287 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/8

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 8
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 7:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11569/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 7
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Sat, 08 Oct 2022 12:06:12 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Jian Zhang (Code Review)" <ge...@cloudera.org>.
Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins, Xiang Yang, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18705

to look at the new patch set (#9).

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and only slots from the outer side are used by the
parent.

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
9 files changed, 419 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/9

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 9
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 9:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/11586/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 9
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Mon, 10 Oct 2022 08:50:59 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9041/ DRY_RUN=true



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 12
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Thu, 09 Feb 2023 15:54:34 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................


Patch Set 11: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 11
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>
Gerrit-Comment-Date: Wed, 08 Feb 2023 16:15:08 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Xiang Yang (Code Review)" <ge...@cloudera.org>.
Xiang Yang has uploaded a new patch set (#10) to the change originally created by Jian Zhang. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and the parent uses only slots from the outer side.
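
The uniqueness requirement matters. The sketch below is a hypothetical
counter-example, again using in-memory SQLite as a stand-in for Impala:
the table s_dup and its rows are invented, and it has no primary key, so
one row of t can match several inner rows and the join result differs
from a plain scan of t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t(sid bigint, value bigint);
    -- Hypothetical inner table WITHOUT a primary key: id may repeat.
    create table s_dup(id bigint, value bigint);
    insert into t values (1, 10), (2, 20);
    insert into s_dup values (1, 100), (1, 101);
""")

joined_count = conn.execute(
    "select count(*) from t left join s_dup on t.sid = s_dup.id"
).fetchone()[0]
scanned_count = conn.execute("select count(*) from t").fetchone()[0]

# The row with sid = 1 matches two inner rows, so the join emits it twice.
print(joined_count, scanned_count)
```

Here the join yields more rows than the scan, so dropping the inner side
would change the result.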

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
A testdata/workloads/tpcds/queries/outer-join-elimination.test
M tests/query_test/test_join_queries.py
11 files changed, 539 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/10

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 10
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>

[Impala-ASF-CR] IMPALA-11417: Support outer join elimination optimization

Posted by "Xiang Yang (Code Review)" <ge...@cloudera.org>.
Xiang Yang has uploaded a new patch set (#11) to the change originally created by Jian Zhang. ( http://gerrit.cloudera.org:8080/18705 )

Change subject: IMPALA-11417: Support outer join elimination optimization
......................................................................

IMPALA-11417: Support outer join elimination optimization

When two tables are outer joined but only fields from the outer side
table are used and the join key of the inner side table is guaranteed to
be unique, the query can be simplified to only scan the outer table:

    drop table if exists t;
    drop table if exists s;
    create table t(sid bigint, value bigint);
    create table s(id bigint, value bigint, primary key(id));

    -- the test SQL:
    select t.* from t left join s on t.sid = s.id;

The above query can be simplified to:

    select t.* from t;

This optimization uses the primary key constraint when creating join
nodes: it eliminates the inner side when the join key on the inner side
is the primary key and the parent uses only slots from the outer side.
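
The two conditions above can be modelled as a small predicate. This is
only a sketch of the rule as stated in this commit message, not the code
in SingleNodePlanner.java; the LeftJoin class, its field names, and
can_eliminate_inner are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LeftJoin:
    """Hypothetical, simplified description of a left outer join."""
    inner_join_keys: frozenset       # inner-side columns in the equi-join condition
    inner_primary_key: frozenset     # declared primary key columns of the inner table
    inner_columns_used: frozenset    # inner-side columns referenced by the parent


def can_eliminate_inner(join: LeftJoin) -> bool:
    # Condition 1: the inner-side join key is exactly the primary key,
    # so each outer row matches at most one inner row.
    key_is_primary_key = (
        bool(join.inner_primary_key)
        and join.inner_join_keys == join.inner_primary_key
    )
    # Condition 2: the parent reads nothing from the inner side.
    only_outer_used = not join.inner_columns_used
    return key_is_primary_key and only_outer_used


# select t.* from t left join s on t.sid = s.id, with primary key(id) on s
eligible = LeftJoin(frozenset({"id"}), frozenset({"id"}), frozenset())
# Same join, but the parent also selects s.value
uses_inner = LeftJoin(frozenset({"id"}), frozenset({"id"}), frozenset({"value"}))
# Join on a column that is not the primary key
not_unique = LeftJoin(frozenset({"value"}), frozenset({"id"}), frozenset())

print(can_eliminate_inner(eligible),
      can_eliminate_inner(uses_inner),
      can_eliminate_inner(not_unique))
```

Only the first case satisfies both conditions; the other two keep the
join.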

Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Signed-off-by: Jian Zhang <zj...@gmail.com>
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/outer-join-elimination.test
A testdata/workloads/tpcds/queries/outer-join-elimination.test
M tests/query_test/test_join_queries.py
11 files changed, 535 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/18705/11

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2e68263a029ac84a4f35b0846b22aa42d7ceece
Gerrit-Change-Number: 18705
Gerrit-PatchSet: 11
Gerrit-Owner: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <cs...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zj...@gmail.com>
Gerrit-Reviewer: Kurt Deschler <kd...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>
Gerrit-Reviewer: Xiang Yang <yx...@126.com>