Posted to reviews@impala.apache.org by "Yida Wu (Code Review)" <ge...@cloudera.org> on 2023/05/03 21:09:01 UTC
[Impala-ASF-CR] IMPALA-10861: Optimize the plan for identical predicates
Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/19511 )
Change subject: IMPALA-10861: Optimize the plan for identical predicates
......................................................................
Patch Set 3:
(2 comments)
Sorry for the late feedback. Looks good to me, just one or two questions.
http://gerrit.cloudera.org:8080/#/c/19511/3/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:
http://gerrit.cloudera.org:8080/#/c/19511/3/fe/src/main/java/org/apache/impala/analysis/Expr.java@1265
PS3, Line 1265: for (C expr: origList) {
The time complexity can be O(n^2) in the worst case because, if my understanding is correct, every conjunct would need to call removeDuplicates(). Do you think it is necessary to optimize it?
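For illustration only, one common way to avoid a quadratic scan is to track already-seen elements in a hash-based set, which makes deduplication roughly linear in the number of elements. The sketch below is not Impala's code: the class and method names are hypothetical, and it assumes the elements have consistent equals() and hashCode() implementations.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch, not part of Impala's Expr class.
class DedupSketch {
  // Returns a new list with duplicates dropped, keeping the first
  // occurrence of each element in its original position.
  // Assumes T has consistent equals() and hashCode().
  static <T> List<T> dedupLinear(List<T> items) {
    // LinkedHashSet rejects repeats in expected O(1) time per element
    // and preserves insertion order.
    return new ArrayList<>(new LinkedHashSet<>(items));
  }

  public static void main(String[] args) {
    List<String> conjuncts = List.of("a = b", "c < d", "a = b");
    System.out.println(dedupLinear(conjuncts)); // prints [a = b, c < d]
  }
}
```

Whether this is worthwhile depends on how many conjuncts a typical query carries; for short lists the quadratic scan may be cheap enough in practice.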
http://gerrit.cloudera.org:8080/#/c/19511/3/testdata/workloads/functional-planner/queries/PlannerTest/joins.test
File testdata/workloads/functional-planner/queries/PlannerTest/joins.test:
http://gerrit.cloudera.org:8080/#/c/19511/3/testdata/workloads/functional-planner/queries/PlannerTest/joins.test@3122
PS3, Line 3122: ON c.c_custkey = l.l_orderkey and c.c_custkey = l.l_orderkey
I tried the query below. The duplicates in "other predicates" should be removed, but the plan still lists both predicates.
Query: explain SELECT c_custkey
from tpch.customer c
left outer join tpch.lineitem l
ON c.c_custkey = l.l_orderkey
where l.l_discount > c.c_acctbal and c.c_acctbal < l.l_discount
+-----------------------------------------------------------------------------+
| Explain String |
+-----------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=30.50MB Threads=6 |
| Per-Host Resource Estimates: Memory=819MB |
| |
| PLAN-ROOT SINK |
| | |
| 05:EXCHANGE [UNPARTITIONED] |
| | |
| 02:HASH JOIN [RIGHT OUTER JOIN, PARTITIONED] |
| | hash predicates: l.l_orderkey = c.c_custkey |
| | other predicates: c.c_acctbal < l.l_discount, l.l_discount > c.c_acctbal |
| | runtime filters: RF000 <- c.c_custkey |
| | row-size=32B cardinality=575.77K |
| | |
| |--04:EXCHANGE [HASH(c.c_custkey)] |
| | | |
| | 00:SCAN HDFS [tpch.customer c] |
| | HDFS partitions=1/1 files=1 size=23.08MB |
| | row-size=16B cardinality=150.00K |
| | |
| 03:EXCHANGE [HASH(l.l_orderkey)] |
| | |
| 01:SCAN HDFS [tpch.lineitem l] |
| HDFS partitions=1/1 files=1 size=718.94MB |
| runtime filters: RF000 -> l.l_orderkey |
| row-size=16B cardinality=6.00M |
+-----------------------------------------------------------------------------+
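The two "other predicates" in the plan above are equivalent but not textually identical: one is the mirror image of the other, with the operands swapped and the operator flipped. One possible way to catch such pairs, sketched here purely as an assumption and not as what the patch does, is to normalize each comparison to a canonical operand order before checking for duplicates. The Cmp class and all method names below are hypothetical stand-ins and are not Impala's BinaryPredicate API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; Cmp is a stand-in, not Impala's BinaryPredicate.
class CanonicalPredicateSketch {
  static final class Cmp {
    final String left, op, right;
    Cmp(String left, String op, String right) {
      this.left = left; this.op = op; this.right = right;
    }
    @Override public String toString() { return left + " " + op + " " + right; }
  }

  // Operator to use after swapping operands, or null if swapping is not known to be safe.
  static String flip(String op) {
    switch (op) {
      case "<": return ">";
      case ">": return "<";
      case "<=": return ">=";
      case ">=": return "<=";
      case "=": return "=";
      case "!=": return "!=";
      default: return null;
    }
  }

  // Key that is identical for "a > b" and "b < a": the lexicographically
  // smaller operand is placed on the left.
  static String canonicalKey(Cmp p) {
    String flipped = flip(p.op);
    if (flipped == null || p.left.compareTo(p.right) <= 0) return p.toString();
    return new Cmp(p.right, flipped, p.left).toString();
  }

  // Keeps the first predicate seen for each canonical key.
  static List<Cmp> dedup(List<Cmp> preds) {
    Set<String> seen = new HashSet<>();
    List<Cmp> out = new ArrayList<>();
    for (Cmp p : preds) {
      if (seen.add(canonicalKey(p))) out.add(p);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Cmp> preds = List.of(
        new Cmp("c.c_acctbal", "<", "l.l_discount"),
        new Cmp("l.l_discount", ">", "c.c_acctbal"));
    System.out.println(dedup(preds)); // prints [c.c_acctbal < l.l_discount]
  }
}
```

Comparing operands as strings is a simplification for the sketch; a real planner would compare expression trees rather than their text.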
--
To view, visit http://gerrit.cloudera.org:8080/19511
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia249c8146215fad602e9310bf922c6bfa050b96b
Gerrit-Change-Number: 19511
Gerrit-PatchSet: 3
Gerrit-Owner: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Baike Xia <xi...@163.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wy...@gmail.com>
Gerrit-Comment-Date: Wed, 03 May 2023 21:09:01 +0000
Gerrit-HasComments: Yes