Posted to issues@impala.apache.org by "Quanlong Huang (JIRA)" <ji...@apache.org> on 2019/04/16 13:07:00 UTC
[jira] [Created] (IMPALA-8423) Add rule to remove useless SELECT node
Quanlong Huang created IMPALA-8423:
--------------------------------------
Summary: Add rule to remove useless SELECT node
Key: IMPALA-8423
URL: https://issues.apache.org/jira/browse/IMPALA-8423
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Quanlong Huang
We can add some rules to optimize the plan after we have chosen the cheapest plan based on cost. For example, one useful rule would be "remove useless SELECT nodes".
Impala generates a useless SELECT node for the following query:
{code:sql}
SELECT t.id, t.int_col
FROM functional.alltypestiny t
LEFT JOIN
(SELECT id, int_col
FROM functional.alltypestiny) t2
ON (t.id = t2.id)
WHERE t.int_col = t.id
UNION ALL
VALUES (NULL, NULL){code}
Its single-node plan is:
{code:java}
PLAN-ROOT SINK
|
00:UNION
|  constant-operands=1
|  row-size=8B cardinality=1
|
04:SELECT
|  predicates: t.id = t.int_col
|  row-size=12B cardinality=0
|
03:HASH JOIN [RIGHT OUTER JOIN]
|  hash predicates: id = t.id
|  runtime filters: RF000 <- t.id
|  row-size=12B cardinality=1
|
|--01:SCAN HDFS [functional.alltypestiny t]
|     HDFS partitions=4/4 files=4 size=460B
|     predicates: t.int_col = t.id
|     row-size=8B cardinality=1
|
02:SCAN HDFS [functional.alltypestiny]
   HDFS partitions=4/4 files=4 size=460B
   runtime filters: RF000 -> id
   row-size=4B cardinality=8{code}
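The SELECT node (id=04) is useless: its only predicate, "t.id = t.int_col", is already enforced in the SCAN node (id=01), which is the right-hand side of the RIGHT OUTER JOIN. A RIGHT OUTER JOIN preserves every row from its right-hand side and never null-extends those columns, so the predicate still holds on every row the join produces. The SELECT node won't filter out any more rows.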
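To make the idea concrete, here is a minimal sketch of what such a rule could look like. It does not use Impala's actual planner classes: Node, Kind, JoinOp, enforcedBelow() and removeUselessSelects() are all hypothetical names invented for illustration. It also assumes that predicates can be compared as text once simple equalities are normalized, and that qualified column names identify slots uniquely. A real rule would compare bound expressions instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UselessSelectRule {
    public enum Kind { SCAN, SELECT, JOIN, UNION }
    public enum JoinOp { NONE, INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /** Hypothetical, simplified plan node; not Impala's PlanNode. */
    public static final class Node {
        public final int id;
        public final Kind kind;
        public final JoinOp joinOp;
        public final List<String> predicates;
        public final List<Node> children;

        public Node(int id, Kind kind, JoinOp joinOp, List<String> predicates, Node... children) {
            this.id = id;
            this.kind = kind;
            this.joinOp = joinOp;
            this.predicates = predicates;
            this.children = new ArrayList<>(Arrays.asList(children));
        }
    }

    private static final Pattern COLUMN_EQ = Pattern.compile("\\s*([\\w.]+)\\s*=\\s*([\\w.]+)\\s*");

    /** Orders the operands of "a = b" so that "b = a" compares equal (assumption: text comparison suffices). */
    public static String normalize(String pred) {
        Matcher m = COLUMN_EQ.matcher(pred);
        if (!m.matches()) return pred.trim();
        String a = m.group(1), b = m.group(2);
        return a.compareTo(b) <= 0 ? a + " = " + b : b + " = " + a;
    }

    private static boolean has(Node node, String normalizedPred) {
        for (String p : node.predicates) {
            if (normalize(p).equals(normalizedPred)) return true;
        }
        return false;
    }

    /** True if every row produced by node is guaranteed to satisfy the predicate. */
    public static boolean enforcedBelow(Node node, String normalizedPred) {
        switch (node.kind) {
            case SCAN:
                return has(node, normalizedPred);
            case SELECT:
                return has(node, normalizedPred) || enforcedBelow(node.children.get(0), normalizedPred);
            case JOIN: {
                Node left = node.children.get(0), right = node.children.get(1);
                switch (node.joinOp) {
                    case INNER: return enforcedBelow(left, normalizedPred) || enforcedBelow(right, normalizedPred);
                    case LEFT_OUTER: return enforcedBelow(left, normalizedPred);   // only the preserved side
                    case RIGHT_OUTER: return enforcedBelow(right, normalizedPred); // only the preserved side
                    default: return false; // be conservative for FULL OUTER and anything else
                }
            }
            default:
                return false;
        }
    }

    /** Bottom-up rewrite: drops a SELECT node when all of its predicates already hold on its input. */
    public static Node removeUselessSelects(Node node) {
        for (int i = 0; i < node.children.size(); i++) {
            node.children.set(i, removeUselessSelects(node.children.get(i)));
        }
        if (node.kind != Kind.SELECT) return node;
        Node child = node.children.get(0);
        for (String p : node.predicates) {
            if (!enforcedBelow(child, normalize(p))) return node;
        }
        return child;
    }

    /** Builds the plan from this issue; children are ordered left input first, then right input. */
    public static Node examplePlan(JoinOp op) {
        Node scan02 = new Node(2, Kind.SCAN, JoinOp.NONE, Arrays.<String>asList());
        Node scan01 = new Node(1, Kind.SCAN, JoinOp.NONE, Arrays.asList("t.int_col = t.id"));
        Node join03 = new Node(3, Kind.JOIN, op, Arrays.<String>asList(), scan02, scan01);
        Node select04 = new Node(4, Kind.SELECT, JoinOp.NONE, Arrays.asList("t.id = t.int_col"), join03);
        return new Node(0, Kind.UNION, JoinOp.NONE, Arrays.<String>asList(), select04);
    }

    public static void main(String[] args) {
        Node root = removeUselessSelects(examplePlan(JoinOp.RIGHT_OUTER));
        System.out.println("child of UNION: " + root.children.get(0).id); // 3: the SELECT is gone
    }
}
```

The check is deliberately conservative. If the same scan predicate sat on the null-extended side of an outer join (for example, if the join above were a LEFT OUTER JOIN), the SELECT node would be kept, because null-extended rows would not satisfy the predicate.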