Posted to issues@impala.apache.org by "Quanlong Huang (JIRA)" <ji...@apache.org> on 2019/04/16 13:07:00 UTC

[jira] [Created] (IMPALA-8423) Add rule to remove useless SELECT node

Quanlong Huang created IMPALA-8423:
--------------------------------------

             Summary: Add rule to remove useless SELECT node
                 Key: IMPALA-8423
                 URL: https://issues.apache.org/jira/browse/IMPALA-8423
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Quanlong Huang


We can add some rules to optimize the plan after we choose the cheapest plan based on cost. For example, one useful rule would be "remove useless SELECT nodes".

Impala generates a useless SELECT node for the following query:
{code:sql}
SELECT t.id, t.int_col
FROM functional.alltypestiny t
LEFT JOIN
  (SELECT id, int_col
  FROM functional.alltypestiny) t2
ON (t.id = t2.id)
WHERE t.int_col = t.id
UNION ALL
VALUES (NULL, NULL){code}
Its single-node plan is:
{code:java}
PLAN-ROOT SINK
|
00:UNION
|  constant-operands=1
|  row-size=8B cardinality=1
|
04:SELECT
|  predicates: t.id = t.int_col
|  row-size=12B cardinality=0
|
03:HASH JOIN [RIGHT OUTER JOIN]
|  hash predicates: id = t.id
|  runtime filters: RF000 <- t.id
|  row-size=12B cardinality=1
|
|--01:SCAN HDFS [functional.alltypestiny t]
|     HDFS partitions=4/4 files=4 size=460B
|     predicates: t.int_col = t.id
|     row-size=8B cardinality=1
|
02:SCAN HDFS [functional.alltypestiny]
   HDFS partitions=4/4 files=4 size=460B
   runtime filters: RF000 -> id
   row-size=4B cardinality=8{code}
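The SELECT node (id=04) is useless: its only predicate, "t.id = t.int_col", is already enforced in the SCAN node (id=01), which is the right-hand side of the RIGHT OUTER JOIN. A right outer join never NULL-extends the columns of its right-hand side, so every row that reaches the SELECT node carries values of t that already passed the scan predicate. The SELECT node won't filter out any more rows.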
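
The rule described above can be sketched roughly as follows. This is a toy model in Python, not Impala's frontend code: the Node class, eq(), enforced(), remove_useless_selects() and issue_plan() are all hypothetical names introduced for illustration, and predicates are assumed to be deterministic equalities between two columns. The idea is to collect the predicates guaranteed to hold on every row a subtree produces, descending only into join inputs that cannot be NULL-extended, and to drop a SELECT node whose predicates are all in that set.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Set

Pred = FrozenSet[str]


def eq(a: str, b: str) -> Pred:
    """Order-insensitive equality predicate, so a = b matches b = a."""
    return frozenset((a, b))


@dataclass
class Node:
    """Hypothetical plan node; not an Impala class."""
    id: str
    kind: str  # "SCAN", "SELECT", "JOIN" or "UNION"
    preds: Set[Pred] = field(default_factory=set)
    children: List["Node"] = field(default_factory=list)
    join_op: str = ""  # "INNER", "LEFT OUTER", "RIGHT OUTER", "FULL OUTER"


def enforced(node: Node) -> Set[Pred]:
    """Predicates guaranteed to hold on every row that `node` produces."""
    if node.kind in ("SCAN", "SELECT"):
        out = set(node.preds)
        for child in node.children:
            out |= enforced(child)
        return out
    if node.kind == "JOIN":
        left, right = node.children
        # Only inputs whose columns are never NULL-extended keep their guarantees.
        preserved = {
            "INNER": [left, right],
            "LEFT OUTER": [left],
            "RIGHT OUTER": [right],
        }.get(node.join_op, [])  # FULL OUTER and unknown ops: nothing is kept
        out = set()
        for child in preserved:
            out |= enforced(child)
        return out
    return set()  # conservative default, e.g. for UNION


def remove_useless_selects(node: Node) -> Node:
    """Bottom-up rewrite that drops SELECT nodes with no filtering effect."""
    node.children = [remove_useless_selects(c) for c in node.children]
    if node.kind == "SELECT" and node.preds <= enforced(node.children[0]):
        return node.children[0]
    return node


def issue_plan(join_op: str = "RIGHT OUTER") -> Node:
    """Toy version of the plan above; node 01 is the right-hand join input."""
    scan01 = Node("01", "SCAN", {eq("t.int_col", "t.id")})
    scan02 = Node("02", "SCAN")
    join03 = Node("03", "JOIN", children=[scan02, scan01], join_op=join_op)
    select04 = Node("04", "SELECT", {eq("t.id", "t.int_col")}, [join03])
    return Node("00", "UNION", children=[select04])
```

Calling remove_useless_selects(issue_plan()) leaves the UNION node (id=00) reading directly from the join (id=03). With issue_plan("LEFT OUTER") the SELECT node stays, because t would then be on the nullable side of the join. The sketch is deliberately conservative: it ignores join predicates and treats any node kind it does not know as enforcing nothing.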


