Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/05 20:02:00 UTC
[jira] [Work logged] (HIVE-26524) Use Calcite to remove sections of a query plan known never produces rows

     [ https://issues.apache.org/jira/browse/HIVE-26524?focusedWorklogId=814005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-814005 ]

ASF GitHub Bot logged work on HIVE-26524:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Oct/22 20:01
            Start Date: 05/Oct/22 20:01
    Worklog Time Spent: 10m 
      Work Description: kasakrisz commented on code in PR #3588:
URL: https://github.com/apache/hive/pull/3588#discussion_r985560069


##########
ql/src/test/results/clientpositive/llap/float_equality.q.out:
##########
@@ -9,9 +9,7 @@ POSTHOOK: Input: _dummy_database@_dummy_table
 1
 PREHOOK: query: select 1 where -0.0<0.0
 PREHOOK: type: QUERY
-PREHOOK: Input: _dummy_database@_dummy_table

Review Comment:
   This was the original plan: the empty-result query is converted to a sub-query, hence the limit is not treated as a global limit and the query is executed.
   ```
   HiveProject(_o__c0=[1])
     HiveSortLimit(fetch=[0])
       HiveProject(DUMMY=[0])
         HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
   ```
   There is a TS on `_dummy_table`:
   ```
   STAGE PLANS:
     Stage: Stage-0
       Fetch Operator
         limit: -1
         Processor Tree:
           TableScan
             alias: _dummy_table
             Row Limit Per Split: 1
             Limit
               Number of rows: 0
               Select Operator
                 Select Operator
                   expressions: 1 (type: int)
                   outputColumnNames: _col0
                   ListSink
   ```
   The new plan is just an empty values operator:
   ```
   HiveValues(tuples=[[]])
   ```
   There is no TS:
   ```
   STAGE PLANS:
     Stage: Stage-0
       Fetch Operator
         limit: 0
         Processor Tree:
           ListSink
   ```
   



##########
ql/src/test/results/clientpositive/llap/fold_case.q.out:
##########
@@ -177,7 +180,7 @@ STAGE PLANS:
                           sort order: 
                           Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
                           value expressions: _col0 (type: bigint)
-            Execution mode: vectorized, llap
+            Execution mode: llap

Review Comment:
   Vectorization was lost because, instead of a real table, the `_dummy_table` with an empty schema is scanned. This affects only this mapper; the rest of the plan (Reducer) is still vectorized.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 814005)
    Time Spent: 5.5h  (was: 5h 20m)

> Use Calcite to remove sections of a query plan known never produces rows
> ------------------------------------------------------------------------
>
>                 Key: HIVE-26524
>                 URL: https://issues.apache.org/jira/browse/HIVE-26524
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> Calcite has a set of rules to remove sections of a query plan that are known never to produce any rows. In some cases the whole plan can be removed. Such plans are represented by a single {{Values}} operator with no tuples, e.g.:
> {code:java}
> select y + 1 from (select a1 y, b1 z from t1 where b1 > 10) q WHERE 1=0
> {code}
> {code:java}
> HiveValues(tuples=[[]])
> {code}
> In other cases, when the plan has an outer join or set operators, some branches can be replaced with empty values. Moving forward, in some cases the join/set operator itself can be removed:
> {code:java}
> select a2, b2 from t2 where 1=0
> union
> select a1, b1 from t1
> {code}
> {code:java}
> HiveAggregate(group=[{0, 1}])
>   HiveTableScan(table=[[default, t1]], table:alias=[t1])
> {code}
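
The pruning idea in the description can be sketched with a toy plan tree. This is only an illustration under stated assumptions: the classes `PruneEmptyDemo`, `Plan`, `EmptyValues`, `Scan`, `Unary` and `Union` are hypothetical and are not Hive or Calcite types, and the real rules work on Calcite `RelNode` trees. The sketch assumes that predicates such as `1=0` have already been folded to a constant, and it models `UNION` (distinct) with a single surviving branch as a `Distinct` node, which plays the role of the `HiveAggregate` in the plan above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these classes are hypothetical, not Hive or Calcite types.
final class PruneEmptyDemo {

    interface Plan {
        // Returns an equivalent plan with provably empty sections removed.
        Plan prune();
    }

    // A relation with no tuples, comparable to HiveValues(tuples=[[]]).
    static final class EmptyValues implements Plan {
        public Plan prune() { return this; }
        @Override public String toString() { return "Values(tuples=[[]])"; }
    }

    static final class Scan implements Plan {
        final String table;
        Scan(String table) { this.table = table; }
        public Plan prune() { return this; }
        @Override public String toString() { return "Scan(" + table + ")"; }
    }

    // Single-input operators that produce no rows when their input produces none:
    // Project, Filter, Limit, and Distinct (an aggregate with group keys).
    static final class Unary implements Plan {
        final String name;
        final Plan input;
        final boolean neverEmits; // true for WHERE 1=0 or LIMIT 0

        Unary(String name, Plan input, boolean neverEmits) {
            this.name = name;
            this.input = input;
            this.neverEmits = neverEmits;
        }

        public Plan prune() {
            Plan in = input.prune();
            if (neverEmits || in instanceof EmptyValues) {
                return new EmptyValues();
            }
            return new Unary(name, in, false);
        }

        @Override public String toString() { return name + "(" + input + ")"; }
    }

    // UNION with duplicate elimination.
    static final class Union implements Plan {
        final List<Plan> inputs;
        Union(List<Plan> inputs) { this.inputs = inputs; }

        public Plan prune() {
            List<Plan> kept = new ArrayList<>();
            for (Plan p : inputs) {
                Plan pruned = p.prune();
                if (!(pruned instanceof EmptyValues)) {
                    kept.add(pruned); // empty branches are dropped
                }
            }
            if (kept.isEmpty()) {
                return new EmptyValues();
            }
            if (kept.size() == 1) {
                // The set operator goes away, but rows must still be deduplicated.
                return new Unary("Distinct", kept.get(0), false);
            }
            return new Union(kept);
        }

        @Override public String toString() { return "Union" + inputs; }
    }

    public static void main(String[] args) {
        // select a2, b2 from t2 where 1=0 union select a1, b1 from t1
        Plan emptyBranch = new Unary("Filter", new Scan("t2"), true);
        Plan union = new Union(Arrays.asList(emptyBranch, new Scan("t1")));
        System.out.println(union.prune()); // prints Distinct(Scan(t1))
    }
}
```

The same `prune()` call collapses the first example (`... WHERE 1=0` on top of a sub-query) and a `LIMIT 0` plan to a single empty `Values` node, because emptiness propagates upward through every single-input operator in this model.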



--
This message was sent by Atlassian Jira
(v8.20.10#820010)