Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/05 20:02:00 UTC
[jira] [Work logged] (HIVE-26524) Use Calcite to remove sections of a query plan known never produces rows
[ https://issues.apache.org/jira/browse/HIVE-26524?focusedWorklogId=814005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-814005 ]
ASF GitHub Bot logged work on HIVE-26524:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 05/Oct/22 20:01
Start Date: 05/Oct/22 20:01
Worklog Time Spent: 10m
Work Description: kasakrisz commented on code in PR #3588:
URL: https://github.com/apache/hive/pull/3588#discussion_r985560069
##########
ql/src/test/results/clientpositive/llap/float_equality.q.out:
##########
@@ -9,9 +9,7 @@ POSTHOOK: Input: _dummy_database@_dummy_table
1
PREHOOK: query: select 1 where -0.0<0.0
PREHOOK: type: QUERY
-PREHOOK: Input: _dummy_database@_dummy_table
Review Comment:
This was the original plan: the empty-result query is converted to a sub-query, hence the limit is not treated as a global limit and the query is executed.
```
HiveProject(_o__c0=[1])
  HiveSortLimit(fetch=[0])
    HiveProject(DUMMY=[0])
      HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
```
There is a TS on `_dummy_table`:
```
STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: _dummy_table
          Row Limit Per Split: 1
          Limit
            Number of rows: 0
            Select Operator
              Select Operator
                expressions: 1 (type: int)
                outputColumnNames: _col0
                ListSink
```
The new plan is just an empty values op.
```
HiveValues(tuples=[[]])
```
There is no TS:
```
STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: 0
      Processor Tree:
        ListSink
```
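The rewrite from the original plan to the single empty values op can be illustrated with a toy model. This is only a sketch of the general idea, not Calcite's or Hive's rule code: the classes `Scan`, `Project`, `SortLimit`, `Values` and the function `prune` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Project:
    exprs: Tuple[str, ...]
    input: object


@dataclass(frozen=True)
class SortLimit:
    fetch: Optional[int]  # None means "no limit"
    input: object


@dataclass(frozen=True)
class Values:
    tuples: Tuple = ()  # no tuples: the operator never produces rows


def is_empty(node) -> bool:
    return isinstance(node, Values) and not node.tuples


def prune(node):
    """Bottom-up rewrite: a subtree that cannot produce rows becomes empty Values."""
    if isinstance(node, SortLimit):
        child = prune(node.input)
        # A limit of 0 never emits a row, whatever its input is.
        if node.fetch == 0 or is_empty(child):
            return Values()
        return SortLimit(node.fetch, child)
    if isinstance(node, Project):
        child = prune(node.input)
        # Projecting over nothing still yields nothing.
        if is_empty(child):
            return Values()
        return Project(node.exprs, child)
    return node  # scans and values are left untouched


# Same shape as the original plan above (hypothetical encoding).
original = Project(("1",), SortLimit(0, Project(("0",), Scan("_dummy_table"))))
pruned = prune(original)
```

In this model `pruned` is a single empty `Values` node, so no scan remains in the tree, which mirrors why the new plan has no TS.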
##########
ql/src/test/results/clientpositive/llap/fold_case.q.out:
##########
@@ -177,7 +180,7 @@ STAGE PLANS:
sort order:
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col0 (type: bigint)
- Execution mode: vectorized, llap
+ Execution mode: llap
Review Comment:
Vectorization was lost because, instead of a real table, the `_dummy_table` with an empty schema is scanned. This affects only this mapper; the rest of the plan (the Reducer) is still vectorized.
Issue Time Tracking
-------------------
Worklog Id: (was: 814005)
Time Spent: 5.5h (was: 5h 20m)
> Use Calcite to remove sections of a query plan known never produces rows
> ------------------------------------------------------------------------
>
> Key: HIVE-26524
> URL: https://issues.apache.org/jira/browse/HIVE-26524
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5.5h
> Remaining Estimate: 0h
>
> Calcite has a set of rules to remove sections of a query plan that are known never to produce any rows. In some cases the whole plan can be removed. Such plans are represented with a single {{Values}} operator with no tuples, e.g.:
> {code:java}
> select y + 1 from (select a1 y, b1 z from t1 where b1 > 10) q WHERE 1=0
> {code}
> {code:java}
> HiveValues(tuples=[[]])
> {code}
> In other cases, when the plan has an outer join or set operators, some branches can be replaced with empty values. Going further, in some cases the join/set operator itself can be removed:
> {code:java}
> select a2, b2 from t2 where 1=0
> union
> select a1, b1 from t1
> {code}
> {code:java}
> HiveAggregate(group=[{0, 1}])
>   HiveTableScan(table=[[default, t1]], table:alias=[t1])
> {code}
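
The union case in the quoted description can be sketched with a small toy model. It is an illustration under stated assumptions, not Calcite's or Hive's implementation: all class names and `prune` are hypothetical, and the predicate `1=0` is assumed to have already been folded to the literal `"false"`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Filter:
    condition: str  # only the literal "false" is recognised in this toy model
    input: object


@dataclass(frozen=True)
class Union:
    inputs: Tuple[object, ...]  # UNION with distinct semantics


@dataclass(frozen=True)
class Aggregate:
    input: object  # group by all columns, i.e. de-duplicate rows


@dataclass(frozen=True)
class Values:
    tuples: Tuple = ()  # no tuples: never produces rows


def prune(node):
    """Drop branches that cannot produce rows; simplify the set operator if possible."""
    if isinstance(node, Filter):
        child = prune(node.input)
        if node.condition == "false" or child == Values():
            return Values()
        return Filter(node.condition, child)
    if isinstance(node, Union):
        live = tuple(c for c in map(prune, node.inputs) if c != Values())
        if not live:
            return Values()  # every branch is empty
        if len(live) == 1:
            # The union itself is gone, but distinct semantics must be kept.
            return Aggregate(live[0])
        return Union(live)
    return node


# select a2, b2 from t2 where 1=0 union select a1, b1 from t1
plan = Union((Filter("false", Scan("t2")), Scan("t1")))
result = prune(plan)
```

Here `result` is an `Aggregate` over the scan of `t1`, the same shape as the plan shown in the description: the empty branch and the union are removed, while the aggregate preserves the de-duplication that `union` implies.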
--
This message was sent by Atlassian Jira
(v8.20.10#820010)