Posted to dev@hive.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2017/06/27 21:22:00 UTC
[jira] [Created] (HIVE-16976) DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
Gopal V created HIVE-16976:
------------------------------
Summary: DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
Key: HIVE-16976
URL: https://issues.apache.org/jira/browse/HIVE-16976
Project: Hive
Issue Type: Improvement
Components: Tez
Affects Versions: 2.1.1, 3.0.0
Reporter: Gopal V
Tez dynamic partition pruning (DPP) does not kick in when a query filters with a comparison predicate (<, > or BETWEEN) against a scalar subquery instead of using a JOIN/IN clause.
{code}
explain select count(1) from store_sales where ss_sold_date_sk > (select max(d_Date_sk) from date_dim where d_year = 2017);
Warning: Map Join MAPJOIN[21][bigTable=?] in task 'Map 1' is a cross product
OK
Plan optimized by CBO.
Vertex dependency in root stage
Map 1 <- Reducer 4 (BROADCAST_EDGE)
Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE)
Stage-0
Fetch Operator
limit:-1
Stage-1
Reducer 2 vectorized, llap
File Output Operator [FS_36]
Group By Operator [GBY_35] (rows=1 width=8)
Output:["_col0"],aggregations:["count(VALUE._col0)"]
<-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
PARTITION_ONLY_SHUFFLE [RS_34]
Group By Operator [GBY_33] (rows=1 width=8)
Output:["_col0"],aggregations:["count(1)"]
Select Operator [SEL_32] (rows=9600142089 width=16)
Filter Operator [FIL_31] (rows=9600142089 width=16)
predicate:(_col0 > _col1)
Map Join Operator [MAPJOIN_30] (rows=28800426268 width=16)
Conds:(Inner),Output:["_col0","_col1"]
<-Reducer 4 [BROADCAST_EDGE] vectorized, llap
BROADCAST [RS_28]
Group By Operator [GBY_27] (rows=1 width=8)
Output:["_col0"],aggregations:["max(VALUE._col0)"]
<-Map 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
PARTITION_ONLY_SHUFFLE [RS_26]
Group By Operator [GBY_25] (rows=1 width=8)
Output:["_col0"],aggregations:["max(d_date_sk)"]
Select Operator [SEL_24] (rows=652 width=12)
Output:["d_date_sk"]
Filter Operator [FIL_23] (rows=652 width=12)
predicate:(d_year = 2017)
TableScan [TS_2] (rows=73049 width=12)
tpcds_bin_partitioned_newschema_orc_10000@date_dim,date_dim,Tbl:COMPLETE,Col:COMPLETE,Output:["d_date_sk","d_year"]
<-Select Operator [SEL_29] (rows=28800426268 width=8)
Output:["_col0"]
TableScan [TS_0] (rows=28800426268 width=172)
tpcds_bin_partitioned_newschema_orc_10000@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE
{code}
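The SyntheticJoinPredicate is only injected for equi-joins, not for < or > comparisons against scalar subqueries, so TS_0 in the plan above scans store_sales (rows=28800426268) with no pruning and the comparison is only applied afterwards in FIL_31.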
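
To make the intended benefit concrete, here is a minimal sketch in Python of what range-based pruning could do once the scalar subquery result is known. This is not Hive code: the function names and the sample partition keys are made up for illustration, and it assumes store_sales is partitioned on ss_sold_date_sk.

```python
import operator

# Hypothetical lookup table, not part of Hive: comparison symbol -> Python function.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}


def prune_partitions(partition_keys, op, bound):
    """Keep only the partition keys that satisfy `key op bound`."""
    compare = OPS[op]
    return [key for key in partition_keys if compare(key, bound)]


def prune_between(partition_keys, low, high):
    """Keep only the partition keys in the inclusive range [low, high]."""
    return [key for key in partition_keys if low <= key <= high]


# Made-up partition keys standing in for ss_sold_date_sk values.
partitions = list(range(100, 110))

# The bound plays the role of the broadcast max(d_date_sk) from Reducer 4.
print(prune_partitions(partitions, ">", 106))
print(prune_between(partitions, 102, 104))
```

With the equality-only behaviour described above, all ten sample partitions would be read; with a range-aware predicate only the partitions that pass the comparison would need to be scanned.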