Posted to issues@hive.apache.org by "Jesus Camacho Rodriguez (JIRA)" <ji...@apache.org> on 2017/06/13 19:43:00 UTC
[jira] [Commented] (HIVE-16885) Non-equi Joins: Filter clauses should be pushed into the ON clause
[ https://issues.apache.org/jira/browse/HIVE-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048301#comment-16048301 ]
Jesus Camacho Rodriguez commented on HIVE-16885:
------------------------------------------------
Initial patch to trigger QA.
> Non-equi Joins: Filter clauses should be pushed into the ON clause
> ------------------------------------------------------------------
>
> Key: HIVE-16885
> URL: https://issues.apache.org/jira/browse/HIVE-16885
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Affects Versions: 3.0.0
> Reporter: Gopal V
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-16885.patch
>
>
> FIL_24 -> MAPJOIN_23
> {code}
> hive> explain select * from part where p_size > (select max(p_size) from part group by p_type);
> Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 3 (BROADCAST_EDGE)
> Reducer 3 <- Map 2 (SIMPLE_EDGE)
> Stage-0
> Fetch Operator
> limit:-1
> Stage-1
> Map 1 vectorized, llap
> File Output Operator [FS_26]
> Select Operator [SEL_25] (rows=11000000000 width=621)
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
> Filter Operator [FIL_24] (rows=11000000000 width=625)
> predicate:(_col5 > _col9)
> Map Join Operator [MAPJOIN_23] (rows=33000000000 width=625)
> Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9"]
> <-Reducer 3 [BROADCAST_EDGE] vectorized, llap
> BROADCAST [RS_21]
> Select Operator [SEL_20] (rows=165 width=4)
> Output:["_col0"]
> Group By Operator [GBY_19] (rows=165 width=109)
> Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0
> <-Map 2 [SIMPLE_EDGE] vectorized, llap
> SHUFFLE [RS_18]
> PartitionCols:_col0
> Group By Operator [GBY_17] (rows=14190 width=109)
> Output:["_col0","_col1"],aggregations:["max(p_size)"],keys:p_type
> Select Operator [SEL_16] (rows=200000000 width=109)
> Output:["p_type","p_size"]
> TableScan [TS_2] (rows=200000000 width=109)
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"]
> <-Select Operator [SEL_22] (rows=200000000 width=621)
> Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
> TableScan [TS_0] (rows=200000000 width=621)
> tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_partkey","p_name","p_mfgr","p_brand","p_type","p_size","p_container","p_retailprice","p_comment"]
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)