Posted to commits@hive.apache.org by se...@apache.org on 2018/07/18 18:52:01 UTC
[05/48] hive git commit: HIVE-20090 : Extend creation of semijoin
reduction filters to be able to discover new opportunities (Jesus Camacho
Rodriguez via Deepak Jaiswal)
http://git-wip-us.apache.org/repos/asf/hive/blob/ab9e954d/ql/src/test/results/clientpositive/perf/tez/query94.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/tez/query94.q.out b/ql/src/test/results/clientpositive/perf/tez/query94.q.out
index 5d19a16..396be11 100644
--- a/ql/src/test/results/clientpositive/perf/tez/query94.q.out
+++ b/ql/src/test/results/clientpositive/perf/tez/query94.q.out
@@ -76,22 +76,22 @@ Stage-0
limit:-1
Stage-1
Reducer 9 vectorized
- File Output Operator [FS_174]
- Limit [LIM_173] (rows=1 width=344)
+ File Output Operator [FS_176]
+ Limit [LIM_175] (rows=1 width=344)
Number of rows:100
- Select Operator [SEL_172] (rows=1 width=344)
+ Select Operator [SEL_174] (rows=1 width=344)
Output:["_col0","_col1","_col2"]
<-Reducer 8 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_171]
- Select Operator [SEL_170] (rows=1 width=344)
+ SHUFFLE [RS_173]
+ Select Operator [SEL_172] (rows=1 width=344)
Output:["_col1","_col2","_col3"]
- Group By Operator [GBY_169] (rows=1 width=344)
+ Group By Operator [GBY_171] (rows=1 width=344)
Output:["_col0","_col1","_col2"],aggregations:["count(VALUE._col0)","sum(VALUE._col1)","sum(VALUE._col2)"]
<-Reducer 7 [CUSTOM_SIMPLE_EDGE] vectorized
- PARTITION_ONLY_SHUFFLE [RS_168]
- Group By Operator [GBY_167] (rows=1 width=344)
+ PARTITION_ONLY_SHUFFLE [RS_170]
+ Group By Operator [GBY_169] (rows=1 width=344)
Output:["_col0","_col1","_col2"],aggregations:["count(_col0)","sum(_col1)","sum(_col2)"]
- Group By Operator [GBY_166] (rows=115958879 width=135)
+ Group By Operator [GBY_168] (rows=115958879 width=135)
Output:["_col0","_col1","_col2"],aggregations:["sum(VALUE._col0)","sum(VALUE._col1)"],keys:KEY._col0
<-Reducer 6 [SIMPLE_EDGE]
SHUFFLE [RS_74]
@@ -102,21 +102,21 @@ Stage-0
Output:["_col4","_col5","_col6"]
Filter Operator [FIL_41] (rows=115958879 width=135)
predicate:_col14 is null
- Merge Join Operator [MERGEJOIN_128] (rows=231917759 width=135)
- Conds:RS_38._col4=RS_165._col0(Left Outer),Output:["_col4","_col5","_col6","_col14"]
+ Merge Join Operator [MERGEJOIN_130] (rows=231917759 width=135)
+ Conds:RS_38._col4=RS_167._col0(Left Outer),Output:["_col4","_col5","_col6","_col14"]
<-Reducer 18 [ONE_TO_ONE_EDGE] vectorized
- FORWARD [RS_165]
+ FORWARD [RS_167]
PartitionCols:_col0
- Select Operator [SEL_164] (rows=7199233 width=92)
+ Select Operator [SEL_166] (rows=7199233 width=92)
Output:["_col0","_col1"]
- Group By Operator [GBY_163] (rows=7199233 width=92)
+ Group By Operator [GBY_165] (rows=7199233 width=92)
Output:["_col0"],keys:KEY._col0
<-Map 17 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_162]
+ SHUFFLE [RS_164]
PartitionCols:_col0
- Group By Operator [GBY_161] (rows=14398467 width=92)
+ Group By Operator [GBY_163] (rows=14398467 width=92)
Output:["_col0"],keys:wr_order_number
- Filter Operator [FIL_160] (rows=14398467 width=92)
+ Filter Operator [FIL_162] (rows=14398467 width=92)
predicate:wr_order_number is not null
TableScan [TS_25] (rows=14398467 width=92)
default@web_returns,wr1,Tbl:COMPLETE,Col:NONE,Output:["wr_order_number"]
@@ -125,101 +125,101 @@ Stage-0
PartitionCols:_col4
Select Operator [SEL_37] (rows=210834322 width=135)
Output:["_col4","_col5","_col6"]
- Merge Join Operator [MERGEJOIN_127] (rows=210834322 width=135)
- Conds:RS_34._col4=RS_159._col0(Left Semi),Output:["_col3","_col4","_col5","_col6","_col14"],residual filter predicates:{(_col3 <> _col14)}
+ Merge Join Operator [MERGEJOIN_129] (rows=210834322 width=135)
+ Conds:RS_34._col4=RS_161._col0(Left Semi),Output:["_col3","_col4","_col5","_col6","_col14"],residual filter predicates:{(_col3 <> _col14)}
<-Map 16 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_159]
+ SHUFFLE [RS_161]
PartitionCols:_col0
- Group By Operator [GBY_158] (rows=144002668 width=135)
+ Group By Operator [GBY_160] (rows=144002668 width=135)
Output:["_col0","_col1"],keys:_col0, _col1
- Select Operator [SEL_157] (rows=144002668 width=135)
+ Select Operator [SEL_159] (rows=144002668 width=135)
Output:["_col0","_col1"]
- Filter Operator [FIL_156] (rows=144002668 width=135)
+ Filter Operator [FIL_158] (rows=144002668 width=135)
predicate:(ws_order_number is not null and ws_warehouse_sk is not null)
TableScan [TS_22] (rows=144002668 width=135)
default@web_sales,ws2,Tbl:COMPLETE,Col:NONE,Output:["ws_warehouse_sk","ws_order_number"]
<-Reducer 4 [SIMPLE_EDGE]
SHUFFLE [RS_34]
PartitionCols:_col4
- Merge Join Operator [MERGEJOIN_126] (rows=191667562 width=135)
- Conds:RS_18._col2=RS_147._col0(Inner),Output:["_col3","_col4","_col5","_col6"]
+ Merge Join Operator [MERGEJOIN_128] (rows=191667562 width=135)
+ Conds:RS_18._col2=RS_149._col0(Inner),Output:["_col3","_col4","_col5","_col6"]
<-Map 14 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_147]
+ SHUFFLE [RS_149]
PartitionCols:_col0
- Select Operator [SEL_146] (rows=42 width=1850)
+ Select Operator [SEL_148] (rows=42 width=1850)
Output:["_col0"]
- Filter Operator [FIL_145] (rows=42 width=1850)
+ Filter Operator [FIL_147] (rows=42 width=1850)
predicate:((web_company_name = 'pri') and web_site_sk is not null)
TableScan [TS_9] (rows=84 width=1850)
default@web_site,web_site,Tbl:COMPLETE,Col:NONE,Output:["web_site_sk","web_company_name"]
<-Reducer 3 [SIMPLE_EDGE]
SHUFFLE [RS_18]
PartitionCols:_col2
- Merge Join Operator [MERGEJOIN_125] (rows=174243235 width=135)
- Conds:RS_15._col1=RS_139._col0(Inner),Output:["_col2","_col3","_col4","_col5","_col6"]
+ Merge Join Operator [MERGEJOIN_127] (rows=174243235 width=135)
+ Conds:RS_15._col1=RS_141._col0(Inner),Output:["_col2","_col3","_col4","_col5","_col6"]
<-Map 12 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_139]
+ SHUFFLE [RS_141]
PartitionCols:_col0
- Select Operator [SEL_138] (rows=20000000 width=1014)
+ Select Operator [SEL_140] (rows=20000000 width=1014)
Output:["_col0"]
- Filter Operator [FIL_137] (rows=20000000 width=1014)
+ Filter Operator [FIL_139] (rows=20000000 width=1014)
predicate:((ca_state = 'TX') and ca_address_sk is not null)
TableScan [TS_6] (rows=40000000 width=1014)
default@customer_address,customer_address,Tbl:COMPLETE,Col:NONE,Output:["ca_address_sk","ca_state"]
<-Reducer 2 [SIMPLE_EDGE]
SHUFFLE [RS_15]
PartitionCols:_col1
- Merge Join Operator [MERGEJOIN_124] (rows=158402938 width=135)
- Conds:RS_155._col0=RS_131._col0(Inner),Output:["_col1","_col2","_col3","_col4","_col5","_col6"]
+ Merge Join Operator [MERGEJOIN_126] (rows=158402938 width=135)
+ Conds:RS_157._col0=RS_133._col0(Inner),Output:["_col1","_col2","_col3","_col4","_col5","_col6"]
<-Map 10 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_131]
+ SHUFFLE [RS_133]
PartitionCols:_col0
- Select Operator [SEL_130] (rows=8116 width=1119)
+ Select Operator [SEL_132] (rows=8116 width=1119)
Output:["_col0"]
- Filter Operator [FIL_129] (rows=8116 width=1119)
+ Filter Operator [FIL_131] (rows=8116 width=1119)
predicate:(CAST( d_date AS TIMESTAMP) BETWEEN TIMESTAMP'1999-05-01 00:00:00' AND TIMESTAMP'1999-06-30 00:00:00' and d_date_sk is not null)
TableScan [TS_3] (rows=73049 width=1119)
default@date_dim,date_dim,Tbl:COMPLETE,Col:NONE,Output:["d_date_sk","d_date"]
<-Map 1 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_155]
+ SHUFFLE [RS_157]
PartitionCols:_col0
- Select Operator [SEL_154] (rows=144002668 width=135)
+ Select Operator [SEL_156] (rows=144002668 width=135)
Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6"]
- Filter Operator [FIL_153] (rows=144002668 width=135)
+ Filter Operator [FIL_155] (rows=144002668 width=135)
predicate:((ws_ship_addr_sk BETWEEN DynamicValue(RS_16_customer_address_ca_address_sk_min) AND DynamicValue(RS_16_customer_address_ca_address_sk_max) and in_bloom_filter(ws_ship_addr_sk, DynamicValue(RS_16_customer_address_ca_address_sk_bloom_filter))) and (ws_ship_date_sk BETWEEN DynamicValue(RS_13_date_dim_d_date_sk_min) AND DynamicValue(RS_13_date_dim_d_date_sk_max) and in_bloom_filter(ws_ship_date_sk, DynamicValue(RS_13_date_dim_d_date_sk_bloom_filter))) and (ws_web_site_sk BETWEEN DynamicValue(RS_19_web_site_web_site_sk_min) AND DynamicValue(RS_19_web_site_web_site_sk_max) and in_bloom_filter(ws_web_site_sk, DynamicValue(RS_19_web_site_web_site_sk_bloom_filter))) and ws_order_number is not null and ws_ship_addr_sk is not null and ws_ship_date_sk is not null and ws_web_site_sk is not null)
TableScan [TS_0] (rows=144002668 width=135)
default@web_sales,ws1,Tbl:COMPLETE,Col:NONE,Output:["ws_ship_date_sk","ws_ship_addr_sk","ws_web_site_sk","ws_warehouse_sk","ws_order_number","ws_ext_ship_cost","ws_net_profit"]
<-Reducer 11 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_136]
- Group By Operator [GBY_135] (rows=1 width=12)
+ BROADCAST [RS_138]
+ Group By Operator [GBY_137] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=1000000)"]
<-Map 10 [CUSTOM_SIMPLE_EDGE] vectorized
- SHUFFLE [RS_134]
- Group By Operator [GBY_133] (rows=1 width=12)
+ SHUFFLE [RS_136]
+ Group By Operator [GBY_135] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=1000000)"]
- Select Operator [SEL_132] (rows=8116 width=1119)
+ Select Operator [SEL_134] (rows=8116 width=1119)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_130]
+ Please refer to the previous Select Operator [SEL_132]
<-Reducer 13 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_144]
- Group By Operator [GBY_143] (rows=1 width=12)
+ BROADCAST [RS_146]
+ Group By Operator [GBY_145] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=20000000)"]
<-Map 12 [CUSTOM_SIMPLE_EDGE] vectorized
- SHUFFLE [RS_142]
- Group By Operator [GBY_141] (rows=1 width=12)
+ SHUFFLE [RS_144]
+ Group By Operator [GBY_143] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=20000000)"]
- Select Operator [SEL_140] (rows=20000000 width=1014)
+ Select Operator [SEL_142] (rows=20000000 width=1014)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_138]
+ Please refer to the previous Select Operator [SEL_140]
<-Reducer 15 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_152]
- Group By Operator [GBY_151] (rows=1 width=12)
+ BROADCAST [RS_154]
+ Group By Operator [GBY_153] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=1000000)"]
<-Map 14 [CUSTOM_SIMPLE_EDGE] vectorized
- SHUFFLE [RS_150]
- Group By Operator [GBY_149] (rows=1 width=12)
+ SHUFFLE [RS_152]
+ Group By Operator [GBY_151] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=1000000)"]
- Select Operator [SEL_148] (rows=42 width=1850)
+ Select Operator [SEL_150] (rows=42 width=1850)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_146]
+ Please refer to the previous Select Operator [SEL_148]
http://git-wip-us.apache.org/repos/asf/hive/blob/ab9e954d/ql/src/test/results/clientpositive/perf/tez/query95.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/tez/query95.q.out b/ql/src/test/results/clientpositive/perf/tez/query95.q.out
index 400cc19..3a8ed09 100644
--- a/ql/src/test/results/clientpositive/perf/tez/query95.q.out
+++ b/ql/src/test/results/clientpositive/perf/tez/query95.q.out
@@ -63,22 +63,22 @@ POSTHOOK: type: QUERY
Plan optimized by CBO.
Vertex dependency in root stage
-Map 1 <- Reducer 10 (BROADCAST_EDGE), Reducer 12 (BROADCAST_EDGE), Reducer 14 (BROADCAST_EDGE)
-Map 19 <- Reducer 25 (BROADCAST_EDGE)
-Map 23 <- Reducer 25 (BROADCAST_EDGE)
+Map 1 <- Reducer 10 (BROADCAST_EDGE), Reducer 12 (BROADCAST_EDGE), Reducer 14 (BROADCAST_EDGE), Reducer 23 (BROADCAST_EDGE)
+Map 15 <- Reducer 23 (BROADCAST_EDGE)
+Map 21 <- Reducer 23 (BROADCAST_EDGE)
Reducer 10 <- Map 9 (CUSTOM_SIMPLE_EDGE)
Reducer 12 <- Map 11 (CUSTOM_SIMPLE_EDGE)
Reducer 14 <- Map 13 (CUSTOM_SIMPLE_EDGE)
-Reducer 16 <- Map 15 (SIMPLE_EDGE), Map 18 (SIMPLE_EDGE)
-Reducer 17 <- Reducer 16 (SIMPLE_EDGE)
+Reducer 16 <- Map 15 (SIMPLE_EDGE), Map 21 (SIMPLE_EDGE)
+Reducer 17 <- Map 22 (SIMPLE_EDGE), Reducer 16 (ONE_TO_ONE_EDGE)
+Reducer 18 <- Reducer 17 (SIMPLE_EDGE)
+Reducer 19 <- Map 15 (SIMPLE_EDGE), Map 21 (SIMPLE_EDGE)
Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 9 (SIMPLE_EDGE)
-Reducer 20 <- Map 19 (SIMPLE_EDGE), Map 23 (SIMPLE_EDGE)
-Reducer 21 <- Map 24 (SIMPLE_EDGE), Reducer 20 (ONE_TO_ONE_EDGE)
-Reducer 22 <- Reducer 21 (SIMPLE_EDGE)
-Reducer 25 <- Map 24 (CUSTOM_SIMPLE_EDGE)
+Reducer 20 <- Reducer 19 (SIMPLE_EDGE)
+Reducer 23 <- Map 22 (CUSTOM_SIMPLE_EDGE)
Reducer 3 <- Map 11 (SIMPLE_EDGE), Reducer 2 (SIMPLE_EDGE)
Reducer 4 <- Map 13 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
-Reducer 5 <- Reducer 17 (ONE_TO_ONE_EDGE), Reducer 22 (ONE_TO_ONE_EDGE), Reducer 4 (SIMPLE_EDGE)
+Reducer 5 <- Reducer 18 (ONE_TO_ONE_EDGE), Reducer 20 (ONE_TO_ONE_EDGE), Reducer 4 (SIMPLE_EDGE)
Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
Reducer 7 <- Reducer 6 (CUSTOM_SIMPLE_EDGE)
Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
@@ -88,208 +88,201 @@ Stage-0
limit:-1
Stage-1
Reducer 8 vectorized
- File Output Operator [FS_273]
- Limit [LIM_272] (rows=1 width=344)
+ File Output Operator [FS_286]
+ Limit [LIM_285] (rows=1 width=344)
Number of rows:100
- Select Operator [SEL_271] (rows=1 width=344)
+ Select Operator [SEL_284] (rows=1 width=344)
Output:["_col0","_col1","_col2"]
<-Reducer 7 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_270]
- Select Operator [SEL_269] (rows=1 width=344)
+ SHUFFLE [RS_283]
+ Select Operator [SEL_282] (rows=1 width=344)
Output:["_col1","_col2","_col3"]
- Group By Operator [GBY_268] (rows=1 width=344)
+ Group By Operator [GBY_281] (rows=1 width=344)
Output:["_col0","_col1","_col2"],aggregations:["count(VALUE._col0)","sum(VALUE._col1)","sum(VALUE._col2)"]
<-Reducer 6 [CUSTOM_SIMPLE_EDGE] vectorized
- PARTITION_ONLY_SHUFFLE [RS_267]
- Group By Operator [GBY_266] (rows=1 width=344)
+ PARTITION_ONLY_SHUFFLE [RS_280]
+ Group By Operator [GBY_279] (rows=1 width=344)
Output:["_col0","_col1","_col2"],aggregations:["count(_col0)","sum(_col1)","sum(_col2)"]
- Group By Operator [GBY_265] (rows=421668645 width=135)
+ Group By Operator [GBY_278] (rows=421668645 width=135)
Output:["_col0","_col1","_col2"],aggregations:["sum(VALUE._col0)","sum(VALUE._col1)"],keys:KEY._col0
<-Reducer 5 [SIMPLE_EDGE]
SHUFFLE [RS_116]
PartitionCols:_col0
Group By Operator [GBY_115] (rows=421668645 width=135)
Output:["_col0","_col2","_col3"],aggregations:["sum(_col4)","sum(_col5)"],keys:_col3
- Merge Join Operator [MERGEJOIN_212] (rows=421668645 width=135)
- Conds:RS_58._col3=RS_247._col0(Inner),RS_58._col3=RS_264._col0(Inner),Output:["_col3","_col4","_col5"]
- <-Reducer 17 [ONE_TO_ONE_EDGE] vectorized
- FORWARD [RS_247]
+ Merge Join Operator [MERGEJOIN_228] (rows=421668645 width=135)
+ Conds:RS_58._col3=RS_277._col0(Inner),RS_58._col3=RS_275._col0(Inner),Output:["_col3","_col4","_col5"]
+ <-Reducer 18 [ONE_TO_ONE_EDGE] vectorized
+ FORWARD [RS_275]
PartitionCols:_col0
- Group By Operator [GBY_246] (rows=79201469 width=135)
+ Group By Operator [GBY_274] (rows=87121617 width=135)
Output:["_col0"],keys:KEY._col0
- <-Reducer 16 [SIMPLE_EDGE]
- SHUFFLE [RS_24]
- PartitionCols:_col0
- Group By Operator [GBY_23] (rows=158402938 width=135)
- Output:["_col0"],keys:_col1
- Select Operator [SEL_22] (rows=158402938 width=135)
- Output:["_col1"]
- Filter Operator [FIL_21] (rows=158402938 width=135)
- predicate:(_col0 <> _col2)
- Merge Join Operator [MERGEJOIN_209] (rows=158402938 width=135)
- Conds:RS_242._col1=RS_245._col1(Inner),Output:["_col0","_col1","_col2"]
- <-Map 15 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_242]
- PartitionCols:_col1
- Select Operator [SEL_241] (rows=144002668 width=135)
- Output:["_col0","_col1"]
- Filter Operator [FIL_240] (rows=144002668 width=135)
- predicate:ws_order_number is not null
- TableScan [TS_12] (rows=144002668 width=135)
- default@web_sales,ws1,Tbl:COMPLETE,Col:NONE,Output:["ws_warehouse_sk","ws_order_number"]
- <-Map 18 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_245]
- PartitionCols:_col1
- Select Operator [SEL_244] (rows=144002668 width=135)
- Output:["_col0","_col1"]
- Filter Operator [FIL_243] (rows=144002668 width=135)
- predicate:ws_order_number is not null
- TableScan [TS_15] (rows=144002668 width=135)
- default@web_sales,ws2,Tbl:COMPLETE,Col:NONE,Output:["ws_warehouse_sk","ws_order_number"]
- <-Reducer 22 [ONE_TO_ONE_EDGE] vectorized
- FORWARD [RS_264]
- PartitionCols:_col0
- Group By Operator [GBY_263] (rows=87121617 width=135)
- Output:["_col0"],keys:KEY._col0
- <-Reducer 21 [SIMPLE_EDGE]
+ <-Reducer 17 [SIMPLE_EDGE]
SHUFFLE [RS_46]
PartitionCols:_col0
Group By Operator [GBY_45] (rows=174243235 width=135)
Output:["_col0"],keys:_col1
- Merge Join Operator [MERGEJOIN_211] (rows=174243235 width=135)
- Conds:RS_41._col0=RS_250._col0(Inner),Output:["_col1"]
- <-Map 24 [SIMPLE_EDGE] vectorized
- PARTITION_ONLY_SHUFFLE [RS_250]
+ Merge Join Operator [MERGEJOIN_227] (rows=174243235 width=135)
+ Conds:RS_41._col0=RS_255._col0(Inner),Output:["_col1"]
+ <-Map 22 [SIMPLE_EDGE] vectorized
+ PARTITION_ONLY_SHUFFLE [RS_255]
PartitionCols:_col0
- Select Operator [SEL_249] (rows=14398467 width=92)
+ Select Operator [SEL_254] (rows=14398467 width=92)
Output:["_col0"]
- Filter Operator [FIL_248] (rows=14398467 width=92)
+ Filter Operator [FIL_253] (rows=14398467 width=92)
predicate:wr_order_number is not null
TableScan [TS_38] (rows=14398467 width=92)
default@web_returns,web_returns,Tbl:COMPLETE,Col:NONE,Output:["wr_order_number"]
- <-Reducer 20 [ONE_TO_ONE_EDGE]
+ <-Reducer 16 [ONE_TO_ONE_EDGE]
FORWARD [RS_41]
PartitionCols:_col0
Select Operator [SEL_37] (rows=158402938 width=135)
Output:["_col0"]
Filter Operator [FIL_36] (rows=158402938 width=135)
predicate:(_col0 <> _col2)
- Merge Join Operator [MERGEJOIN_210] (rows=158402938 width=135)
- Conds:RS_259._col1=RS_262._col1(Inner),Output:["_col0","_col1","_col2"]
- <-Map 19 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_259]
+ Merge Join Operator [MERGEJOIN_226] (rows=158402938 width=135)
+ Conds:RS_268._col1=RS_272._col1(Inner),Output:["_col0","_col1","_col2"]
+ <-Map 15 [SIMPLE_EDGE] vectorized
+ SHUFFLE [RS_268]
PartitionCols:_col1
- Select Operator [SEL_258] (rows=144002668 width=135)
+ Select Operator [SEL_267] (rows=144002668 width=135)
Output:["_col0","_col1"]
- Filter Operator [FIL_257] (rows=144002668 width=135)
+ Filter Operator [FIL_266] (rows=144002668 width=135)
predicate:((ws_order_number BETWEEN DynamicValue(RS_42_web_returns_wr_order_number_min) AND DynamicValue(RS_42_web_returns_wr_order_number_max) and in_bloom_filter(ws_order_number, DynamicValue(RS_42_web_returns_wr_order_number_bloom_filter))) and ws_order_number is not null)
TableScan [TS_27] (rows=144002668 width=135)
default@web_sales,ws1,Tbl:COMPLETE,Col:NONE,Output:["ws_warehouse_sk","ws_order_number"]
- <-Reducer 25 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_255]
- Group By Operator [GBY_254] (rows=1 width=12)
+ <-Reducer 23 [BROADCAST_EDGE] vectorized
+ BROADCAST [RS_261]
+ Group By Operator [GBY_259] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=14398467)"]
- <-Map 24 [CUSTOM_SIMPLE_EDGE] vectorized
- PARTITION_ONLY_SHUFFLE [RS_253]
- Group By Operator [GBY_252] (rows=1 width=12)
+ <-Map 22 [CUSTOM_SIMPLE_EDGE] vectorized
+ PARTITION_ONLY_SHUFFLE [RS_258]
+ Group By Operator [GBY_257] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=14398467)"]
- Select Operator [SEL_251] (rows=14398467 width=92)
+ Select Operator [SEL_256] (rows=14398467 width=92)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_249]
- <-Map 23 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_262]
+ Please refer to the previous Select Operator [SEL_254]
+ <-Map 21 [SIMPLE_EDGE] vectorized
+ SHUFFLE [RS_272]
PartitionCols:_col1
- Select Operator [SEL_261] (rows=144002668 width=135)
+ Select Operator [SEL_271] (rows=144002668 width=135)
Output:["_col0","_col1"]
- Filter Operator [FIL_260] (rows=144002668 width=135)
+ Filter Operator [FIL_270] (rows=144002668 width=135)
predicate:((ws_order_number BETWEEN DynamicValue(RS_42_web_returns_wr_order_number_min) AND DynamicValue(RS_42_web_returns_wr_order_number_max) and in_bloom_filter(ws_order_number, DynamicValue(RS_42_web_returns_wr_order_number_bloom_filter))) and ws_order_number is not null)
TableScan [TS_30] (rows=144002668 width=135)
default@web_sales,ws2,Tbl:COMPLETE,Col:NONE,Output:["ws_warehouse_sk","ws_order_number"]
- <-Reducer 25 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_256]
- Please refer to the previous Group By Operator [GBY_254]
+ <-Reducer 23 [BROADCAST_EDGE] vectorized
+ BROADCAST [RS_262]
+ Please refer to the previous Group By Operator [GBY_259]
+ <-Reducer 20 [ONE_TO_ONE_EDGE] vectorized
+ FORWARD [RS_277]
+ PartitionCols:_col0
+ Group By Operator [GBY_276] (rows=79201469 width=135)
+ Output:["_col0"],keys:KEY._col0
+ <-Reducer 19 [SIMPLE_EDGE]
+ SHUFFLE [RS_24]
+ PartitionCols:_col0
+ Group By Operator [GBY_23] (rows=158402938 width=135)
+ Output:["_col0"],keys:_col1
+ Select Operator [SEL_22] (rows=158402938 width=135)
+ Output:["_col1"]
+ Filter Operator [FIL_21] (rows=158402938 width=135)
+ predicate:(_col0 <> _col2)
+ Merge Join Operator [MERGEJOIN_225] (rows=158402938 width=135)
+ Conds:RS_269._col1=RS_273._col1(Inner),Output:["_col0","_col1","_col2"]
+ <-Map 15 [SIMPLE_EDGE] vectorized
+ SHUFFLE [RS_269]
+ PartitionCols:_col1
+ Please refer to the previous Select Operator [SEL_267]
+ <-Map 21 [SIMPLE_EDGE] vectorized
+ SHUFFLE [RS_273]
+ PartitionCols:_col1
+ Please refer to the previous Select Operator [SEL_271]
<-Reducer 4 [SIMPLE_EDGE]
SHUFFLE [RS_58]
PartitionCols:_col3
- Merge Join Operator [MERGEJOIN_208] (rows=191667562 width=135)
- Conds:RS_55._col2=RS_231._col0(Inner),Output:["_col3","_col4","_col5"]
+ Merge Join Operator [MERGEJOIN_224] (rows=191667562 width=135)
+ Conds:RS_55._col2=RS_247._col0(Inner),Output:["_col3","_col4","_col5"]
<-Map 13 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_231]
+ SHUFFLE [RS_247]
PartitionCols:_col0
- Select Operator [SEL_230] (rows=42 width=1850)
+ Select Operator [SEL_246] (rows=42 width=1850)
Output:["_col0"]
- Filter Operator [FIL_229] (rows=42 width=1850)
+ Filter Operator [FIL_245] (rows=42 width=1850)
predicate:((web_company_name = 'pri') and web_site_sk is not null)
TableScan [TS_9] (rows=84 width=1850)
default@web_site,web_site,Tbl:COMPLETE,Col:NONE,Output:["web_site_sk","web_company_name"]
<-Reducer 3 [SIMPLE_EDGE]
SHUFFLE [RS_55]
PartitionCols:_col2
- Merge Join Operator [MERGEJOIN_207] (rows=174243235 width=135)
- Conds:RS_52._col1=RS_223._col0(Inner),Output:["_col2","_col3","_col4","_col5"]
+ Merge Join Operator [MERGEJOIN_223] (rows=174243235 width=135)
+ Conds:RS_52._col1=RS_239._col0(Inner),Output:["_col2","_col3","_col4","_col5"]
<-Map 11 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_223]
+ SHUFFLE [RS_239]
PartitionCols:_col0
- Select Operator [SEL_222] (rows=20000000 width=1014)
+ Select Operator [SEL_238] (rows=20000000 width=1014)
Output:["_col0"]
- Filter Operator [FIL_221] (rows=20000000 width=1014)
+ Filter Operator [FIL_237] (rows=20000000 width=1014)
predicate:((ca_state = 'TX') and ca_address_sk is not null)
TableScan [TS_6] (rows=40000000 width=1014)
default@customer_address,customer_address,Tbl:COMPLETE,Col:NONE,Output:["ca_address_sk","ca_state"]
<-Reducer 2 [SIMPLE_EDGE]
SHUFFLE [RS_52]
PartitionCols:_col1
- Merge Join Operator [MERGEJOIN_206] (rows=158402938 width=135)
- Conds:RS_239._col0=RS_215._col0(Inner),Output:["_col1","_col2","_col3","_col4","_col5"]
+ Merge Join Operator [MERGEJOIN_222] (rows=158402938 width=135)
+ Conds:RS_265._col0=RS_231._col0(Inner),Output:["_col1","_col2","_col3","_col4","_col5"]
<-Map 9 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_215]
+ SHUFFLE [RS_231]
PartitionCols:_col0
- Select Operator [SEL_214] (rows=8116 width=1119)
+ Select Operator [SEL_230] (rows=8116 width=1119)
Output:["_col0"]
- Filter Operator [FIL_213] (rows=8116 width=1119)
+ Filter Operator [FIL_229] (rows=8116 width=1119)
predicate:(CAST( d_date AS TIMESTAMP) BETWEEN TIMESTAMP'1999-05-01 00:00:00' AND TIMESTAMP'1999-06-30 00:00:00' and d_date_sk is not null)
TableScan [TS_3] (rows=73049 width=1119)
default@date_dim,date_dim,Tbl:COMPLETE,Col:NONE,Output:["d_date_sk","d_date"]
<-Map 1 [SIMPLE_EDGE] vectorized
- SHUFFLE [RS_239]
+ SHUFFLE [RS_265]
PartitionCols:_col0
- Select Operator [SEL_238] (rows=144002668 width=135)
+ Select Operator [SEL_264] (rows=144002668 width=135)
Output:["_col0","_col1","_col2","_col3","_col4","_col5"]
- Filter Operator [FIL_237] (rows=144002668 width=135)
- predicate:((ws_ship_addr_sk BETWEEN DynamicValue(RS_53_customer_address_ca_address_sk_min) AND DynamicValue(RS_53_customer_address_ca_address_sk_max) and in_bloom_filter(ws_ship_addr_sk, DynamicValue(RS_53_customer_address_ca_address_sk_bloom_filter))) and (ws_ship_date_sk BETWEEN DynamicValue(RS_50_date_dim_d_date_sk_min) AND DynamicValue(RS_50_date_dim_d_date_sk_max) and in_bloom_filter(ws_ship_date_sk, DynamicValue(RS_50_date_dim_d_date_sk_bloom_filter))) and (ws_web_site_sk BETWEEN DynamicValue(RS_56_web_site_web_site_sk_min) AND DynamicValue(RS_56_web_site_web_site_sk_max) and in_bloom_filter(ws_web_site_sk, DynamicValue(RS_56_web_site_web_site_sk_bloom_filter))) and ws_order_number is not null and ws_ship_addr_sk is not null and ws_ship_date_sk is not null and ws_web_site_sk is not null)
+ Filter Operator [FIL_263] (rows=144002668 width=135)
+ predicate:((ws_order_number BETWEEN DynamicValue(RS_42_web_returns_wr_order_number_min) AND DynamicValue(RS_42_web_returns_wr_order_number_max) and in_bloom_filter(ws_order_number, DynamicValue(RS_42_web_returns_wr_order_number_bloom_filter))) and (ws_ship_addr_sk BETWEEN DynamicValue(RS_53_customer_address_ca_address_sk_min) AND DynamicValue(RS_53_customer_address_ca_address_sk_max) and in_bloom_filter(ws_ship_addr_sk, DynamicValue(RS_53_customer_address_ca_address_sk_bloom_filter))) and (ws_ship_date_sk BETWEEN DynamicValue(RS_50_date_dim_d_date_sk_min) AND DynamicValue(RS_50_date_dim_d_date_sk_max) and in_bloom_filter(ws_ship_date_sk, DynamicValue(RS_50_date_dim_d_date_sk_bloom_filter))) and (ws_web_site_sk BETWEEN DynamicValue(RS_56_web_site_web_site_sk_min) AND DynamicValue(RS_56_web_site_web_site_sk_max) and in_bloom_filter(ws_web_site_sk, DynamicValue(RS_56_web_site_web_site_sk_bloom_filter))) and ws_order_number is not null and ws_ship_addr_sk is not null and ws_ship_date_sk is not null and ws_web_site_sk is not null)
TableScan [TS_0] (rows=144002668 width=135)
default@web_sales,ws1,Tbl:COMPLETE,Col:NONE,Output:["ws_ship_date_sk","ws_ship_addr_sk","ws_web_site_sk","ws_order_number","ws_ext_ship_cost","ws_net_profit"]
+ <-Reducer 23 [BROADCAST_EDGE] vectorized
+ BROADCAST [RS_260]
+ Please refer to the previous Group By Operator [GBY_259]
<-Reducer 10 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_220]
- Group By Operator [GBY_219] (rows=1 width=12)
+ BROADCAST [RS_236]
+ Group By Operator [GBY_235] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=1000000)"]
<-Map 9 [CUSTOM_SIMPLE_EDGE] vectorized
- SHUFFLE [RS_218]
- Group By Operator [GBY_217] (rows=1 width=12)
+ SHUFFLE [RS_234]
+ Group By Operator [GBY_233] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=1000000)"]
- Select Operator [SEL_216] (rows=8116 width=1119)
+ Select Operator [SEL_232] (rows=8116 width=1119)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_214]
+ Please refer to the previous Select Operator [SEL_230]
<-Reducer 12 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_228]
- Group By Operator [GBY_227] (rows=1 width=12)
+ BROADCAST [RS_244]
+ Group By Operator [GBY_243] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=20000000)"]
<-Map 11 [CUSTOM_SIMPLE_EDGE] vectorized
- SHUFFLE [RS_226]
- Group By Operator [GBY_225] (rows=1 width=12)
+ SHUFFLE [RS_242]
+ Group By Operator [GBY_241] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=20000000)"]
- Select Operator [SEL_224] (rows=20000000 width=1014)
+ Select Operator [SEL_240] (rows=20000000 width=1014)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_222]
+ Please refer to the previous Select Operator [SEL_238]
<-Reducer 14 [BROADCAST_EDGE] vectorized
- BROADCAST [RS_236]
- Group By Operator [GBY_235] (rows=1 width=12)
+ BROADCAST [RS_252]
+ Group By Operator [GBY_251] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(VALUE._col0)","max(VALUE._col1)","bloom_filter(VALUE._col2, expectedEntries=1000000)"]
<-Map 13 [CUSTOM_SIMPLE_EDGE] vectorized
- SHUFFLE [RS_234]
- Group By Operator [GBY_233] (rows=1 width=12)
+ SHUFFLE [RS_250]
+ Group By Operator [GBY_249] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=1000000)"]
- Select Operator [SEL_232] (rows=42 width=1850)
+ Select Operator [SEL_248] (rows=42 width=1850)
Output:["_col0"]
- Please refer to the previous Select Operator [SEL_230]
+ Please refer to the previous Select Operator [SEL_246]
http://git-wip-us.apache.org/repos/asf/hive/blob/ab9e954d/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_3.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_3.q.out b/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_3.q.out
index eafc1c4..a141409 100644
--- a/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_3.q.out
+++ b/ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning_3.q.out
@@ -366,7 +366,7 @@ STAGE PLANS:
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
Spark Partition Pruning Sink Operator
- Target Columns: [Map 4 -> [part_col:int (part_col)]]
+ Target Columns: [Map 1 -> [part_col:int (part_col)], Map 4 -> [part_col:int (part_col)]]
Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
Local Work:
Map Reduce Local Work
@@ -432,7 +432,6 @@ STAGE PLANS:
Map Operator Tree:
TableScan
alias: partitioned_table1
- filterExpr: (part_col > 1) (type: boolean)
Statistics: Num rows: 12 Data size: 12 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: part_col (type: int)