You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hive.apache.org by ha...@apache.org on 2018/08/06 05:47:50 UTC
[5/9] hive git commit: HIVE-19097 : related equals and in operators
may cause inaccurate stats estimations (Zoltan Haindrich via Ashutosh
Chauhan)
http://git-wip-us.apache.org/repos/asf/hive/blob/20c95c1c/ql/src/test/results/clientpositive/perf/spark/query85.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query85.q.out b/ql/src/test/results/clientpositive/perf/spark/query85.q.out
index 572ba54..d1b3a2c 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query85.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query85.q.out
@@ -166,8 +166,7 @@ limit 100
POSTHOOK: type: QUERY
STAGE DEPENDENCIES:
Stage-2 is a root stage
- Stage-3 depends on stages: Stage-2
- Stage-1 depends on stages: Stage-3
+ Stage-1 depends on stages: Stage-2
Stage-0 depends on stages: Stage-1
STAGE PLANS:
@@ -175,47 +174,42 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 13
+ Map 14
Map Operator Tree:
TableScan
- alias: reason
- filterExpr: r_reason_sk is not null (type: boolean)
- Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: NONE
+ alias: web_page
+ filterExpr: wp_web_page_sk is not null (type: boolean)
+ Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: r_reason_sk is not null (type: boolean)
- Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: NONE
+ predicate: wp_web_page_sk is not null (type: boolean)
+ Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: r_reason_sk (type: int), r_reason_desc (type: string)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: NONE
+ expressions: wp_web_page_sk (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: NONE
Spark HashTable Sink Operator
keys:
- 0 _col4 (type: int)
+ 0 _col2 (type: int)
1 _col0 (type: int)
Execution mode: vectorized
Local Work:
Map Reduce Local Work
-
- Stage: Stage-3
- Spark
-#### A masked pattern was here ####
- Vertices:
- Map 11
+ Map 15
Map Operator Tree:
TableScan
- alias: web_page
- filterExpr: wp_web_page_sk is not null (type: boolean)
- Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: NONE
+ alias: reason
+ filterExpr: r_reason_sk is not null (type: boolean)
+ Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: wp_web_page_sk is not null (type: boolean)
- Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: NONE
+ predicate: r_reason_sk is not null (type: boolean)
+ Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: wp_web_page_sk (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: NONE
+ expressions: r_reason_sk (type: int), r_reason_desc (type: string)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: NONE
Spark HashTable Sink Operator
keys:
- 0 _col10 (type: int)
+ 0 _col13 (type: int)
1 _col0 (type: int)
Execution mode: vectorized
Local Work:
@@ -224,18 +218,38 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 28), Map 9 (PARTITION-LEVEL SORT, 28)
- Reducer 3 <- Map 10 (PARTITION-LEVEL SORT, 98), Reducer 2 (PARTITION-LEVEL SORT, 98)
- Reducer 4 <- Map 12 (PARTITION-LEVEL SORT, 5), Reducer 3 (PARTITION-LEVEL SORT, 5)
- Reducer 5 <- Map 14 (PARTITION-LEVEL SORT, 11), Reducer 4 (PARTITION-LEVEL SORT, 11)
- Reducer 6 <- Map 15 (PARTITION-LEVEL SORT, 7), Reducer 5 (PARTITION-LEVEL SORT, 7)
- Reducer 7 <- Reducer 6 (GROUP, 7)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 52), Map 9 (PARTITION-LEVEL SORT, 52)
+ Reducer 3 <- Map 10 (PARTITION-LEVEL SORT, 67), Reducer 2 (PARTITION-LEVEL SORT, 67)
+ Reducer 4 <- Map 11 (PARTITION-LEVEL SORT, 68), Reducer 3 (PARTITION-LEVEL SORT, 68)
+ Reducer 5 <- Map 12 (PARTITION-LEVEL SORT, 12), Reducer 4 (PARTITION-LEVEL SORT, 12)
+ Reducer 6 <- Map 13 (PARTITION-LEVEL SORT, 165), Reducer 5 (PARTITION-LEVEL SORT, 165)
+ Reducer 7 <- Reducer 6 (GROUP, 71)
Reducer 8 <- Reducer 7 (SORT, 1)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
+ alias: web_sales
+ filterExpr: ((ws_sales_price BETWEEN 100 AND 150 or ws_sales_price BETWEEN 50 AND 100 or ws_sales_price BETWEEN 150 AND 200) and ws_item_sk is not null and ws_order_number is not null and ws_web_page_sk is not null and ws_sold_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 144002668 Data size: 19580198212 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((ws_sales_price BETWEEN 100 AND 150 or ws_sales_price BETWEEN 50 AND 100 or ws_sales_price BETWEEN 150 AND 200) and ws_item_sk is not null and ws_order_number is not null and ws_sold_date_sk is not null and ws_web_page_sk is not null) (type: boolean)
+ Statistics: Num rows: 48000888 Data size: 6526732556 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: ws_sold_date_sk (type: int), ws_item_sk (type: int), ws_web_page_sk (type: int), ws_order_number (type: int), ws_quantity (type: int), ws_sales_price (type: decimal(7,2)), ws_net_profit (type: decimal(7,2))
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
+ Statistics: Num rows: 48000888 Data size: 6526732556 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 48000888 Data size: 6526732556 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: decimal(7,2)), _col6 (type: decimal(7,2))
+ Execution mode: vectorized
+ Map 10
+ Map Operator Tree:
+ TableScan
alias: web_returns
filterExpr: (wr_item_sk is not null and wr_order_number is not null and wr_refunded_cdemo_sk is not null and wr_returning_cdemo_sk is not null and wr_refunded_addr_sk is not null and wr_reason_sk is not null) (type: boolean)
Statistics: Num rows: 14398467 Data size: 1325194184 Basic stats: COMPLETE Column stats: NONE
@@ -253,53 +267,14 @@ STAGE PLANS:
Statistics: Num rows: 14398467 Data size: 1325194184 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int), _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2))
Execution mode: vectorized
- Map 10
- Map Operator Tree:
- TableScan
- alias: customer_address
- filterExpr: ((ca_state) IN ('KY', 'GA', 'NM', 'MT', 'OR', 'IN', 'WI', 'MO', 'WV') and (ca_country = 'United States') and ca_address_sk is not null) (type: boolean)
- Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: ((ca_country = 'United States') and (ca_state) IN ('KY', 'GA', 'NM', 'MT', 'OR', 'IN', 'WI', 'MO', 'WV') and ca_address_sk is not null) (type: boolean)
- Statistics: Num rows: 10000000 Data size: 10148798821 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: ca_address_sk (type: int), ca_state (type: string)
- outputColumnNames: _col0, _col1
- Statistics: Num rows: 10000000 Data size: 10148798821 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 10000000 Data size: 10148798821 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string)
- Execution mode: vectorized
- Map 12
- Map Operator Tree:
- TableScan
- alias: date_dim
- filterExpr: ((d_year = 1998) and d_date_sk is not null) (type: boolean)
- Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: ((d_year = 1998) and d_date_sk is not null) (type: boolean)
- Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: d_date_sk (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: int)
- sort order: +
- Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
- Execution mode: vectorized
- Map 14
+ Map 11
Map Operator Tree:
TableScan
alias: cd1
- filterExpr: (((cd_education_status = '4 yr Degree') or (cd_education_status = 'Primary') or (cd_education_status = 'Advanced Degree')) and ((cd_marital_status = 'M') or (cd_marital_status = 'D') or (cd_marital_status = 'U')) and cd_demo_sk is not null and cd_marital_status is not null and cd_education_status is not null) (type: boolean)
+ filterExpr: ((cd_education_status) IN ('4 yr Degree', 'Primary', 'Advanced Degree') and (cd_marital_status) IN ('M', 'D', 'U') and cd_demo_sk is not null) (type: boolean)
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: (((cd_education_status = '4 yr Degree') or (cd_education_status = 'Primary') or (cd_education_status = 'Advanced Degree')) and ((cd_marital_status = 'M') or (cd_marital_status = 'D') or (cd_marital_status = 'U')) and cd_demo_sk is not null and cd_education_status is not null and cd_marital_status is not null) (type: boolean)
+ predicate: ((cd_education_status) IN ('4 yr Degree', 'Primary', 'Advanced Degree') and (cd_marital_status) IN ('M', 'D', 'U') and cd_demo_sk is not null) (type: boolean)
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: cd_demo_sk (type: int), cd_marital_status (type: string), cd_education_status (type: string)
@@ -312,14 +287,14 @@ STAGE PLANS:
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string), _col2 (type: string)
Execution mode: vectorized
- Map 15
+ Map 12
Map Operator Tree:
TableScan
alias: cd2
- filterExpr: (((cd_education_status = '4 yr Degree') or (cd_education_status = 'Primary') or (cd_education_status = 'Advanced Degree')) and ((cd_marital_status = 'M') or (cd_marital_status = 'D') or (cd_marital_status = 'U')) and cd_demo_sk is not null and cd_marital_status is not null and cd_education_status is not null) (type: boolean)
+ filterExpr: ((cd_education_status) IN ('4 yr Degree', 'Primary', 'Advanced Degree') and (cd_marital_status) IN ('M', 'D', 'U') and cd_demo_sk is not null) (type: boolean)
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: (((cd_education_status = '4 yr Degree') or (cd_education_status = 'Primary') or (cd_education_status = 'Advanced Degree')) and ((cd_marital_status = 'M') or (cd_marital_status = 'D') or (cd_marital_status = 'U')) and cd_demo_sk is not null and cd_education_status is not null and cd_marital_status is not null) (type: boolean)
+ predicate: ((cd_education_status) IN ('4 yr Degree', 'Primary', 'Advanced Degree') and (cd_marital_status) IN ('M', 'D', 'U') and cd_demo_sk is not null) (type: boolean)
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: cd_demo_sk (type: int), cd_marital_status (type: string), cd_education_status (type: string)
@@ -331,25 +306,44 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int), _col1 (type: string), _col2 (type: string)
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized
+ Map 13
+ Map Operator Tree:
+ TableScan
+ alias: customer_address
+ filterExpr: ((ca_country = 'United States') and ca_address_sk is not null) (type: boolean)
+ Statistics: Num rows: 40000000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((ca_country = 'United States') and ca_address_sk is not null) (type: boolean)
+ Statistics: Num rows: 20000000 Data size: 20297597642 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: ca_address_sk (type: int), ca_state (type: string)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 20000000 Data size: 20297597642 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 20000000 Data size: 20297597642 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: string)
+ Execution mode: vectorized
Map 9
Map Operator Tree:
TableScan
- alias: web_sales
- filterExpr: ((ws_sales_price BETWEEN 100 AND 150 or ws_sales_price BETWEEN 50 AND 100 or ws_sales_price BETWEEN 150 AND 200) and (ws_net_profit BETWEEN 100 AND 200 or ws_net_profit BETWEEN 150 AND 300 or ws_net_profit BETWEEN 50 AND 250) and ws_item_sk is not null and ws_order_number is not null and ws_web_page_sk is not null and ws_sold_date_sk is not null) (type: boolean)
- Statistics: Num rows: 144002668 Data size: 19580198212 Basic stats: COMPLETE Column stats: NONE
+ alias: date_dim
+ filterExpr: ((d_year = 1998) and d_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((ws_net_profit BETWEEN 100 AND 200 or ws_net_profit BETWEEN 150 AND 300 or ws_net_profit BETWEEN 50 AND 250) and (ws_sales_price BETWEEN 100 AND 150 or ws_sales_price BETWEEN 50 AND 100 or ws_sales_price BETWEEN 150 AND 200) and ws_item_sk is not null and ws_order_number is not null and ws_sold_date_sk is not null and ws_web_page_sk is not null) (type: boolean)
- Statistics: Num rows: 16000296 Data size: 2175577518 Basic stats: COMPLETE Column stats: NONE
+ predicate: ((d_year = 1998) and d_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: ws_sold_date_sk (type: int), ws_item_sk (type: int), ws_web_page_sk (type: int), ws_order_number (type: int), ws_quantity (type: int), ws_sales_price (type: decimal(7,2)), ws_net_profit (type: decimal(7,2))
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
- Statistics: Num rows: 16000296 Data size: 2175577518 Basic stats: COMPLETE Column stats: NONE
+ expressions: d_date_sk (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col1 (type: int), _col3 (type: int)
- sort order: ++
- Map-reduce partition columns: _col1 (type: int), _col3 (type: int)
- Statistics: Num rows: 16000296 Data size: 2175577518 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int), _col2 (type: int), _col4 (type: int), _col5 (type: decimal(7,2)), _col6 (type: decimal(7,2))
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized
Reducer 2
Reduce Operator Tree:
@@ -357,116 +351,114 @@ STAGE PLANS:
condition map:
Inner Join 0 to 1
keys:
- 0 _col0 (type: int), _col5 (type: int)
- 1 _col1 (type: int), _col3 (type: int)
- outputColumnNames: _col1, _col2, _col3, _col4, _col6, _col7, _col8, _col10, _col12, _col13, _col14
- Statistics: Num rows: 17600325 Data size: 2393135321 Basic stats: COMPLETE Column stats: NONE
+ 0 _col0 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col1, _col2, _col3, _col4, _col5, _col6
+ Statistics: Num rows: 52800977 Data size: 7179405967 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col2 (type: int)
- sort order: +
- Map-reduce partition columns: _col2 (type: int)
- Statistics: Num rows: 17600325 Data size: 2393135321 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: int), _col3 (type: int), _col4 (type: int), _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2)), _col8 (type: int), _col10 (type: int), _col12 (type: int), _col13 (type: decimal(7,2)), _col14 (type: decimal(7,2))
+ key expressions: _col1 (type: int), _col3 (type: int)
+ sort order: ++
+ Map-reduce partition columns: _col1 (type: int), _col3 (type: int)
+ Statistics: Num rows: 52800977 Data size: 7179405967 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col2 (type: int), _col4 (type: int), _col5 (type: decimal(7,2)), _col6 (type: decimal(7,2))
Reducer 3
- Local Work:
- Map Reduce Local Work
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col2 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col1, _col3, _col4, _col6, _col7, _col8, _col10, _col12, _col13, _col14, _col16
- Statistics: Num rows: 19360357 Data size: 2632448910 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: (((_col16) IN ('KY', 'GA', 'NM') and _col14 BETWEEN 100 AND 200) or ((_col16) IN ('MT', 'OR', 'IN') and _col14 BETWEEN 150 AND 300) or ((_col16) IN ('WI', 'MO', 'WV') and _col14 BETWEEN 50 AND 250)) (type: boolean)
- Statistics: Num rows: 3226725 Data size: 438741326 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col10 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col1, _col3, _col4, _col6, _col7, _col8, _col12, _col13
- input vertices:
- 1 Map 11
- Statistics: Num rows: 3549397 Data size: 482615469 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col8 (type: int)
- sort order: +
- Map-reduce partition columns: _col8 (type: int)
- Statistics: Num rows: 3549397 Data size: 482615469 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: int), _col3 (type: int), _col4 (type: int), _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2)), _col12 (type: int), _col13 (type: decimal(7,2))
+ 0 _col1 (type: int), _col3 (type: int)
+ 1 _col0 (type: int), _col5 (type: int)
+ outputColumnNames: _col2, _col4, _col5, _col6, _col10, _col11, _col12, _col13, _col15, _col16
+ Statistics: Num rows: 58081075 Data size: 7897346734 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col10 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col10 (type: int)
+ Statistics: Num rows: 58081075 Data size: 7897346734 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col2 (type: int), _col4 (type: int), _col5 (type: decimal(7,2)), _col6 (type: decimal(7,2)), _col11 (type: int), _col12 (type: int), _col13 (type: int), _col15 (type: decimal(7,2)), _col16 (type: decimal(7,2))
Reducer 4
- Local Work:
- Map Reduce Local Work
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col8 (type: int)
+ 0 _col10 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col1, _col3, _col4, _col6, _col7, _col12, _col13
- Statistics: Num rows: 3904336 Data size: 530877027 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col4 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col1, _col3, _col6, _col7, _col12, _col13, _col22
- input vertices:
- 1 Map 13
- Statistics: Num rows: 4294769 Data size: 583964742 Basic stats: COMPLETE Column stats: NONE
+ outputColumnNames: _col2, _col4, _col5, _col6, _col11, _col12, _col13, _col15, _col16, _col18, _col19
+ Statistics: Num rows: 63889183 Data size: 8687081595 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: (((_col18 = 'D') and (_col19 = 'Primary') and _col5 BETWEEN 50 AND 100) or ((_col18 = 'M') and (_col19 = '4 yr Degree') and _col5 BETWEEN 100 AND 150) or ((_col18 = 'U') and (_col19 = 'Advanced Degree') and _col5 BETWEEN 150 AND 200)) (type: boolean)
+ Statistics: Num rows: 5324097 Data size: 723923250 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col1 (type: int)
- sort order: +
- Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 4294769 Data size: 583964742 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col3 (type: int), _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2)), _col12 (type: int), _col13 (type: decimal(7,2)), _col22 (type: string)
+ key expressions: _col12 (type: int), _col18 (type: string), _col19 (type: string)
+ sort order: +++
+ Map-reduce partition columns: _col12 (type: int), _col18 (type: string), _col19 (type: string)
+ Statistics: Num rows: 5324097 Data size: 723923250 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col2 (type: int), _col4 (type: int), _col6 (type: decimal(7,2)), _col11 (type: int), _col13 (type: int), _col15 (type: decimal(7,2)), _col16 (type: decimal(7,2))
Reducer 5
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col1 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col3, _col6, _col7, _col12, _col13, _col22, _col24, _col25
- Statistics: Num rows: 4724246 Data size: 642361230 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: (((_col24 = 'D') and (_col25 = 'Primary') and _col13 BETWEEN 50 AND 100) or ((_col24 = 'M') and (_col25 = '4 yr Degree') and _col13 BETWEEN 100 AND 150) or ((_col24 = 'U') and (_col25 = 'Advanced Degree') and _col13 BETWEEN 150 AND 200)) (type: boolean)
- Statistics: Num rows: 393687 Data size: 53530079 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col3 (type: int), _col24 (type: string), _col25 (type: string)
- sort order: +++
- Map-reduce partition columns: _col3 (type: int), _col24 (type: string), _col25 (type: string)
- Statistics: Num rows: 393687 Data size: 53530079 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col6 (type: decimal(7,2)), _col7 (type: decimal(7,2)), _col12 (type: int), _col22 (type: string)
+ 0 _col12 (type: int), _col18 (type: string), _col19 (type: string)
+ 1 _col0 (type: int), _col1 (type: string), _col2 (type: string)
+ outputColumnNames: _col2, _col4, _col6, _col11, _col13, _col15, _col16
+ Statistics: Num rows: 5856506 Data size: 796315592 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col11 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col11 (type: int)
+ Statistics: Num rows: 5856506 Data size: 796315592 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col2 (type: int), _col4 (type: int), _col6 (type: decimal(7,2)), _col13 (type: int), _col15 (type: decimal(7,2)), _col16 (type: decimal(7,2))
Reducer 6
+ Local Work:
+ Map Reduce Local Work
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col3 (type: int), _col24 (type: string), _col25 (type: string)
- 1 _col0 (type: int), _col1 (type: string), _col2 (type: string)
- outputColumnNames: _col6, _col7, _col12, _col22
- Statistics: Num rows: 2047980 Data size: 788904791 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- aggregations: sum(_col12), count(_col12), sum(_col7), count(_col7), sum(_col6), count(_col6)
- keys: _col22 (type: string)
- mode: hash
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
- Statistics: Num rows: 2047980 Data size: 788904791 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string)
- sort order: +
- Map-reduce partition columns: _col0 (type: string)
- Statistics: Num rows: 2047980 Data size: 788904791 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(17,2)), _col4 (type: bigint), _col5 (type: decimal(17,2)), _col6 (type: bigint)
+ 0 _col11 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col2, _col4, _col6, _col13, _col15, _col16, _col24
+ Statistics: Num rows: 22000000 Data size: 22327357890 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((((_col24 = 'KY') or (_col24 = 'GA') or (_col24 = 'NM')) and _col6 BETWEEN 100 AND 200) or (((_col24 = 'MT') or (_col24 = 'OR') or (_col24 = 'IN')) and _col6 BETWEEN 150 AND 300) or (((_col24 = 'WI') or (_col24 = 'MO') or (_col24 = 'WV')) and _col6 BETWEEN 50 AND 250)) (type: boolean)
+ Statistics: Num rows: 7333332 Data size: 7442451276 Basic stats: COMPLETE Column stats: NONE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col2 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col4, _col13, _col15, _col16
+ input vertices:
+ 1 Map 14
+ Statistics: Num rows: 8066665 Data size: 8186696581 Basic stats: COMPLETE Column stats: NONE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col13 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col4, _col15, _col16, _col28
+ input vertices:
+ 1 Map 15
+ Statistics: Num rows: 8873331 Data size: 9005366434 Basic stats: COMPLETE Column stats: NONE
+ Group By Operator
+ aggregations: sum(_col4), count(_col4), sum(_col16), count(_col16), sum(_col15), count(_col15)
+ keys: _col28 (type: string)
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
+ Statistics: Num rows: 8873331 Data size: 9005366434 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: string)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: string)
+ Statistics: Num rows: 8873331 Data size: 9005366434 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(17,2)), _col4 (type: bigint), _col5 (type: decimal(17,2)), _col6 (type: bigint)
Reducer 7
Execution mode: vectorized
Reduce Operator Tree:
@@ -475,15 +467,15 @@ STAGE PLANS:
keys: KEY._col0 (type: string)
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
- Statistics: Num rows: 1023990 Data size: 394452395 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 4436665 Data size: 4502682709 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: (_col1 / _col2) (type: double), (_col3 / _col4) (type: decimal(37,22)), (_col5 / _col6) (type: decimal(37,22)), substr(_col0, 1, 20) (type: string)
outputColumnNames: _col4, _col5, _col6, _col7
- Statistics: Num rows: 1023990 Data size: 394452395 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 4436665 Data size: 4502682709 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col7 (type: string), _col4 (type: double), _col5 (type: decimal(37,22)), _col6 (type: decimal(37,22))
sort order: ++++
- Statistics: Num rows: 1023990 Data size: 394452395 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 4436665 Data size: 4502682709 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
Reducer 8
Execution mode: vectorized
@@ -491,13 +483,13 @@ STAGE PLANS:
Select Operator
expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: double), KEY.reducesinkkey2 (type: decimal(37,22)), KEY.reducesinkkey3 (type: decimal(37,22))
outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 1023990 Data size: 394452395 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 4436665 Data size: 4502682709 Basic stats: COMPLETE Column stats: NONE
Limit
Number of rows: 100
- Statistics: Num rows: 100 Data size: 38500 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 100 Data size: 101400 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- Statistics: Num rows: 100 Data size: 38500 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 100 Data size: 101400 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
http://git-wip-us.apache.org/repos/asf/hive/blob/20c95c1c/ql/src/test/results/clientpositive/perf/spark/query89.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query89.q.out b/ql/src/test/results/clientpositive/perf/spark/query89.q.out
index 1acc577..203a141 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query89.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query89.q.out
@@ -86,8 +86,8 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 400), Map 7 (PARTITION-LEVEL SORT, 400)
- Reducer 3 <- Map 8 (PARTITION-LEVEL SORT, 438), Reducer 2 (PARTITION-LEVEL SORT, 438)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 398), Map 7 (PARTITION-LEVEL SORT, 398)
+ Reducer 3 <- Map 8 (PARTITION-LEVEL SORT, 442), Reducer 2 (PARTITION-LEVEL SORT, 442)
Reducer 4 <- Reducer 3 (GROUP, 529)
Reducer 5 <- Reducer 4 (PARTITION-LEVEL SORT, 265)
Reducer 6 <- Reducer 5 (SORT, 1)
@@ -107,33 +107,13 @@ STAGE PLANS:
outputColumnNames: _col0, _col1, _col2, _col3
Statistics: Num rows: 575995635 Data size: 50814502088 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col1 (type: int)
- sort order: +
- Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 575995635 Data size: 50814502088 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int), _col2 (type: int), _col3 (type: decimal(7,2))
- Execution mode: vectorized
- Map 7
- Map Operator Tree:
- TableScan
- alias: item
- filterExpr: (((i_class) IN ('wallpaper', 'parenting', 'musical') or (i_class) IN ('womens', 'birdal', 'pants')) and ((i_category) IN ('Home', 'Books', 'Electronics') or (i_category) IN ('Shoes', 'Jewelry', 'Men')) and (((i_category) IN ('Home', 'Books', 'Electronics') and (i_class) IN ('wallpaper', 'parenting', 'musical')) or ((i_category) IN ('Shoes', 'Jewelry', 'Men') and (i_class) IN ('womens', 'birdal', 'pants'))) and i_item_sk is not null) (type: boolean)
- Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: ((((i_category) IN ('Home', 'Books', 'Electronics') and (i_class) IN ('wallpaper', 'parenting', 'musical')) or ((i_category) IN ('Shoes', 'Jewelry', 'Men') and (i_class) IN ('womens', 'birdal', 'pants'))) and ((i_category) IN ('Home', 'Books', 'Electronics') or (i_category) IN ('Shoes', 'Jewelry', 'Men')) and ((i_class) IN ('wallpaper', 'parenting', 'musical') or (i_class) IN ('womens', 'birdal', 'pants')) and i_item_sk is not null) (type: boolean)
- Statistics: Num rows: 231000 Data size: 331780228 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: i_item_sk (type: int), i_brand (type: string), i_class (type: string), i_category (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 231000 Data size: 331780228 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 231000 Data size: 331780228 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string)
+ Statistics: Num rows: 575995635 Data size: 50814502088 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: decimal(7,2))
Execution mode: vectorized
- Map 8
+ Map 7
Map Operator Tree:
TableScan
alias: date_dim
@@ -153,22 +133,42 @@ STAGE PLANS:
Statistics: Num rows: 36524 Data size: 40870356 Basic stats: COMPLETE Column stats: NONE
value expressions: _col2 (type: int)
Execution mode: vectorized
+ Map 8
+ Map Operator Tree:
+ TableScan
+ alias: item
+ filterExpr: ((((i_category) IN ('Home', 'Books', 'Electronics') and (i_class) IN ('wallpaper', 'parenting', 'musical')) or ((i_category) IN ('Shoes', 'Jewelry', 'Men') and (i_class) IN ('womens', 'birdal', 'pants'))) and i_item_sk is not null) (type: boolean)
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((((i_category) IN ('Home', 'Books', 'Electronics') and (i_class) IN ('wallpaper', 'parenting', 'musical')) or ((i_category) IN ('Shoes', 'Jewelry', 'Men') and (i_class) IN ('womens', 'birdal', 'pants'))) and i_item_sk is not null) (type: boolean)
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: i_item_sk (type: int), i_brand (type: string), i_class (type: string), i_category (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: string)
+ Execution mode: vectorized
Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
- 0 _col1 (type: int)
+ 0 _col0 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col0, _col2, _col3, _col5, _col6, _col7
+ outputColumnNames: _col1, _col2, _col3, _col6
Statistics: Num rows: 633595212 Data size: 55895953508 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col0 (type: int)
+ key expressions: _col1 (type: int)
sort order: +
- Map-reduce partition columns: _col0 (type: int)
+ Map-reduce partition columns: _col1 (type: int)
Statistics: Num rows: 633595212 Data size: 55895953508 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col2 (type: int), _col3 (type: decimal(7,2)), _col5 (type: string), _col6 (type: string), _col7 (type: string)
+ value expressions: _col2 (type: int), _col3 (type: decimal(7,2)), _col6 (type: int)
Reducer 3
Local Work:
Map Reduce Local Work
@@ -177,9 +177,9 @@ STAGE PLANS:
condition map:
Inner Join 0 to 1
keys:
- 0 _col0 (type: int)
+ 0 _col1 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col2, _col3, _col5, _col6, _col7, _col10
+ outputColumnNames: _col2, _col3, _col6, _col8, _col9, _col10
Statistics: Num rows: 696954748 Data size: 61485550191 Basic stats: COMPLETE Column stats: NONE
Map Join Operator
condition map:
@@ -187,20 +187,20 @@ STAGE PLANS:
keys:
0 _col2 (type: int)
1 _col0 (type: int)
- outputColumnNames: _col3, _col5, _col6, _col7, _col10, _col12, _col13
+ outputColumnNames: _col3, _col6, _col8, _col9, _col10, _col12, _col13
input vertices:
1 Map 9
Statistics: Num rows: 766650239 Data size: 67634106676 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: sum(_col3)
- keys: _col5 (type: string), _col6 (type: string), _col7 (type: string), _col10 (type: int), _col12 (type: string), _col13 (type: string)
+ keys: _col6 (type: int), _col8 (type: string), _col9 (type: string), _col10 (type: string), _col12 (type: string), _col13 (type: string)
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
Statistics: Num rows: 766650239 Data size: 67634106676 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: int), _col4 (type: string), _col5 (type: string)
+ key expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string)
sort order: ++++++
- Map-reduce partition columns: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: int), _col4 (type: string), _col5 (type: string)
+ Map-reduce partition columns: _col0 (type: int), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string)
Statistics: Num rows: 766650239 Data size: 67634106676 Basic stats: COMPLETE Column stats: NONE
value expressions: _col6 (type: decimal(17,2))
Reducer 4
@@ -208,33 +208,33 @@ STAGE PLANS:
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0)
- keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: int), KEY._col4 (type: string), KEY._col5 (type: string)
+ keys: KEY._col0 (type: int), KEY._col1 (type: string), KEY._col2 (type: string), KEY._col3 (type: string), KEY._col4 (type: string), KEY._col5 (type: string)
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
Statistics: Num rows: 383325119 Data size: 33817053293 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
- key expressions: _col2 (type: string), _col0 (type: string), _col4 (type: string), _col5 (type: string)
+ key expressions: _col3 (type: string), _col1 (type: string), _col4 (type: string), _col5 (type: string)
sort order: ++++
- Map-reduce partition columns: _col2 (type: string), _col0 (type: string), _col4 (type: string), _col5 (type: string)
+ Map-reduce partition columns: _col3 (type: string), _col1 (type: string), _col4 (type: string), _col5 (type: string)
Statistics: Num rows: 383325119 Data size: 33817053293 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: string), _col3 (type: int), _col6 (type: decimal(17,2))
+ value expressions: _col0 (type: int), _col2 (type: string), _col6 (type: decimal(17,2))
Reducer 5
Reduce Operator Tree:
Select Operator
- expressions: KEY.reducesinkkey1 (type: string), VALUE._col0 (type: string), KEY.reducesinkkey0 (type: string), VALUE._col1 (type: int), KEY.reducesinkkey2 (type: string), KEY.reducesinkkey3 (type: string), VALUE._col2 (type: decimal(17,2))
+ expressions: VALUE._col0 (type: int), KEY.reducesinkkey1 (type: string), VALUE._col1 (type: string), KEY.reducesinkkey0 (type: string), KEY.reducesinkkey2 (type: string), KEY.reducesinkkey3 (type: string), VALUE._col2 (type: decimal(17,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6
Statistics: Num rows: 383325119 Data size: 33817053293 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
Input definition
input alias: ptf_0
- output shape: _col0: string, _col1: string, _col2: string, _col3: int, _col4: string, _col5: string, _col6: decimal(17,2)
+ output shape: _col0: int, _col1: string, _col2: string, _col3: string, _col4: string, _col5: string, _col6: decimal(17,2)
type: WINDOWING
Windowing table definition
input alias: ptf_1
name: windowingtablefunction
- order by: _col2 ASC NULLS FIRST, _col0 ASC NULLS FIRST, _col4 ASC NULLS FIRST, _col5 ASC NULLS FIRST
- partition by: _col2, _col0, _col4, _col5
+ order by: _col3 ASC NULLS FIRST, _col1 ASC NULLS FIRST, _col4 ASC NULLS FIRST, _col5 ASC NULLS FIRST
+ partition by: _col3, _col1, _col4, _col5
raw input shape:
window functions:
window function definition
@@ -245,14 +245,14 @@ STAGE PLANS:
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
Statistics: Num rows: 383325119 Data size: 33817053293 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: avg_window_0 (type: decimal(21,6)), _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: int), _col4 (type: string), _col5 (type: string), _col6 (type: decimal(17,2))
+ expressions: avg_window_0 (type: decimal(21,6)), _col0 (type: int), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: string), _col6 (type: decimal(17,2))
outputColumnNames: avg_window_0, _col0, _col1, _col2, _col3, _col4, _col5, _col6
Statistics: Num rows: 383325119 Data size: 33817053293 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: CASE WHEN ((avg_window_0 <> 0)) THEN (((abs((_col6 - avg_window_0)) / avg_window_0) > 0.1)) ELSE (null) END (type: boolean)
Statistics: Num rows: 191662559 Data size: 16908526602 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: _col2 (type: string), _col1 (type: string), _col0 (type: string), _col4 (type: string), _col5 (type: string), _col3 (type: int), _col6 (type: decimal(17,2)), avg_window_0 (type: decimal(21,6)), (_col6 - avg_window_0) (type: decimal(22,6))
+ expressions: _col3 (type: string), _col2 (type: string), _col1 (type: string), _col4 (type: string), _col5 (type: string), _col0 (type: int), _col6 (type: decimal(17,2)), avg_window_0 (type: decimal(21,6)), (_col6 - avg_window_0) (type: decimal(22,6))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
Statistics: Num rows: 191662559 Data size: 16908526602 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/20c95c1c/ql/src/test/results/clientpositive/perf/spark/query91.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query91.q.out b/ql/src/test/results/clientpositive/perf/spark/query91.q.out
index de8977d..78f85ac 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query91.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query91.q.out
@@ -69,19 +69,19 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 7
+ Map 13
Map Operator Tree:
TableScan
- alias: call_center
- filterExpr: cc_call_center_sk is not null (type: boolean)
- Statistics: Num rows: 60 Data size: 122700 Basic stats: COMPLETE Column stats: NONE
+ alias: household_demographics
+ filterExpr: ((hd_buy_potential like '0-500%') and hd_demo_sk is not null) (type: boolean)
+ Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: cc_call_center_sk is not null (type: boolean)
- Statistics: Num rows: 60 Data size: 122700 Basic stats: COMPLETE Column stats: NONE
+ predicate: ((hd_buy_potential like '0-500%') and hd_demo_sk is not null) (type: boolean)
+ Statistics: Num rows: 3600 Data size: 385200 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: cc_call_center_sk (type: int), cc_call_center_id (type: string), cc_name (type: string), cc_manager (type: string)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 60 Data size: 122700 Basic stats: COMPLETE Column stats: NONE
+ expressions: hd_demo_sk (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 3600 Data size: 385200 Basic stats: COMPLETE Column stats: NONE
Spark HashTable Sink Operator
keys:
0 _col2 (type: int)
@@ -94,19 +94,19 @@ STAGE PLANS:
Spark
#### A masked pattern was here ####
Vertices:
- Map 13
+ Map 12
Map Operator Tree:
TableScan
- alias: household_demographics
- filterExpr: ((hd_buy_potential like '0-500%') and hd_demo_sk is not null) (type: boolean)
- Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: NONE
+ alias: call_center
+ filterExpr: cc_call_center_sk is not null (type: boolean)
+ Statistics: Num rows: 60 Data size: 122700 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((hd_buy_potential like '0-500%') and hd_demo_sk is not null) (type: boolean)
- Statistics: Num rows: 3600 Data size: 385200 Basic stats: COMPLETE Column stats: NONE
+ predicate: cc_call_center_sk is not null (type: boolean)
+ Statistics: Num rows: 60 Data size: 122700 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: hd_demo_sk (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 3600 Data size: 385200 Basic stats: COMPLETE Column stats: NONE
+ expressions: cc_call_center_sk (type: int), cc_call_center_id (type: string), cc_name (type: string), cc_manager (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 60 Data size: 122700 Basic stats: COMPLETE Column stats: NONE
Spark HashTable Sink Operator
keys:
0 _col2 (type: int)
@@ -118,42 +118,61 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 10 <- Map 12 (PARTITION-LEVEL SORT, 750), Reducer 9 (PARTITION-LEVEL SORT, 750)
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 25), Map 6 (PARTITION-LEVEL SORT, 25)
- Reducer 3 <- Reducer 10 (PARTITION-LEVEL SORT, 745), Reducer 2 (PARTITION-LEVEL SORT, 745)
- Reducer 4 <- Reducer 3 (GROUP, 787)
- Reducer 5 <- Reducer 4 (SORT, 1)
- Reducer 9 <- Map 11 (PARTITION-LEVEL SORT, 541), Map 8 (PARTITION-LEVEL SORT, 541)
+ Reducer 10 <- Map 11 (PARTITION-LEVEL SORT, 25), Map 9 (PARTITION-LEVEL SORT, 25)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 541), Map 7 (PARTITION-LEVEL SORT, 541)
+ Reducer 3 <- Map 8 (PARTITION-LEVEL SORT, 750), Reducer 2 (PARTITION-LEVEL SORT, 750)
+ Reducer 4 <- Reducer 10 (PARTITION-LEVEL SORT, 680), Reducer 3 (PARTITION-LEVEL SORT, 680)
+ Reducer 5 <- Reducer 4 (GROUP, 787)
+ Reducer 6 <- Reducer 5 (SORT, 1)
#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
TableScan
- alias: catalog_returns
- filterExpr: (cr_call_center_sk is not null and cr_returned_date_sk is not null and cr_returning_customer_sk is not null) (type: boolean)
- Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
+ alias: customer
+ filterExpr: (c_customer_sk is not null and c_current_addr_sk is not null and c_current_cdemo_sk is not null and c_current_hdemo_sk is not null) (type: boolean)
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: (cr_call_center_sk is not null and cr_returned_date_sk is not null and cr_returning_customer_sk is not null) (type: boolean)
- Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
+ predicate: (c_current_addr_sk is not null and c_current_cdemo_sk is not null and c_current_hdemo_sk is not null and c_customer_sk is not null) (type: boolean)
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: cr_returned_date_sk (type: int), cr_returning_customer_sk (type: int), cr_call_center_sk (type: int), cr_net_loss (type: decimal(7,2))
+ expressions: c_customer_sk (type: int), c_current_cdemo_sk (type: int), c_current_hdemo_sk (type: int), c_current_addr_sk (type: int)
outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col1 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col1 (type: int)
+ Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col0 (type: int), _col2 (type: int), _col3 (type: int)
+ Execution mode: vectorized
+ Map 11
+ Map Operator Tree:
+ TableScan
+ alias: date_dim
+ filterExpr: ((d_year = 1999) and (d_moy = 11) and d_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
+ Filter Operator
+ predicate: ((d_moy = 11) and (d_year = 1999) and d_date_sk is not null) (type: boolean)
+ Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
+ Select Operator
+ expressions: d_date_sk (type: int)
+ outputColumnNames: _col0
+ Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: decimal(7,2))
+ Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized
- Map 11
+ Map 7
Map Operator Tree:
TableScan
alias: customer_demographics
- filterExpr: (((cd_education_status = 'Unknown') or (cd_education_status = 'Advanced Degree')) and ((cd_marital_status = 'M') or (cd_marital_status = 'W')) and (((cd_marital_status = 'M') and (cd_education_status = 'Unknown')) or ((cd_marital_status = 'W') and (cd_education_status = 'Advanced Degree'))) and cd_demo_sk is not null) (type: boolean)
+ filterExpr: ((cd_education_status) IN ('Unknown', 'Advanced Degree') and (cd_marital_status) IN ('M', 'W') and (struct(cd_marital_status,cd_education_status)) IN (const struct('M','Unknown'), const struct('W','Advanced Degree')) and cd_demo_sk is not null) (type: boolean)
Statistics: Num rows: 1861800 Data size: 717186159 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((((cd_marital_status = 'M') and (cd_education_status = 'Unknown')) or ((cd_marital_status = 'W') and (cd_education_status = 'Advanced Degree'))) and ((cd_education_status = 'Unknown') or (cd_education_status = 'Advanced Degree')) and ((cd_marital_status = 'M') or (cd_marital_status = 'W')) and cd_demo_sk is not null) (type: boolean)
+ predicate: ((cd_education_status) IN ('Unknown', 'Advanced Degree') and (cd_marital_status) IN ('M', 'W') and (struct(cd_marital_status,cd_education_status)) IN (const struct('M','Unknown'), const struct('W','Advanced Degree')) and cd_demo_sk is not null) (type: boolean)
Statistics: Num rows: 930900 Data size: 358593079 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: cd_demo_sk (type: int), cd_marital_status (type: string), cd_education_status (type: string)
@@ -166,7 +185,7 @@ STAGE PLANS:
Statistics: Num rows: 930900 Data size: 358593079 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string), _col2 (type: string)
Execution mode: vectorized
- Map 12
+ Map 8
Map Operator Tree:
TableScan
alias: customer_address
@@ -185,44 +204,25 @@ STAGE PLANS:
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 20000000 Data size: 20297597642 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized
- Map 6
+ Map 9
Map Operator Tree:
TableScan
- alias: date_dim
- filterExpr: ((d_year = 1999) and (d_moy = 11) and d_date_sk is not null) (type: boolean)
- Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
+ alias: catalog_returns
+ filterExpr: (cr_call_center_sk is not null and cr_returned_date_sk is not null and cr_returning_customer_sk is not null) (type: boolean)
+ Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
Filter Operator
- predicate: ((d_moy = 11) and (d_year = 1999) and d_date_sk is not null) (type: boolean)
- Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
+ predicate: (cr_call_center_sk is not null and cr_returned_date_sk is not null and cr_returning_customer_sk is not null) (type: boolean)
+ Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: d_date_sk (type: int)
- outputColumnNames: _col0
- Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
+ expressions: cr_returned_date_sk (type: int), cr_returning_customer_sk (type: int), cr_call_center_sk (type: int), cr_net_loss (type: decimal(7,2))
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
- Execution mode: vectorized
- Map 8
- Map Operator Tree:
- TableScan
- alias: customer
- filterExpr: (c_customer_sk is not null and c_current_addr_sk is not null and c_current_cdemo_sk is not null and c_current_hdemo_sk is not null) (type: boolean)
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- Filter Operator
- predicate: (c_current_addr_sk is not null and c_current_cdemo_sk is not null and c_current_hdemo_sk is not null and c_customer_sk is not null) (type: boolean)
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: c_customer_sk (type: int), c_current_cdemo_sk (type: int), c_current_hdemo_sk (type: int), c_current_addr_sk (type: int)
- outputColumnNames: _col0, _col1, _col2, _col3
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col1 (type: int)
- sort order: +
- Map-reduce partition columns: _col1 (type: int)
- Statistics: Num rows: 80000000 Data size: 68801615852 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int), _col2 (type: int), _col3 (type: int)
+ Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: decimal(7,2))
Execution mode: vectorized
Reducer 10
Local Work:
@@ -232,38 +232,6 @@ STAGE PLANS:
condition map:
Inner Join 0 to 1
keys:
- 0 _col3 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col0, _col2, _col5, _col6
- Statistics: Num rows: 96800003 Data size: 83249958789 Basic stats: COMPLETE Column stats: NONE
- Map Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col2 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col0, _col5, _col6
- input vertices:
- 1 Map 13
- Statistics: Num rows: 106480005 Data size: 91574956652 Basic stats: COMPLETE Column stats: NONE
- Select Operator
- expressions: _col0 (type: int), _col5 (type: string), _col6 (type: string)
- outputColumnNames: _col2, _col7, _col8
- Statistics: Num rows: 106480005 Data size: 91574956652 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col2 (type: int)
- sort order: +
- Map-reduce partition columns: _col2 (type: int)
- Statistics: Num rows: 106480005 Data size: 91574956652 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col7 (type: string), _col8 (type: string)
- Reducer 2
- Local Work:
- Map Reduce Local Work
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
0 _col0 (type: int)
1 _col0 (type: int)
outputColumnNames: _col1, _col2, _col3
@@ -276,7 +244,7 @@ STAGE PLANS:
1 _col0 (type: int)
outputColumnNames: _col1, _col3, _col8, _col9, _col10
input vertices:
- 1 Map 7
+ 1 Map 12
Statistics: Num rows: 34846646 Data size: 3699254122 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col1 (type: int)
@@ -284,29 +252,73 @@ STAGE PLANS:
Map-reduce partition columns: _col1 (type: int)
Statistics: Num rows: 34846646 Data size: 3699254122 Basic stats: COMPLETE Column stats: NONE
value expressions: _col3 (type: decimal(7,2)), _col8 (type: string), _col9 (type: string), _col10 (type: string)
- Reducer 3
+ Reducer 2
Reduce Operator Tree:
Join Operator
condition map:
Inner Join 0 to 1
keys:
0 _col1 (type: int)
- 1 _col2 (type: int)
- outputColumnNames: _col3, _col8, _col9, _col10, _col18, _col19
- Statistics: Num rows: 117128008 Data size: 100732454500 Basic stats: COMPLETE Column stats: NONE
- Group By Operator
- aggregations: sum(_col3)
- keys: _col8 (type: string), _col9 (type: string), _col10 (type: string), _col18 (type: string), _col19 (type: string)
- mode: hash
- outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ 1 _col0 (type: int)
+ outputColumnNames: _col0, _col2, _col3, _col5, _col6
+ Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col3 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col3 (type: int)
+ Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col0 (type: int), _col2 (type: int), _col5 (type: string), _col6 (type: string)
+ Reducer 3
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col3 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col0, _col2, _col5, _col6
+ Statistics: Num rows: 96800003 Data size: 83249958789 Basic stats: COMPLETE Column stats: NONE
+ Reduce Output Operator
+ key expressions: _col0 (type: int)
+ sort order: +
+ Map-reduce partition columns: _col0 (type: int)
+ Statistics: Num rows: 96800003 Data size: 83249958789 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col2 (type: int), _col5 (type: string), _col6 (type: string)
+ Reducer 4
+ Local Work:
+ Map Reduce Local Work
+ Reduce Operator Tree:
+ Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col0 (type: int)
+ 1 _col1 (type: int)
+ outputColumnNames: _col2, _col5, _col6, _col12, _col17, _col18, _col19
+ Statistics: Num rows: 106480005 Data size: 91574956652 Basic stats: COMPLETE Column stats: NONE
+ Map Join Operator
+ condition map:
+ Inner Join 0 to 1
+ keys:
+ 0 _col2 (type: int)
+ 1 _col0 (type: int)
+ outputColumnNames: _col5, _col6, _col12, _col17, _col18, _col19
+ input vertices:
+ 1 Map 13
Statistics: Num rows: 117128008 Data size: 100732454500 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
- sort order: +++++
- Map-reduce partition columns: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
+ Group By Operator
+ aggregations: sum(_col12)
+ keys: _col5 (type: string), _col6 (type: string), _col17 (type: string), _col18 (type: string), _col19 (type: string)
+ mode: hash
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Statistics: Num rows: 117128008 Data size: 100732454500 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col5 (type: decimal(17,2))
- Reducer 4
+ Reduce Output Operator
+ key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
+ sort order: +++++
+ Map-reduce partition columns: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
+ Statistics: Num rows: 117128008 Data size: 100732454500 Basic stats: COMPLETE Column stats: NONE
+ value expressions: _col5 (type: decimal(17,2))
+ Reducer 5
Execution mode: vectorized
Reduce Operator Tree:
Group By Operator
@@ -316,7 +328,7 @@ STAGE PLANS:
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Statistics: Num rows: 58564004 Data size: 50366227250 Basic stats: COMPLETE Column stats: NONE
Select Operator
- expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col5 (type: decimal(17,2))
+ expressions: _col2 (type: string), _col3 (type: string), _col4 (type: string), _col5 (type: decimal(17,2))
outputColumnNames: _col0, _col1, _col2, _col4
Statistics: Num rows: 58564004 Data size: 50366227250 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
@@ -324,7 +336,7 @@ STAGE PLANS:
sort order: -
Statistics: Num rows: 58564004 Data size: 50366227250 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string)
- Reducer 5
+ Reducer 6
Execution mode: vectorized
Reduce Operator Tree:
Select Operator
@@ -338,22 +350,6 @@ STAGE PLANS:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Reducer 9
- Reduce Operator Tree:
- Join Operator
- condition map:
- Inner Join 0 to 1
- keys:
- 0 _col1 (type: int)
- 1 _col0 (type: int)
- outputColumnNames: _col0, _col2, _col3, _col5, _col6
- Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
- Reduce Output Operator
- key expressions: _col3 (type: int)
- sort order: +
- Map-reduce partition columns: _col3 (type: int)
- Statistics: Num rows: 88000001 Data size: 75681779077 Basic stats: COMPLETE Column stats: NONE
- value expressions: _col0 (type: int), _col2 (type: int), _col5 (type: string), _col6 (type: string)
Stage: Stage-0
Fetch Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/20c95c1c/ql/src/test/results/clientpositive/perf/spark/query98.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query98.q.out b/ql/src/test/results/clientpositive/perf/spark/query98.q.out
index c82607d..f1d7caa 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query98.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query98.q.out
@@ -94,7 +94,7 @@ STAGE PLANS:
Stage: Stage-1
Spark
Edges:
- Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 440), Map 7 (PARTITION-LEVEL SORT, 440)
+ Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 442), Map 7 (PARTITION-LEVEL SORT, 442)
Reducer 3 <- Reducer 2 (GROUP, 481)
Reducer 4 <- Reducer 3 (PARTITION-LEVEL SORT, 241)
Reducer 5 <- Reducer 4 (SORT, 1)
@@ -140,16 +140,16 @@ STAGE PLANS:
Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
Filter Operator
predicate: ((i_category) IN ('Jewelry', 'Sports', 'Books') and i_item_sk is not null) (type: boolean)
- Statistics: Num rows: 231000 Data size: 331780228 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: i_item_sk (type: int), i_item_id (type: string), i_item_desc (type: string), i_current_price (type: decimal(7,2)), i_class (type: string), i_category (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
- Statistics: Num rows: 231000 Data size: 331780228 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
- Statistics: Num rows: 231000 Data size: 331780228 Basic stats: COMPLETE Column stats: NONE
+ Statistics: Num rows: 462000 Data size: 663560457 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: decimal(7,2)), _col4 (type: string), _col5 (type: string)
Execution mode: vectorized
Reducer 2
http://git-wip-us.apache.org/repos/asf/hive/blob/20c95c1c/ql/src/test/results/clientpositive/perf/tez/query10.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/tez/query10.q.out b/ql/src/test/results/clientpositive/perf/tez/query10.q.out
index a8f097f..5b55d44 100644
--- a/ql/src/test/results/clientpositive/perf/tez/query10.q.out
+++ b/ql/src/test/results/clientpositive/perf/tez/query10.q.out
@@ -195,9 +195,9 @@ Stage-0
<-Map 8 [SIMPLE_EDGE] vectorized
SHUFFLE [RS_179]
PartitionCols:_col0
- Select Operator [SEL_178] (rows=20000000 width=1014)
+ Select Operator [SEL_178] (rows=40000000 width=1014)
Output:["_col0"]
- Filter Operator [FIL_177] (rows=20000000 width=1014)
+ Filter Operator [FIL_177] (rows=40000000 width=1014)
predicate:((ca_county) IN ('Walker County', 'Richland County', 'Gaines County', 'Douglas County', 'Dona Ana County') and ca_address_sk is not null)
TableScan [TS_3] (rows=40000000 width=1014)
default@customer_address,ca,Tbl:COMPLETE,Col:NONE,Output:["ca_address_sk","ca_county"]
http://git-wip-us.apache.org/repos/asf/hive/blob/20c95c1c/ql/src/test/results/clientpositive/perf/tez/query12.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/tez/query12.q.out b/ql/src/test/results/clientpositive/perf/tez/query12.q.out
index d3d8df0..151bebf 100644
--- a/ql/src/test/results/clientpositive/perf/tez/query12.q.out
+++ b/ql/src/test/results/clientpositive/perf/tez/query12.q.out
@@ -107,9 +107,9 @@ Stage-0
<-Map 9 [SIMPLE_EDGE] vectorized
SHUFFLE [RS_69]
PartitionCols:_col0
- Select Operator [SEL_68] (rows=231000 width=1436)
+ Select Operator [SEL_68] (rows=462000 width=1436)
Output:["_col0","_col1","_col2","_col3","_col4","_col5"]
- Filter Operator [FIL_67] (rows=231000 width=1436)
+ Filter Operator [FIL_67] (rows=462000 width=1436)
predicate:((i_category) IN ('Jewelry', 'Sports', 'Books') and i_item_sk is not null)
TableScan [TS_6] (rows=462000 width=1436)
default@item,item,Tbl:COMPLETE,Col:NONE,Output:["i_item_sk","i_item_id","i_item_desc","i_current_price","i_class","i_category"]
@@ -144,7 +144,7 @@ Stage-0
SHUFFLE [RS_72]
Group By Operator [GBY_71] (rows=1 width=12)
Output:["_col0","_col1","_col2"],aggregations:["min(_col0)","max(_col0)","bloom_filter(_col0, expectedEntries=1000000)"]
- Select Operator [SEL_70] (rows=231000 width=1436)
+ Select Operator [SEL_70] (rows=462000 width=1436)
Output:["_col0"]
Please refer to the previous Select Operator [SEL_68]
<-Reducer 8 [BROADCAST_EDGE] vectorized