Posted to commits@hive.apache.org by mm...@apache.org on 2018/08/08 07:37:50 UTC
[27/51] [partial] hive git commit: HIVE-20315: Vectorization: Fix more NULL / Wrong Results issues and avoid unnecessary casts/conversions (Matt McCline, reviewed by Teddy Choi)
http://git-wip-us.apache.org/repos/asf/hive/blob/470ba3e2/ql/src/test/results/clientpositive/perf/spark/query49.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/perf/spark/query49.q.out b/ql/src/test/results/clientpositive/perf/spark/query49.q.out
index 16cc603..e10a925 100644
--- a/ql/src/test/results/clientpositive/perf/spark/query49.q.out
+++ b/ql/src/test/results/clientpositive/perf/spark/query49.q.out
@@ -1,4 +1,4 @@
-PREHOOK: query: explain
+PREHOOK: query: explain vectorization expression
select
'web' as channel
,web.item
@@ -124,7 +124,7 @@ select
order by 1,4,5
limit 100
PREHOOK: type: QUERY
-POSTHOOK: query: explain
+POSTHOOK: query: explain vectorization expression
select
'web' as channel
,web.item
@@ -250,6 +250,10 @@ select
order by 1,4,5
limit 100
POSTHOOK: type: QUERY
+PLAN VECTORIZATION:
+ enabled: true
+ enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
+
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -284,140 +288,306 @@ STAGE PLANS:
alias: ws
filterExpr: ((ws_net_profit > 1) and (ws_net_paid > 0) and (ws_quantity > 0) and ws_order_number is not null and ws_item_sk is not null and ws_sold_date_sk is not null) (type: boolean)
Statistics: Num rows: 144002668 Data size: 19580198212 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterDecimal64ColGreaterDecimal64Scalar(col 33:decimal(7,2)/DECIMAL_64, val 100), FilterDecimal64ColGreaterDecimal64Scalar(col 29:decimal(7,2)/DECIMAL_64, val 0), FilterLongColGreaterLongScalar(col 18:int, val 0), SelectColumnIsNotNull(col 17:int), SelectColumnIsNotNull(col 3:int), SelectColumnIsNotNull(col 0:int))
predicate: ((ws_net_paid > 0) and (ws_net_profit > 1) and (ws_quantity > 0) and ws_item_sk is not null and ws_order_number is not null and ws_sold_date_sk is not null) (type: boolean)
Statistics: Num rows: 5333432 Data size: 725192506 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: ws_sold_date_sk (type: int), ws_item_sk (type: int), ws_order_number (type: int), ws_quantity (type: int), ws_net_paid (type: decimal(7,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0, 3, 17, 18, 29]
Statistics: Num rows: 5333432 Data size: 725192506 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkLongOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 5333432 Data size: 725192506 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(7,2))
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Map 10
Map Operator Tree:
TableScan
alias: date_dim
filterExpr: ((d_year = 2000) and (d_moy = 12) and d_date_sk is not null) (type: boolean)
Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterLongColEqualLongScalar(col 6:int, val 2000), FilterLongColEqualLongScalar(col 8:int, val 12), SelectColumnIsNotNull(col 0:int))
predicate: ((d_moy = 12) and (d_year = 2000) and d_date_sk is not null) (type: boolean)
Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: d_date_sk (type: int)
outputColumnNames: _col0
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0]
Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkLongOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Map 11
Map Operator Tree:
TableScan
alias: wr
filterExpr: ((wr_return_amt > 10000) and wr_order_number is not null and wr_item_sk is not null) (type: boolean)
Statistics: Num rows: 14398467 Data size: 1325194184 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterDecimal64ColGreaterDecimal64Scalar(col 15:decimal(7,2)/DECIMAL_64, val 1000000), SelectColumnIsNotNull(col 13:int), SelectColumnIsNotNull(col 2:int))
predicate: ((wr_return_amt > 10000) and wr_item_sk is not null and wr_order_number is not null) (type: boolean)
Statistics: Num rows: 4799489 Data size: 441731394 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: wr_item_sk (type: int), wr_order_number (type: int), wr_return_quantity (type: int), wr_return_amt (type: decimal(7,2))
outputColumnNames: _col0, _col1, _col2, _col3
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 13, 14, 15]
Statistics: Num rows: 4799489 Data size: 441731394 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int), _col1 (type: int)
sort order: ++
Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 4799489 Data size: 441731394 Basic stats: COMPLETE Column stats: NONE
value expressions: _col2 (type: int), _col3 (type: decimal(7,2))
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Map 12
Map Operator Tree:
TableScan
alias: cs
filterExpr: ((cs_net_profit > 1) and (cs_net_paid > 0) and (cs_quantity > 0) and cs_order_number is not null and cs_item_sk is not null and cs_sold_date_sk is not null) (type: boolean)
Statistics: Num rows: 287989836 Data size: 38999608952 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterDecimal64ColGreaterDecimal64Scalar(col 33:decimal(7,2)/DECIMAL_64, val 100), FilterDecimal64ColGreaterDecimal64Scalar(col 29:decimal(7,2)/DECIMAL_64, val 0), FilterLongColGreaterLongScalar(col 18:int, val 0), SelectColumnIsNotNull(col 17:int), SelectColumnIsNotNull(col 15:int), SelectColumnIsNotNull(col 0:int))
predicate: ((cs_net_paid > 0) and (cs_net_profit > 1) and (cs_quantity > 0) and cs_item_sk is not null and cs_order_number is not null and cs_sold_date_sk is not null) (type: boolean)
Statistics: Num rows: 10666290 Data size: 1444429931 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: cs_sold_date_sk (type: int), cs_item_sk (type: int), cs_order_number (type: int), cs_quantity (type: int), cs_net_paid (type: decimal(7,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0, 15, 17, 18, 29]
Statistics: Num rows: 10666290 Data size: 1444429931 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkLongOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 10666290 Data size: 1444429931 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(7,2))
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Map 19
Map Operator Tree:
TableScan
alias: cr
filterExpr: ((cr_return_amount > 10000) and cr_order_number is not null and cr_item_sk is not null) (type: boolean)
Statistics: Num rows: 28798881 Data size: 3057234680 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterDecimal64ColGreaterDecimal64Scalar(col 18:decimal(7,2)/DECIMAL_64, val 1000000), SelectColumnIsNotNull(col 16:int), SelectColumnIsNotNull(col 2:int))
predicate: ((cr_return_amount > 10000) and cr_item_sk is not null and cr_order_number is not null) (type: boolean)
Statistics: Num rows: 9599627 Data size: 1019078226 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: cr_item_sk (type: int), cr_order_number (type: int), cr_return_quantity (type: int), cr_return_amount (type: decimal(7,2))
outputColumnNames: _col0, _col1, _col2, _col3
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 16, 17, 18]
Statistics: Num rows: 9599627 Data size: 1019078226 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int), _col1 (type: int)
sort order: ++
Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 9599627 Data size: 1019078226 Basic stats: COMPLETE Column stats: NONE
value expressions: _col2 (type: int), _col3 (type: decimal(7,2))
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Map 20
Map Operator Tree:
TableScan
alias: sts
filterExpr: ((ss_net_profit > 1) and (ss_net_paid > 0) and (ss_quantity > 0) and ss_ticket_number is not null and ss_item_sk is not null and ss_sold_date_sk is not null) (type: boolean)
Statistics: Num rows: 575995635 Data size: 50814502088 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterDecimal64ColGreaterDecimal64Scalar(col 22:decimal(7,2)/DECIMAL_64, val 100), FilterDecimal64ColGreaterDecimal64Scalar(col 20:decimal(7,2)/DECIMAL_64, val 0), FilterLongColGreaterLongScalar(col 10:int, val 0), SelectColumnIsNotNull(col 9:int), SelectColumnIsNotNull(col 2:int), SelectColumnIsNotNull(col 0:int))
predicate: ((ss_net_paid > 0) and (ss_net_profit > 1) and (ss_quantity > 0) and ss_item_sk is not null and ss_sold_date_sk is not null and ss_ticket_number is not null) (type: boolean)
Statistics: Num rows: 21333171 Data size: 1882018537 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: ss_sold_date_sk (type: int), ss_item_sk (type: int), ss_ticket_number (type: int), ss_quantity (type: int), ss_net_paid (type: decimal(7,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0, 2, 9, 10, 20]
Statistics: Num rows: 21333171 Data size: 1882018537 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkLongOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 21333171 Data size: 1882018537 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(7,2))
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Map 27
Map Operator Tree:
TableScan
alias: sr
filterExpr: ((sr_return_amt > 10000) and sr_ticket_number is not null and sr_item_sk is not null) (type: boolean)
Statistics: Num rows: 57591150 Data size: 4462194832 Basic stats: COMPLETE Column stats: NONE
+ TableScan Vectorization:
+ native: true
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprAndExpr(children: FilterDecimal64ColGreaterDecimal64Scalar(col 11:decimal(7,2)/DECIMAL_64, val 1000000), SelectColumnIsNotNull(col 9:int), SelectColumnIsNotNull(col 2:int))
predicate: ((sr_return_amt > 10000) and sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
Statistics: Num rows: 19197050 Data size: 1487398277 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: sr_item_sk (type: int), sr_ticket_number (type: int), sr_return_quantity (type: int), sr_return_amt (type: decimal(7,2))
outputColumnNames: _col0, _col1, _col2, _col3
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 9, 10, 11]
Statistics: Num rows: 19197050 Data size: 1487398277 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: int), _col1 (type: int)
sort order: ++
Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 19197050 Data size: 1487398277 Basic stats: COMPLETE Column stats: NONE
value expressions: _col2 (type: int), _col3 (type: decimal(7,2))
Execution mode: vectorized
+ Map Vectorization:
+ enabled: true
+ enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
+ inputFormatFeatureSupport: [DECIMAL_64]
+ featureSupportInUse: [DECIMAL_64]
+ inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reducer 13
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ notVectorizedReason: Tagging not supported
+ vectorized: false
Reduce Operator Tree:
Join Operator
condition map:
@@ -434,6 +604,11 @@ STAGE PLANS:
Statistics: Num rows: 11732919 Data size: 1588872958 Basic stats: COMPLETE Column stats: NONE
value expressions: _col3 (type: int), _col4 (type: decimal(7,2))
Reducer 14
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ notVectorizedReason: Tagging not supported
+ vectorized: false
Reduce Operator Tree:
Join Operator
condition map:
@@ -461,9 +636,23 @@ STAGE PLANS:
value expressions: _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 15
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), sum(VALUE._col1), sum(VALUE._col2), sum(VALUE._col3)
+ Group By Vectorization:
+ aggregators: VectorUDAFSumLong(col 1:bigint) -> bigint, VectorUDAFSumLong(col 2:bigint) -> bigint, VectorUDAFSumDecimal(col 3:decimal(22,2)) -> decimal(22,2), VectorUDAFSumDecimal(col 4:decimal(22,2)) -> decimal(22,2)
+ className: VectorGroupByOperator
+ groupByMode: MERGEPARTIAL
+ keyExpressions: col 0:int
+ native: false
+ vectorProcessingMode: MERGE_PARTIAL
+ projectedOutputColumnNums: [0, 1, 2, 3]
keys: KEY._col0 (type: int)
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -472,14 +661,29 @@ STAGE PLANS:
key expressions: 0 (type: int), (CAST( _col1 AS decimal(15,4)) / CAST( _col2 AS decimal(15,4))) (type: decimal(35,20))
sort order: ++
Map-reduce partition columns: 0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ keyExpressions: ConstantVectorExpression(val 0) -> 5:int, DecimalColDivideDecimalColumn(col 6:decimal(15,4), col 7:decimal(15,4))(children: CastLongToDecimal(col 1:bigint) -> 6:decimal(15,4), CastLongToDecimal(col 2:bigint) -> 7:decimal(15,4)) -> 8:decimal(35,20)
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 16
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: bigint), VALUE._col2 (type: bigint), VALUE._col3 (type: decimal(22,2)), VALUE._col4 (type: decimal(22,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 3, 4, 5, 6]
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
@@ -501,23 +705,50 @@ STAGE PLANS:
window function: GenericUDAFRankEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
+ PTF Vectorization:
+ className: VectorPTFOperator
+ evaluatorClasses: [VectorPTFEvaluatorRank]
+ functionInputExpressions: [DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastLongToDecimal(col 3:bigint) -> 9:decimal(15,4), CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4)) -> 12:decimal(35,20)]
+ functionNames: [rank]
+ native: true
+ orderExpressions: [DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastLongToDecimal(col 3:bigint) -> 9:decimal(15,4), CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4)) -> 11:decimal(35,20)]
+ partitionExpressions: [ConstantVectorExpression(val 0) -> 8:int]
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: rank_window_0 (type: int), _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
outputColumnNames: rank_window_0, _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [7, 2, 3, 4, 5, 6]
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: 0 (type: int), (CAST( _col3 AS decimal(15,4)) / CAST( _col4 AS decimal(15,4))) (type: decimal(35,20))
sort order: ++
Map-reduce partition columns: 0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ keyExpressions: ConstantVectorExpression(val 0) -> 13:int, DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastDecimalToDecimal(col 5:decimal(22,2)) -> 9:decimal(15,4), CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4)) -> 14:decimal(35,20)
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
value expressions: rank_window_0 (type: int), _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 17
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: int), VALUE._col2 (type: bigint), VALUE._col3 (type: bigint), VALUE._col4 (type: decimal(22,2)), VALUE._col5 (type: decimal(22,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 3, 4, 5, 6, 7]
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
@@ -538,15 +769,39 @@ STAGE PLANS:
window function: GenericUDAFRankEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
+ PTF Vectorization:
+ className: VectorPTFOperator
+ evaluatorClasses: [VectorPTFEvaluatorRank]
+ functionInputExpressions: [DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4), CastDecimalToDecimal(col 7:decimal(22,2)) -> 11:decimal(15,4)) -> 13:decimal(35,20)]
+ functionNames: [rank]
+ native: true
+ orderExpressions: [DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4), CastDecimalToDecimal(col 7:decimal(22,2)) -> 11:decimal(15,4)) -> 12:decimal(35,20)]
+ partitionExpressions: [ConstantVectorExpression(val 0) -> 9:int]
Statistics: Num rows: 6453105 Data size: 873880077 Basic stats: COMPLETE Column stats: NONE
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprOrExpr(children: FilterLongColLessEqualLongScalar(col 2:int, val 10), FilterLongColLessEqualLongScalar(col 8:int, val 10))
predicate: ((_col0 <= 10) or (rank_window_1 <= 10)) (type: boolean)
Statistics: Num rows: 4302070 Data size: 582586718 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: 'catalog' (type: string), _col1 (type: int), (CAST( _col2 AS decimal(15,4)) / CAST( _col3 AS decimal(15,4))) (type: decimal(35,20)), _col0 (type: int), rank_window_1 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [14, 3, 15, 2, 8]
+ selectExpressions: ConstantVectorExpression(val catalog) -> 14:string, DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4), CastLongToDecimal(col 5:bigint) -> 11:decimal(15,4)) -> 15:decimal(35,20)
Statistics: Num rows: 4302070 Data size: 582586718 Basic stats: COMPLETE Column stats: NONE
Group By Operator
+ Group By Vectorization:
+ className: VectorGroupByOperator
+ groupByMode: HASH
+ keyExpressions: col 14:string, col 2:int, col 8:int, col 3:int, col 15:decimal(35,20)
+ native: false
+ vectorProcessingMode: HASH
+ projectedOutputColumnNums: []
keys: _col0 (type: string), _col3 (type: int), _col4 (type: int), _col1 (type: int), _col2 (type: decimal(35,20))
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -555,8 +810,17 @@ STAGE PLANS:
key expressions: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
sort order: +++++
Map-reduce partition columns: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 6453220 Data size: 875080950 Basic stats: COMPLETE Column stats: NONE
Reducer 2
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ notVectorizedReason: Tagging not supported
+ vectorized: false
Reduce Operator Tree:
Join Operator
condition map:
@@ -573,6 +837,11 @@ STAGE PLANS:
Statistics: Num rows: 5866775 Data size: 797711773 Basic stats: COMPLETE Column stats: NONE
value expressions: _col3 (type: int), _col4 (type: decimal(7,2))
Reducer 21
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ notVectorizedReason: Tagging not supported
+ vectorized: false
Reduce Operator Tree:
Join Operator
condition map:
@@ -589,6 +858,11 @@ STAGE PLANS:
Statistics: Num rows: 23466488 Data size: 2070220435 Basic stats: COMPLETE Column stats: NONE
value expressions: _col3 (type: int), _col4 (type: decimal(7,2))
Reducer 22
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ notVectorizedReason: Tagging not supported
+ vectorized: false
Reduce Operator Tree:
Join Operator
condition map:
@@ -616,9 +890,23 @@ STAGE PLANS:
value expressions: _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 23
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), sum(VALUE._col1), sum(VALUE._col2), sum(VALUE._col3)
+ Group By Vectorization:
+ aggregators: VectorUDAFSumLong(col 1:bigint) -> bigint, VectorUDAFSumLong(col 2:bigint) -> bigint, VectorUDAFSumDecimal(col 3:decimal(22,2)) -> decimal(22,2), VectorUDAFSumDecimal(col 4:decimal(22,2)) -> decimal(22,2)
+ className: VectorGroupByOperator
+ groupByMode: MERGEPARTIAL
+ keyExpressions: col 0:int
+ native: false
+ vectorProcessingMode: MERGE_PARTIAL
+ projectedOutputColumnNums: [0, 1, 2, 3]
keys: KEY._col0 (type: int)
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -627,14 +915,29 @@ STAGE PLANS:
key expressions: 0 (type: int), (CAST( _col1 AS decimal(15,4)) / CAST( _col2 AS decimal(15,4))) (type: decimal(35,20))
sort order: ++
Map-reduce partition columns: 0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ keyExpressions: ConstantVectorExpression(val 0) -> 5:int, DecimalColDivideDecimalColumn(col 6:decimal(15,4), col 7:decimal(15,4))(children: CastLongToDecimal(col 1:bigint) -> 6:decimal(15,4), CastLongToDecimal(col 2:bigint) -> 7:decimal(15,4)) -> 8:decimal(35,20)
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 24
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: bigint), VALUE._col2 (type: bigint), VALUE._col3 (type: decimal(22,2)), VALUE._col4 (type: decimal(22,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 3, 4, 5, 6]
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
@@ -656,23 +959,50 @@ STAGE PLANS:
window function: GenericUDAFRankEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
+ PTF Vectorization:
+ className: VectorPTFOperator
+ evaluatorClasses: [VectorPTFEvaluatorRank]
+ functionInputExpressions: [DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastLongToDecimal(col 3:bigint) -> 9:decimal(15,4), CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4)) -> 12:decimal(35,20)]
+ functionNames: [rank]
+ native: true
+ orderExpressions: [DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastLongToDecimal(col 3:bigint) -> 9:decimal(15,4), CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4)) -> 11:decimal(35,20)]
+ partitionExpressions: [ConstantVectorExpression(val 0) -> 8:int]
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: rank_window_0 (type: int), _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
outputColumnNames: rank_window_0, _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [7, 2, 3, 4, 5, 6]
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: 0 (type: int), (CAST( _col3 AS decimal(15,4)) / CAST( _col4 AS decimal(15,4))) (type: decimal(35,20))
sort order: ++
Map-reduce partition columns: 0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ keyExpressions: ConstantVectorExpression(val 0) -> 13:int, DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastDecimalToDecimal(col 5:decimal(22,2)) -> 9:decimal(15,4), CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4)) -> 14:decimal(35,20)
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
value expressions: rank_window_0 (type: int), _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 25
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: int), VALUE._col2 (type: bigint), VALUE._col3 (type: bigint), VALUE._col4 (type: decimal(22,2)), VALUE._col5 (type: decimal(22,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 3, 4, 5, 6, 7]
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
@@ -693,15 +1023,39 @@ STAGE PLANS:
window function: GenericUDAFRankEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
+ PTF Vectorization:
+ className: VectorPTFOperator
+ evaluatorClasses: [VectorPTFEvaluatorRank]
+ functionInputExpressions: [DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4), CastDecimalToDecimal(col 7:decimal(22,2)) -> 11:decimal(15,4)) -> 13:decimal(35,20)]
+ functionNames: [rank]
+ native: true
+ orderExpressions: [DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4), CastDecimalToDecimal(col 7:decimal(22,2)) -> 11:decimal(15,4)) -> 12:decimal(35,20)]
+ partitionExpressions: [ConstantVectorExpression(val 0) -> 9:int]
Statistics: Num rows: 12906568 Data size: 1138621219 Basic stats: COMPLETE Column stats: NONE
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprOrExpr(children: FilterLongColLessEqualLongScalar(col 2:int, val 10), FilterLongColLessEqualLongScalar(col 8:int, val 10))
predicate: ((_col0 <= 10) or (rank_window_1 <= 10)) (type: boolean)
Statistics: Num rows: 8604378 Data size: 759080753 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: 'store' (type: string), _col1 (type: int), (CAST( _col2 AS decimal(15,4)) / CAST( _col3 AS decimal(15,4))) (type: decimal(35,20)), _col0 (type: int), rank_window_1 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [14, 3, 15, 2, 8]
+ selectExpressions: ConstantVectorExpression(val store) -> 14:string, DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4), CastLongToDecimal(col 5:bigint) -> 11:decimal(15,4)) -> 15:decimal(35,20)
Statistics: Num rows: 8604378 Data size: 759080753 Basic stats: COMPLETE Column stats: NONE
Group By Operator
+ Group By Vectorization:
+ className: VectorGroupByOperator
+ groupByMode: HASH
+ keyExpressions: col 14:string, col 2:int, col 8:int, col 3:int, col 15:decimal(35,20)
+ native: false
+ vectorProcessingMode: HASH
+ projectedOutputColumnNums: []
keys: _col0 (type: string), _col3 (type: int), _col4 (type: int), _col1 (type: int), _col2 (type: decimal(35,20))
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -710,9 +1064,18 @@ STAGE PLANS:
key expressions: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
sort order: +++++
Map-reduce partition columns: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 11830988 Data size: 1196621228 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
Reducer 3
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ notVectorizedReason: Tagging not supported
+ vectorized: false
Reduce Operator Tree:
Join Operator
condition map:
@@ -740,9 +1103,23 @@ STAGE PLANS:
value expressions: _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 4
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0), sum(VALUE._col1), sum(VALUE._col2), sum(VALUE._col3)
+ Group By Vectorization:
+ aggregators: VectorUDAFSumLong(col 1:bigint) -> bigint, VectorUDAFSumLong(col 2:bigint) -> bigint, VectorUDAFSumDecimal(col 3:decimal(22,2)) -> decimal(22,2), VectorUDAFSumDecimal(col 4:decimal(22,2)) -> decimal(22,2)
+ className: VectorGroupByOperator
+ groupByMode: MERGEPARTIAL
+ keyExpressions: col 0:int
+ native: false
+ vectorProcessingMode: MERGE_PARTIAL
+ projectedOutputColumnNums: [0, 1, 2, 3]
keys: KEY._col0 (type: int)
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -751,14 +1128,29 @@ STAGE PLANS:
key expressions: 0 (type: int), (CAST( _col1 AS decimal(15,4)) / CAST( _col2 AS decimal(15,4))) (type: decimal(35,20))
sort order: ++
Map-reduce partition columns: 0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ keyExpressions: ConstantVectorExpression(val 0) -> 5:int, DecimalColDivideDecimalColumn(col 6:decimal(15,4), col 7:decimal(15,4))(children: CastLongToDecimal(col 1:bigint) -> 6:decimal(15,4), CastLongToDecimal(col 2:bigint) -> 7:decimal(15,4)) -> 8:decimal(35,20)
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
value expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 5
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: true
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: bigint), VALUE._col2 (type: bigint), VALUE._col3 (type: decimal(22,2)), VALUE._col4 (type: decimal(22,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 3, 4, 5, 6]
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
@@ -780,23 +1172,50 @@ STAGE PLANS:
window function: GenericUDAFRankEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
+ PTF Vectorization:
+ className: VectorPTFOperator
+ evaluatorClasses: [VectorPTFEvaluatorRank]
+ functionInputExpressions: [DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastLongToDecimal(col 3:bigint) -> 9:decimal(15,4), CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4)) -> 12:decimal(35,20)]
+ functionNames: [rank]
+ native: true
+ orderExpressions: [DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastLongToDecimal(col 3:bigint) -> 9:decimal(15,4), CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4)) -> 11:decimal(35,20)]
+ partitionExpressions: [ConstantVectorExpression(val 0) -> 8:int]
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: rank_window_0 (type: int), _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
outputColumnNames: rank_window_0, _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [7, 2, 3, 4, 5, 6]
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: 0 (type: int), (CAST( _col3 AS decimal(15,4)) / CAST( _col4 AS decimal(15,4))) (type: decimal(35,20))
sort order: ++
Map-reduce partition columns: 0 (type: int)
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ keyExpressions: ConstantVectorExpression(val 0) -> 13:int, DecimalColDivideDecimalColumn(col 9:decimal(15,4), col 10:decimal(15,4))(children: CastDecimalToDecimal(col 5:decimal(22,2)) -> 9:decimal(15,4), CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4)) -> 14:decimal(35,20)
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
value expressions: rank_window_0 (type: int), _col0 (type: int), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: decimal(22,2)), _col4 (type: decimal(22,2))
Reducer 6
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: VALUE._col0 (type: int), VALUE._col1 (type: int), VALUE._col2 (type: bigint), VALUE._col3 (type: bigint), VALUE._col4 (type: decimal(22,2)), VALUE._col5 (type: decimal(22,2))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [2, 3, 4, 5, 6, 7]
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
PTF Operator
Function definitions:
@@ -817,15 +1236,39 @@ STAGE PLANS:
window function: GenericUDAFRankEvaluator
window frame: ROWS PRECEDING(MAX)~FOLLOWING(MAX)
isPivotResult: true
+ PTF Vectorization:
+ className: VectorPTFOperator
+ evaluatorClasses: [VectorPTFEvaluatorRank]
+ functionInputExpressions: [DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4), CastDecimalToDecimal(col 7:decimal(22,2)) -> 11:decimal(15,4)) -> 13:decimal(35,20)]
+ functionNames: [rank]
+ native: true
+ orderExpressions: [DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastDecimalToDecimal(col 6:decimal(22,2)) -> 10:decimal(15,4), CastDecimalToDecimal(col 7:decimal(22,2)) -> 11:decimal(15,4)) -> 12:decimal(35,20)]
+ partitionExpressions: [ConstantVectorExpression(val 0) -> 9:int]
Statistics: Num rows: 3226726 Data size: 438741484 Basic stats: COMPLETE Column stats: NONE
Filter Operator
+ Filter Vectorization:
+ className: VectorFilterOperator
+ native: true
+ predicateExpression: FilterExprOrExpr(children: FilterLongColLessEqualLongScalar(col 2:int, val 10), FilterLongColLessEqualLongScalar(col 8:int, val 10))
predicate: ((_col0 <= 10) or (rank_window_1 <= 10)) (type: boolean)
Statistics: Num rows: 2151150 Data size: 292494232 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: 'web' (type: string), _col1 (type: int), (CAST( _col2 AS decimal(15,4)) / CAST( _col3 AS decimal(15,4))) (type: decimal(35,20)), _col0 (type: int), rank_window_1 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [14, 3, 15, 2, 8]
+ selectExpressions: ConstantVectorExpression(val web) -> 14:string, DecimalColDivideDecimalColumn(col 10:decimal(15,4), col 11:decimal(15,4))(children: CastLongToDecimal(col 4:bigint) -> 10:decimal(15,4), CastLongToDecimal(col 5:bigint) -> 11:decimal(15,4)) -> 15:decimal(35,20)
Statistics: Num rows: 2151150 Data size: 292494232 Basic stats: COMPLETE Column stats: NONE
Group By Operator
+ Group By Vectorization:
+ className: VectorGroupByOperator
+ groupByMode: HASH
+ keyExpressions: col 14:string, col 2:int, col 8:int, col 3:int, col 15:decimal(35,20)
+ native: false
+ vectorProcessingMode: HASH
+ projectedOutputColumnNums: []
keys: _col0 (type: string), _col3 (type: int), _col4 (type: int), _col1 (type: int), _col2 (type: decimal(35,20))
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -834,11 +1277,28 @@ STAGE PLANS:
key expressions: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
sort order: +++++
Map-reduce partition columns: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 6453220 Data size: 875080950 Basic stats: COMPLETE Column stats: NONE
Reducer 7
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Group By Operator
+ Group By Vectorization:
+ className: VectorGroupByOperator
+ groupByMode: MERGEPARTIAL
+ keyExpressions: col 0:string, col 1:int, col 2:int, col 3:int, col 4:decimal(35,20)
+ native: false
+ vectorProcessingMode: MERGE_PARTIAL
+ projectedOutputColumnNums: []
keys: KEY._col0 (type: string), KEY._col1 (type: int), KEY._col2 (type: int), KEY._col3 (type: int), KEY._col4 (type: decimal(35,20))
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -846,8 +1306,19 @@ STAGE PLANS:
Select Operator
expressions: _col0 (type: string), _col3 (type: int), _col4 (type: decimal(35,20)), _col1 (type: int), _col2 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0, 3, 4, 1, 2]
Statistics: Num rows: 3226610 Data size: 437540475 Basic stats: COMPLETE Column stats: NONE
Group By Operator
+ Group By Vectorization:
+ className: VectorGroupByOperator
+ groupByMode: HASH
+ keyExpressions: col 0:string, col 1:int, col 2:int, col 3:int, col 4:decimal(35,20)
+ native: false
+ vectorProcessingMode: HASH
+ projectedOutputColumnNums: []
keys: _col0 (type: string), _col3 (type: int), _col4 (type: int), _col1 (type: int), _col2 (type: decimal(35,20))
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -856,12 +1327,29 @@ STAGE PLANS:
key expressions: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
sort order: +++++
Map-reduce partition columns: _col0 (type: string), _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: decimal(35,20))
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkMultiKeyOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 11830988 Data size: 1196621228 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
Reducer 8
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Group By Operator
+ Group By Vectorization:
+ className: VectorGroupByOperator
+ groupByMode: MERGEPARTIAL
+ keyExpressions: col 0:string, col 1:int, col 2:int, col 3:int, col 4:decimal(35,20)
+ native: false
+ vectorProcessingMode: MERGE_PARTIAL
+ projectedOutputColumnNums: []
keys: KEY._col0 (type: string), KEY._col1 (type: int), KEY._col2 (type: int), KEY._col3 (type: int), KEY._col4 (type: decimal(35,20))
mode: mergepartial
outputColumnNames: _col0, _col1, _col2, _col3, _col4
@@ -869,25 +1357,49 @@ STAGE PLANS:
Select Operator
expressions: _col0 (type: string), _col3 (type: int), _col4 (type: decimal(35,20)), _col1 (type: int), _col2 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0, 3, 4, 1, 2]
Statistics: Num rows: 5915494 Data size: 598310614 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string), _col3 (type: int), _col4 (type: int)
sort order: +++
+ Reduce Sink Vectorization:
+ className: VectorReduceSinkObjectHashOperator
+ native: true
+ nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 5915494 Data size: 598310614 Basic stats: COMPLETE Column stats: NONE
TopN Hash Memory Usage: 0.1
value expressions: _col1 (type: int), _col2 (type: decimal(35,20))
Reducer 9
Execution mode: vectorized
+ Reduce Vectorization:
+ enabled: true
+ enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true
+ allNative: false
+ usesVectorUDFAdaptor: false
+ vectorized: true
Reduce Operator Tree:
Select Operator
expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: int), VALUE._col1 (type: decimal(35,20)), KEY.reducesinkkey1 (type: int), KEY.reducesinkkey2 (type: int)
outputColumnNames: _col0, _col1, _col2, _col3, _col4
+ Select Vectorization:
+ className: VectorSelectOperator
+ native: true
+ projectedOutputColumnNums: [0, 3, 4, 1, 2]
Statistics: Num rows: 5915494 Data size: 598310614 Basic stats: COMPLETE Column stats: NONE
Limit
Number of rows: 100
+ Limit Vectorization:
+ className: VectorLimitOperator
+ native: true
Statistics: Num rows: 100 Data size: 10100 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
+ File Sink Vectorization:
+ className: VectorFileSinkOperator
+ native: false
Statistics: Num rows: 100 Data size: 10100 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat