Posted to commits@hive.apache.org by mm...@apache.org on 2016/10/17 20:36:03 UTC
[32/51] [partial] hive git commit: Revert "Revert "Revert
"HIVE-11394: Enhance EXPLAIN display for vectorization (Matt McCline,
reviewed by Gopal Vijayaraghavan)"""
http://git-wip-us.apache.org/repos/asf/hive/blob/ad6ce078/ql/src/test/results/clientpositive/llap/vector_coalesce.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/llap/vector_coalesce.q.out b/ql/src/test/results/clientpositive/llap/vector_coalesce.q.out
index 2789664..c7897f7 100644
--- a/ql/src/test/results/clientpositive/llap/vector_coalesce.q.out
+++ b/ql/src/test/results/clientpositive/llap/vector_coalesce.q.out
@@ -1,6 +1,6 @@
PREHOOK: query: -- SORT_QUERY_RESULTS
-EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c
+EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c
FROM alltypesorc
WHERE (cdouble IS NULL)
ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
@@ -8,16 +8,12 @@ LIMIT 10
PREHOOK: type: QUERY
POSTHOOK: query: -- SORT_QUERY_RESULTS
-EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c
+EXPLAIN SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c
FROM alltypesorc
WHERE (cdouble IS NULL)
ORDER BY cdouble, cstring1, cint, cfloat, csmallint, c
LIMIT 10
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -25,62 +21,53 @@ STAGE DEPENDENCIES:
STAGE PLANS:
Stage: Stage-1
Tez
+#### A masked pattern was here ####
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
- Filter Vectorization:
- className: VectorFilterOperator
- native: true
- predicateExpression: SelectColumnIsNull(col 5) -> boolean
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [6, 2, 4, 1, 16]
- selectExpressions: VectorCoalesce(columns [12, 6, 13, 14, 15])(children: ConstantVectorExpression(val null) -> 12:string, col 6, CastLongToString(col 2) -> 13:String, VectorUDFAdaptor(null(cfloat)) -> 14:String, CastLongToString(col 1) -> 15:String) -> 16:string
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: No TopN IS false, Uniform Hash IS false
+ TableScan
+ alias: alltypesorc
+ Statistics: Num rows: 12288 Data size: 1045942 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: cdouble is null (type: boolean)
+ Statistics: Num rows: 3114 Data size: 265164 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: cstring1 (type: string), cint (type: int), cfloat (type: float), csmallint (type: smallint), COALESCE(null,cstring1,cint,cfloat,csmallint) (type: string)
+ outputColumnNames: _col1, _col2, _col3, _col4, _col5
+ Statistics: Num rows: 3114 Data size: 819540 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col1 (type: string), _col2 (type: int), _col3 (type: float), _col4 (type: smallint), _col5 (type: string)
+ sort order: +++++
+ Statistics: Num rows: 3114 Data size: 819540 Basic stats: COMPLETE Column stats: COMPLETE
+ TopN Hash Memory Usage: 0.1
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: true
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [5, 0, 1, 2, 3, 4]
- selectExpressions: ConstantVectorExpression(val null) -> 5:double
- Limit Vectorization:
- className: VectorLimitOperator
- native: true
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
+ Select Operator
+ expressions: null (type: double), KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: int), KEY.reducesinkkey2 (type: float), KEY.reducesinkkey3 (type: smallint), KEY.reducesinkkey4 (type: string)
+ outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
+ Statistics: Num rows: 3114 Data size: 246572 Basic stats: COMPLETE Column stats: COMPLETE
+ Limit
+ Number of rows: 10
+ Statistics: Num rows: 10 Data size: 864 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 10 Data size: 864 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
+ limit: 10
+ Processor Tree:
+ ListSink
PREHOOK: query: SELECT cdouble, cstring1, cint, cfloat, csmallint, coalesce(cdouble, cstring1, cint, cfloat, csmallint) as c
FROM alltypesorc
@@ -108,22 +95,18 @@ NULL NULL -738306196 -51.0 NULL -738306196
NULL NULL -819152895 8.0 NULL -819152895
NULL NULL -827212561 8.0 NULL -827212561
NULL NULL -949587513 11.0 NULL -949587513
-PREHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT ctinyint, cdouble, cint, coalesce(ctinyint+10, (cdouble+log2(cint)), 0) as c
+PREHOOK: query: EXPLAIN SELECT ctinyint, cdouble, cint, coalesce(ctinyint+10, (cdouble+log2(cint)), 0) as c
FROM alltypesorc
WHERE (ctinyint IS NULL)
ORDER BY ctinyint, cdouble, cint, c
LIMIT 10
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT ctinyint, cdouble, cint, coalesce(ctinyint+10, (cdouble+log2(cint)), 0) as c
+POSTHOOK: query: EXPLAIN SELECT ctinyint, cdouble, cint, coalesce(ctinyint+10, (cdouble+log2(cint)), 0) as c
FROM alltypesorc
WHERE (ctinyint IS NULL)
ORDER BY ctinyint, cdouble, cint, c
LIMIT 10
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -131,62 +114,53 @@ STAGE DEPENDENCIES:
STAGE PLANS:
Stage: Stage-1
Tez
+#### A masked pattern was here ####
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
- Filter Vectorization:
- className: VectorFilterOperator
- native: true
- predicateExpression: SelectColumnIsNull(col 0) -> boolean
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [5, 2, 15]
- selectExpressions: VectorCoalesce(columns [12, 14, 13])(children: ConstantVectorExpression(val null) -> 12:double, DoubleColAddDoubleColumn(col 5, col 13)(children: FuncLog2LongToDouble(col 2) -> 13:double) -> 14:double, ConstantVectorExpression(val 0.0) -> 13:double) -> 15:double
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: No TopN IS false, Uniform Hash IS false
+ TableScan
+ alias: alltypesorc
+ Statistics: Num rows: 12288 Data size: 146792 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: ctinyint is null (type: boolean)
+ Statistics: Num rows: 3115 Data size: 37224 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: cdouble (type: double), cint (type: int), COALESCE(null,(cdouble + log2(cint)),0) (type: double)
+ outputColumnNames: _col1, _col2, _col3
+ Statistics: Num rows: 3115 Data size: 52844 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col1 (type: double), _col2 (type: int), _col3 (type: double)
+ sort order: +++
+ Statistics: Num rows: 3115 Data size: 52844 Basic stats: COMPLETE Column stats: COMPLETE
+ TopN Hash Memory Usage: 0.1
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [3, 0, 1, 2]
- selectExpressions: ConstantVectorExpression(val null) -> 3:tinyint
- Limit Vectorization:
- className: VectorLimitOperator
- native: true
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
+ Select Operator
+ expressions: null (type: tinyint), KEY.reducesinkkey0 (type: double), KEY.reducesinkkey1 (type: int), KEY.reducesinkkey2 (type: double)
+ outputColumnNames: _col0, _col1, _col2, _col3
+ Statistics: Num rows: 3115 Data size: 27928 Basic stats: COMPLETE Column stats: COMPLETE
+ Limit
+ Number of rows: 10
+ Statistics: Num rows: 10 Data size: 104 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 10 Data size: 104 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
+ limit: 10
+ Processor Tree:
+ ListSink
PREHOOK: query: SELECT ctinyint, cdouble, cint, coalesce(ctinyint+10, (cdouble+log2(cint)), 0) as c
FROM alltypesorc
@@ -214,22 +188,18 @@ NULL NULL -850295959 0.0
NULL NULL -886426182 0.0
NULL NULL -899422227 0.0
NULL NULL -971543377 0.0
-PREHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cfloat, cbigint, coalesce(cfloat, cbigint, 0) as c
+PREHOOK: query: EXPLAIN SELECT cfloat, cbigint, coalesce(cfloat, cbigint, 0) as c
FROM alltypesorc
WHERE (cfloat IS NULL AND cbigint IS NULL)
ORDER BY cfloat, cbigint, c
LIMIT 10
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cfloat, cbigint, coalesce(cfloat, cbigint, 0) as c
+POSTHOOK: query: EXPLAIN SELECT cfloat, cbigint, coalesce(cfloat, cbigint, 0) as c
FROM alltypesorc
WHERE (cfloat IS NULL AND cbigint IS NULL)
ORDER BY cfloat, cbigint, c
LIMIT 10
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -237,61 +207,50 @@ STAGE DEPENDENCIES:
STAGE PLANS:
Stage: Stage-1
Tez
+#### A masked pattern was here ####
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
- Filter Vectorization:
- className: VectorFilterOperator
- native: true
- predicateExpression: FilterExprAndExpr(children: SelectColumnIsNull(col 4) -> boolean, SelectColumnIsNull(col 3) -> boolean) -> boolean
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: []
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: No TopN IS false, Uniform Hash IS false
+ TableScan
+ alias: alltypesorc
+ Statistics: Num rows: 12288 Data size: 110088 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: (cfloat is null and cbigint is null) (type: boolean)
+ Statistics: Num rows: 790 Data size: 7092 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ Statistics: Num rows: 790 Data size: 3172 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ sort order:
+ Statistics: Num rows: 790 Data size: 3172 Basic stats: COMPLETE Column stats: COMPLETE
+ TopN Hash Memory Usage: 0.1
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2]
- selectExpressions: ConstantVectorExpression(val null) -> 0:float, ConstantVectorExpression(val null) -> 1:bigint, ConstantVectorExpression(val 0.0) -> 2:double
- Limit Vectorization:
- className: VectorLimitOperator
- native: true
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
+ Select Operator
+ expressions: null (type: float), null (type: bigint), 0.0 (type: float)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 790 Data size: 3172 Basic stats: COMPLETE Column stats: COMPLETE
+ Limit
+ Number of rows: 10
+ Statistics: Num rows: 10 Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 10 Data size: 52 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
+ limit: 10
+ Processor Tree:
+ ListSink
PREHOOK: query: SELECT cfloat, cbigint, coalesce(cfloat, cbigint, 0) as c
FROM alltypesorc
@@ -319,22 +278,18 @@ NULL NULL 0.0
NULL NULL 0.0
NULL NULL 0.0
NULL NULL 0.0
-PREHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT ctimestamp1, ctimestamp2, coalesce(ctimestamp1, ctimestamp2) as c
+PREHOOK: query: EXPLAIN SELECT ctimestamp1, ctimestamp2, coalesce(ctimestamp1, ctimestamp2) as c
FROM alltypesorc
WHERE ctimestamp1 IS NOT NULL OR ctimestamp2 IS NOT NULL
ORDER BY ctimestamp1, ctimestamp2, c
LIMIT 10
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT ctimestamp1, ctimestamp2, coalesce(ctimestamp1, ctimestamp2) as c
+POSTHOOK: query: EXPLAIN SELECT ctimestamp1, ctimestamp2, coalesce(ctimestamp1, ctimestamp2) as c
FROM alltypesorc
WHERE ctimestamp1 IS NOT NULL OR ctimestamp2 IS NOT NULL
ORDER BY ctimestamp1, ctimestamp2, c
LIMIT 10
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -342,61 +297,53 @@ STAGE DEPENDENCIES:
STAGE PLANS:
Stage: Stage-1
Tez
+#### A masked pattern was here ####
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
- Filter Vectorization:
- className: VectorFilterOperator
- native: true
- predicateExpression: FilterExprOrExpr(children: SelectColumnIsNotNull(col 8) -> boolean, SelectColumnIsNotNull(col 9) -> boolean) -> boolean
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [8, 9, 12]
- selectExpressions: VectorCoalesce(columns [8, 9])(children: col 8, col 9) -> 12:timestamp
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: No TopN IS false, Uniform Hash IS false
+ TableScan
+ alias: alltypesorc
+ Statistics: Num rows: 12288 Data size: 983040 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: (ctimestamp1 is not null or ctimestamp2 is not null) (type: boolean)
+ Statistics: Num rows: 12288 Data size: 983040 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: ctimestamp1 (type: timestamp), ctimestamp2 (type: timestamp), COALESCE(ctimestamp1,ctimestamp2) (type: timestamp)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 12288 Data size: 1474560 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ key expressions: _col0 (type: timestamp), _col1 (type: timestamp), _col2 (type: timestamp)
+ sort order: +++
+ Statistics: Num rows: 12288 Data size: 1474560 Basic stats: COMPLETE Column stats: COMPLETE
+ TopN Hash Memory Usage: 0.1
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2]
- Limit Vectorization:
- className: VectorLimitOperator
- native: true
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
+ Select Operator
+ expressions: KEY.reducesinkkey0 (type: timestamp), KEY.reducesinkkey1 (type: timestamp), KEY.reducesinkkey2 (type: timestamp)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 12288 Data size: 1474560 Basic stats: COMPLETE Column stats: COMPLETE
+ Limit
+ Number of rows: 10
+ Statistics: Num rows: 10 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 10 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
+ limit: 10
+ Processor Tree:
+ ListSink
PREHOOK: query: SELECT ctimestamp1, ctimestamp2, coalesce(ctimestamp1, ctimestamp2) as c
FROM alltypesorc
@@ -424,22 +371,18 @@ NULL 1969-12-31 15:59:43.684 1969-12-31 15:59:43.684
NULL 1969-12-31 15:59:43.703 1969-12-31 15:59:43.703
NULL 1969-12-31 15:59:43.704 1969-12-31 15:59:43.704
NULL 1969-12-31 15:59:43.709 1969-12-31 15:59:43.709
-PREHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cfloat, cbigint, coalesce(cfloat, cbigint) as c
+PREHOOK: query: EXPLAIN SELECT cfloat, cbigint, coalesce(cfloat, cbigint) as c
FROM alltypesorc
WHERE (cfloat IS NULL AND cbigint IS NULL)
ORDER BY cfloat, cbigint, c
LIMIT 10
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cfloat, cbigint, coalesce(cfloat, cbigint) as c
+POSTHOOK: query: EXPLAIN SELECT cfloat, cbigint, coalesce(cfloat, cbigint) as c
FROM alltypesorc
WHERE (cfloat IS NULL AND cbigint IS NULL)
ORDER BY cfloat, cbigint, c
LIMIT 10
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -447,61 +390,50 @@ STAGE DEPENDENCIES:
STAGE PLANS:
Stage: Stage-1
Tez
+#### A masked pattern was here ####
Edges:
Reducer 2 <- Map 1 (SIMPLE_EDGE)
+#### A masked pattern was here ####
Vertices:
Map 1
Map Operator Tree:
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
- Filter Vectorization:
- className: VectorFilterOperator
- native: true
- predicateExpression: FilterExprAndExpr(children: SelectColumnIsNull(col 4) -> boolean, SelectColumnIsNull(col 3) -> boolean) -> boolean
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: []
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: No TopN IS false, Uniform Hash IS false
+ TableScan
+ alias: alltypesorc
+ Statistics: Num rows: 12288 Data size: 110088 Basic stats: COMPLETE Column stats: COMPLETE
+ Filter Operator
+ predicate: (cfloat is null and cbigint is null) (type: boolean)
+ Statistics: Num rows: 790 Data size: 7092 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ Statistics: Num rows: 790 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+ Reduce Output Operator
+ sort order:
+ Statistics: Num rows: 790 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+ TopN Hash Memory Usage: 0.1
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2]
- selectExpressions: ConstantVectorExpression(val null) -> 0:float, ConstantVectorExpression(val null) -> 1:bigint, ConstantVectorExpression(val null) -> 2:float
- Limit Vectorization:
- className: VectorLimitOperator
- native: true
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
+ Select Operator
+ expressions: null (type: float), null (type: bigint), null (type: float)
+ outputColumnNames: _col0, _col1, _col2
+ Statistics: Num rows: 790 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+ Limit
+ Number of rows: 10
+ Statistics: Num rows: 10 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 10 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
+ limit: 10
+ Processor Tree:
+ ListSink
PREHOOK: query: SELECT cfloat, cbigint, coalesce(cfloat, cbigint) as c
FROM alltypesorc
@@ -529,61 +461,34 @@ NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
NULL NULL NULL
-PREHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cbigint, ctinyint, coalesce(cbigint, ctinyint) as c
+PREHOOK: query: EXPLAIN SELECT cbigint, ctinyint, coalesce(cbigint, ctinyint) as c
FROM alltypesorc
WHERE cbigint IS NULL
LIMIT 10
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION ONLY EXPRESSION SELECT cbigint, ctinyint, coalesce(cbigint, ctinyint) as c
+POSTHOOK: query: EXPLAIN SELECT cbigint, ctinyint, coalesce(cbigint, ctinyint) as c
FROM alltypesorc
WHERE cbigint IS NULL
LIMIT 10
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
- Stage-1 is a root stage
- Stage-0 depends on stages: Stage-1
+ Stage-0 is a root stage
STAGE PLANS:
- Stage: Stage-1
- Tez
- Vertices:
- Map 1
- Map Operator Tree:
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
- Filter Vectorization:
- className: VectorFilterOperator
- native: true
- predicateExpression: SelectColumnIsNull(col 3) -> boolean
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [12, 0, 14]
- selectExpressions: ConstantVectorExpression(val null) -> 12:bigint, VectorCoalesce(columns [13, 0])(children: ConstantVectorExpression(val null) -> 13:tinyint, col 0) -> 14:tinyint
- Limit Vectorization:
- className: VectorLimitOperator
- native: true
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
- Execution mode: vectorized, llap
- LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
-
Stage: Stage-0
Fetch Operator
+ limit: 10
+ Processor Tree:
+ TableScan
+ alias: alltypesorc
+ Filter Operator
+ predicate: cbigint is null (type: boolean)
+ Select Operator
+ expressions: null (type: bigint), ctinyint (type: tinyint), COALESCE(null,ctinyint) (type: tinyint)
+ outputColumnNames: _col0, _col1, _col2
+ Limit
+ Number of rows: 10
+ ListSink
PREHOOK: query: SELECT cbigint, ctinyint, coalesce(cbigint, ctinyint) as c
FROM alltypesorc
http://git-wip-us.apache.org/repos/asf/hive/blob/ad6ce078/ql/src/test/results/clientpositive/llap/vector_coalesce_2.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/llap/vector_coalesce_2.q.out b/ql/src/test/results/clientpositive/llap/vector_coalesce_2.q.out
index 67f21d6..b390bfd 100644
--- a/ql/src/test/results/clientpositive/llap/vector_coalesce_2.q.out
+++ b/ql/src/test/results/clientpositive/llap/vector_coalesce_2.q.out
@@ -16,22 +16,18 @@ POSTHOOK: Input: default@values__tmp__table__1
POSTHOOK: Output: default@str_str_orc
POSTHOOK: Lineage: str_str_orc.str1 SIMPLE [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
POSTHOOK: Lineage: str_str_orc.str2 SIMPLE [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col2, type:string, comment:), ]
-PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+PREHOOK: query: EXPLAIN
SELECT
str2, ROUND(sum(cast(COALESCE(str1, 0) as int))/60, 2) as result
from str_str_orc
GROUP BY str2
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+POSTHOOK: query: EXPLAIN
SELECT
str2, ROUND(sum(cast(COALESCE(str1, 0) as int))/60, 2) as result
from str_str_orc
GROUP BY str2
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: false
- enabledConditionsNotMet: [hive.vectorized.execution.enabled IS false]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -110,18 +106,14 @@ POSTHOOK: Input: default@str_str_orc
#### A masked pattern was here ####
X 0.02
y 0.0
-PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+PREHOOK: query: EXPLAIN
SELECT COALESCE(str1, 0) as result
from str_str_orc
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+POSTHOOK: query: EXPLAIN
SELECT COALESCE(str1, 0) as result
from str_str_orc
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: false
- enabledConditionsNotMet: [hive.vectorized.execution.enabled IS false]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -170,22 +162,18 @@ POSTHOOK: Input: default@str_str_orc
0
1
0
-PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+PREHOOK: query: EXPLAIN
SELECT
str2, ROUND(sum(cast(COALESCE(str1, 0) as int))/60, 2) as result
from str_str_orc
GROUP BY str2
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+POSTHOOK: query: EXPLAIN
SELECT
str2, ROUND(sum(cast(COALESCE(str1, 0) as int))/60, 2) as result
from str_str_orc
GROUP BY str2
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -203,27 +191,12 @@ STAGE PLANS:
TableScan
alias: str_str_orc
Statistics: Num rows: 4 Data size: 510 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1]
Select Operator
expressions: str2 (type: string), UDFToInteger(COALESCE(str1,0)) (type: int)
outputColumnNames: _col0, _col1
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [1, 4]
- selectExpressions: VectorUDFAdaptor(UDFToInteger(COALESCE(str1,0)))(children: VectorCoalesce(columns [0, 2])(children: col 0, ConstantVectorExpression(val 0) -> 2:string) -> 3:string) -> 4:Long
Statistics: Num rows: 4 Data size: 510 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: sum(_col1)
- Group By Vectorization:
- aggregators: VectorUDAFSumLong(col 4) -> bigint
- className: VectorGroupByOperator
- vectorOutput: true
- keyExpressions: col 1
- native: false
- projectedOutputColumns: [0]
keys: _col0 (type: string)
mode: hash
outputColumnNames: _col0, _col1
@@ -232,41 +205,15 @@ STAGE PLANS:
key expressions: _col0 (type: string)
sort order: +
Map-reduce partition columns: _col0 (type: string)
- Reduce Sink Vectorization:
- className: VectorReduceSinkStringOperator
- native: true
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, Uniform Hash IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
Statistics: Num rows: 4 Data size: 510 Basic stats: COMPLETE Column stats: NONE
value expressions: _col1 (type: bigint)
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: true
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
Group By Operator
aggregations: sum(VALUE._col0)
- Group By Vectorization:
- aggregators: VectorUDAFSumLong(col 1) -> bigint
- className: VectorGroupByOperator
- vectorOutput: true
- keyExpressions: col 0
- native: false
- projectedOutputColumns: [0]
keys: KEY._col0 (type: string)
mode: mergepartial
outputColumnNames: _col0, _col1
@@ -274,17 +221,9 @@ STAGE PLANS:
Select Operator
expressions: _col0 (type: string), round((UDFToDouble(_col1) / 60.0), 2) (type: double)
outputColumnNames: _col0, _col1
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 2]
- selectExpressions: RoundWithNumDigitsDoubleToDouble(col 3, decimalPlaces 2)(children: DoubleColDivideDoubleScalar(col 2, val 60.0)(children: CastLongToDouble(col 1) -> 2:double) -> 3:double) -> 2:double
Statistics: Num rows: 2 Data size: 255 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
Statistics: Num rows: 2 Data size: 255 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
@@ -313,18 +252,14 @@ POSTHOOK: Input: default@str_str_orc
#### A masked pattern was here ####
X 0.02
y 0.0
-PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+PREHOOK: query: EXPLAIN
SELECT COALESCE(str1, 0) as result
from str_str_orc
PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+POSTHOOK: query: EXPLAIN
SELECT COALESCE(str1, 0) as result
from str_str_orc
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -339,23 +274,12 @@ STAGE PLANS:
TableScan
alias: str_str_orc
Statistics: Num rows: 4 Data size: 510 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1]
Select Operator
expressions: COALESCE(str1,0) (type: string)
outputColumnNames: _col0
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [3]
- selectExpressions: VectorCoalesce(columns [0, 2])(children: col 0, ConstantVectorExpression(val 0) -> 2:string) -> 3:string
Statistics: Num rows: 4 Data size: 510 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
Statistics: Num rows: 4 Data size: 510 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
@@ -363,14 +287,6 @@ STAGE PLANS:
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Stage: Stage-0
Fetch Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/ad6ce078/ql/src/test/results/clientpositive/llap/vector_complex_all.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/llap/vector_complex_all.q.out b/ql/src/test/results/clientpositive/llap/vector_complex_all.q.out
index 086b9ef..08d49bc 100644
--- a/ql/src/test/results/clientpositive/llap/vector_complex_all.q.out
+++ b/ql/src/test/results/clientpositive/llap/vector_complex_all.q.out
@@ -62,18 +62,14 @@ POSTHOOK: Lineage: orc_create_complex.str SIMPLE [(orc_create_staging)orc_create
POSTHOOK: Lineage: orc_create_complex.strct SIMPLE [(orc_create_staging)orc_create_staging.FieldSchema(name:strct, type:struct<A:string,B:string>, comment:null), ]
orc_create_staging.str orc_create_staging.mp orc_create_staging.lst orc_create_staging.strct
PREHOOK: query: -- Since complex types are not supported, this query should not vectorize.
-EXPLAIN VECTORIZATION EXPRESSION
+EXPLAIN
SELECT * FROM orc_create_complex
PREHOOK: type: QUERY
POSTHOOK: query: -- Since complex types are not supported, this query should not vectorize.
-EXPLAIN VECTORIZATION EXPRESSION
+EXPLAIN
SELECT * FROM orc_create_complex
POSTHOOK: type: QUERY
Explain
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -101,12 +97,6 @@ STAGE PLANS:
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Execution mode: llap
LLAP IO: no inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- notVectorizedReason: Select expression for SELECT operator: Data type map<string,string> of Column[mp] not supported
- vectorized: false
Stage: Stage-0
Fetch Operator
@@ -127,18 +117,14 @@ line1 {"key13":"value13","key11":"value11","key12":"value12"} ["a","b","c"] {"a"
line2 {"key21":"value21","key22":"value22","key23":"value23"} ["d","e","f"] {"a":"three","b":"four"}
line3 {"key31":"value31","key32":"value32","key33":"value33"} ["g","h","i"] {"a":"five","b":"six"}
PREHOOK: query: -- However, since this query is not referencing the complex fields, it should vectorize.
-EXPLAIN VECTORIZATION EXPRESSION
+EXPLAIN
SELECT COUNT(*) FROM orc_create_complex
PREHOOK: type: QUERY
POSTHOOK: query: -- However, since this query is not referencing the complex fields, it should vectorize.
-EXPLAIN VECTORIZATION EXPRESSION
+EXPLAIN
SELECT COUNT(*) FROM orc_create_complex
POSTHOOK: type: QUERY
Explain
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -156,71 +142,29 @@ STAGE PLANS:
TableScan
alias: orc_create_complex
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: COMPLETE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Select Operator
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: []
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: COMPLETE
Group By Operator
aggregations: count()
- Group By Vectorization:
- aggregators: VectorUDAFCountStar(*) -> bigint
- className: VectorGroupByOperator
- vectorOutput: true
- native: false
- projectedOutputColumns: [0]
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
Reduce Output Operator
sort order:
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: Uniform Hash IS false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
value expressions: _col0 (type: bigint)
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0)
- Group By Vectorization:
- aggregators: VectorUDAFCountMerge(col 0) -> bigint
- className: VectorGroupByOperator
- vectorOutput: true
- native: false
- projectedOutputColumns: [0]
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
File Output Operator
compressed: false
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
@@ -244,18 +188,14 @@ POSTHOOK: Input: default@orc_create_complex
c0
3
PREHOOK: query: -- Also, since this query is not referencing the complex fields, it should vectorize.
-EXPLAIN VECTORIZATION EXPRESSION
+EXPLAIN
SELECT str FROM orc_create_complex ORDER BY str
PREHOOK: type: QUERY
POSTHOOK: query: -- Also, since this query is not referencing the complex fields, it should vectorize.
-EXPLAIN VECTORIZATION EXPRESSION
+EXPLAIN
SELECT str FROM orc_create_complex ORDER BY str
POSTHOOK: type: QUERY
Explain
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -273,59 +213,25 @@ STAGE PLANS:
TableScan
alias: orc_create_complex
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Select Operator
expressions: str (type: string)
outputColumnNames: _col0
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0]
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col0 (type: string)
sort order: +
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: Uniform Hash IS false
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: vectorized, llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- groupByVectorOutput: true
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reduce Operator Tree:
Select Operator
expressions: KEY.reducesinkkey0 (type: string)
outputColumnNames: _col0
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0]
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
- File Sink Vectorization:
- className: VectorFileSinkOperator
- native: false
Statistics: Num rows: 3 Data size: 3177 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
http://git-wip-us.apache.org/repos/asf/hive/blob/ad6ce078/ql/src/test/results/clientpositive/llap/vector_complex_join.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/llap/vector_complex_join.q.out b/ql/src/test/results/clientpositive/llap/vector_complex_join.q.out
index 99fd25f..97d5642 100644
--- a/ql/src/test/results/clientpositive/llap/vector_complex_join.q.out
+++ b/ql/src/test/results/clientpositive/llap/vector_complex_join.q.out
@@ -21,17 +21,13 @@ POSTHOOK: Output: default@test
POSTHOOK: Lineage: test.a SIMPLE []
POSTHOOK: Lineage: test.b EXPRESSION []
c0 c1
-PREHOOK: query: explain vectorization expression
+PREHOOK: query: explain
select * from alltypesorc join test where alltypesorc.cint=test.a
PREHOOK: type: QUERY
-POSTHOOK: query: explain vectorization expression
+POSTHOOK: query: explain
select * from alltypesorc join test where alltypesorc.cint=test.a
POSTHOOK: type: QUERY
Explain
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -75,12 +71,6 @@ STAGE PLANS:
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Execution mode: llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- notVectorizedReason: Small Table expression for MAPJOIN operator: Data type map<int,string> of Column[_col1] not supported
- vectorized: false
Map 2
Map Operator Tree:
TableScan
@@ -101,12 +91,6 @@ STAGE PLANS:
value expressions: _col1 (type: map<int,string>)
Execution mode: llap
LLAP IO: no inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- notVectorizedReason: Select expression for SELECT operator: Data type map<int,string> of Column[b] not supported
- vectorized: false
Stage: Stage-0
Fetch Operator
@@ -162,17 +146,13 @@ POSTHOOK: Input: default@values__tmp__table__1
POSTHOOK: Output: default@test2b
POSTHOOK: Lineage: test2b.a EXPRESSION [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
_col0
-PREHOOK: query: explain vectorization expression
+PREHOOK: query: explain
select * from test2b join test2a on test2b.a = test2a.a[1]
PREHOOK: type: QUERY
-POSTHOOK: query: explain vectorization expression
+POSTHOOK: query: explain
select * from test2b join test2a on test2b.a = test2a.a[1]
POSTHOOK: type: QUERY
Explain
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -216,12 +196,6 @@ STAGE PLANS:
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Execution mode: llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- notVectorizedReason: Small Table expression for MAPJOIN operator: Data type array<int> of Column[a] not supported
- vectorized: false
Map 2
Map Operator Tree:
TableScan
@@ -238,12 +212,6 @@ STAGE PLANS:
value expressions: a (type: array<int>)
Execution mode: llap
LLAP IO: no inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- notVectorizedReason: Predicate expression for FILTER operator: UDF GenericUDFIndex(Column[a], Const int 1) not supported
- vectorized: false
Stage: Stage-0
Fetch Operator
http://git-wip-us.apache.org/repos/asf/hive/blob/ad6ce078/ql/src/test/results/clientpositive/llap/vector_count.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/llap/vector_count.q.out b/ql/src/test/results/clientpositive/llap/vector_count.q.out
index c425d8f..3b9d9f9 100644
--- a/ql/src/test/results/clientpositive/llap/vector_count.q.out
+++ b/ql/src/test/results/clientpositive/llap/vector_count.q.out
@@ -47,14 +47,10 @@ POSTHOOK: Input: default@abcd
12 100 75 7
12 NULL 80 2
NULL 35 23 6
-PREHOOK: query: explain vectorization expression select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
+PREHOOK: query: explain select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
PREHOOK: type: QUERY
-POSTHOOK: query: explain vectorization expression select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
+POSTHOOK: query: explain select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -72,26 +68,12 @@ STAGE PLANS:
TableScan
alias: abcd
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Select Operator
expressions: a (type: int), b (type: int), c (type: int), d (type: int)
outputColumnNames: a, b, c, d
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count(DISTINCT b), count(DISTINCT c), sum(d)
- Group By Vectorization:
- aggregators: VectorUDAFCount(col 1) -> bigint, VectorUDAFCount(col 2) -> bigint, VectorUDAFSumLong(col 3) -> bigint
- className: VectorGroupByOperator
- vectorOutput: true
- keyExpressions: col 0, col 1, col 2
- native: false
- projectedOutputColumns: [0, 1, 2]
keys: a (type: int), b (type: int), c (type: int)
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
@@ -100,30 +82,12 @@ STAGE PLANS:
key expressions: _col0 (type: int), _col1 (type: int), _col2 (type: int)
sort order: +++
Map-reduce partition columns: _col0 (type: int)
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: Uniform Hash IS false, No DISTINCT columns IS false
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
value expressions: _col5 (type: bigint)
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- notVectorizedReason: GROUPBY operator: DISTINCT not supported
- vectorized: false
Reduce Operator Tree:
Group By Operator
aggregations: count(DISTINCT KEY._col1:0._col0), count(DISTINCT KEY._col1:1._col0), sum(VALUE._col2)
@@ -157,14 +121,10 @@ POSTHOOK: Input: default@abcd
100 1 1 3
12 1 2 9
NULL 1 1 6
-PREHOOK: query: explain vectorization expression select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
+PREHOOK: query: explain select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
PREHOOK: type: QUERY
-POSTHOOK: query: explain vectorization expression select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
+POSTHOOK: query: explain select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -182,26 +142,12 @@ STAGE PLANS:
TableScan
alias: abcd
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Select Operator
expressions: a (type: int), b (type: int), c (type: int), d (type: int)
outputColumnNames: _col1, _col2, _col3, _col4
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
Group By Operator
aggregations: count(1), count(), count(_col1), count(_col2), count(_col3), count(_col4), count(DISTINCT _col1), count(DISTINCT _col2), count(DISTINCT _col3), count(DISTINCT _col4), count(DISTINCT _col1, _col2), count(DISTINCT _col2, _col3), count(DISTINCT _col3, _col4), count(DISTINCT _col1, _col4), count(DISTINCT _col1, _col3), count(DISTINCT _col2, _col4), count(DISTINCT _col1, _col2, _col3), count(DISTINCT _col2, _col3, _col4), count(DISTINCT _col1, _col3, _col4), count(DISTINCT _col1, _col2, _col4), count(DISTINCT _col1, _col2, _col3, _col4)
- Group By Vectorization:
- aggregators: VectorUDAFCount(ConstantVectorExpression(val 1) -> 4:long) -> bigint, VectorUDAFCountStar(*) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 1) -> bigint, VectorUDAFCount(col 2) -> bigint, VectorUDAFCount(col 3) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 1) -> bigint, VectorUDAFCount(col 2) -> bigint, VectorUDAFCount(col 3) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 1) -> bigint, VectorUDAFCount(col 2) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 1) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 1) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 0) -> bigint, VectorUDAFCount(col 0) -> bigint
- className: VectorGroupByOperator
- vectorOutput: true
- keyExpressions: col 0, col 1, col 2, col 3
- native: false
- projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
keys: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int)
mode: hash
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24
@@ -209,30 +155,12 @@ STAGE PLANS:
Reduce Output Operator
key expressions: _col0 (type: int), _col1 (type: int), _col2 (type: int), _col3 (type: int)
sort order: ++++
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: Uniform Hash IS false, No DISTINCT columns IS false
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
value expressions: _col4 (type: bigint), _col5 (type: bigint), _col6 (type: bigint), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: bigint)
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- notVectorizedReason: GROUPBY operator: DISTINCT not supported
- vectorized: false
Reduce Operator Tree:
Group By Operator
aggregations: count(VALUE._col0), count(VALUE._col1), count(VALUE._col2), count(VALUE._col3), count(VALUE._col4), count(VALUE._col5), count(DISTINCT KEY._col0:0._col0), count(DISTINCT KEY._col0:1._col0), count(DISTINCT KEY._col0:2._col0), count(DISTINCT KEY._col0:3._col0), count(DISTINCT KEY._col0:4._col0, KEY._col0:4._col1), count(DISTINCT KEY._col0:5._col0, KEY._col0:5._col1), count(DISTINCT KEY._col0:6._col0, KEY._col0:6._col1), count(DISTINCT KEY._col0:7._col0, KEY._col0:7._col1), count(DISTINCT KEY._col0:8._col0, KEY._col0:8._col1), count(DISTINCT KEY._col0:9._col0, KEY._col0:9._col1), count(DISTINCT KEY._col0:10._col0, KEY._col0:10._col1, KEY._col0:10._col2), count(DISTINCT KEY._col0:11._col0, KEY._col0:11._col1, KEY._col0:11._col2), count(DISTINCT KEY._col0:12._col0, KEY._col0:12._col1, KEY._col0:12._col2), count(DISTINCT KEY._col0:13._col0, KEY._col0:13._col1, KEY._col0:13._col2), count(DISTINCT KEY._col0:14._col0, KEY._col0:14._col1, KEY._col0:14._col2, KEY._col0:14._col3)
@@ -262,14 +190,10 @@ POSTHOOK: type: QUERY
POSTHOOK: Input: default@abcd
#### A masked pattern was here ####
7 7 6 6 6 7 3 3 6 7 4 5 6 6 5 6 4 5 5 5 4
-PREHOOK: query: explain vectorization expression select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
+PREHOOK: query: explain select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
PREHOOK: type: QUERY
-POSTHOOK: query: explain vectorization expression select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
+POSTHOOK: query: explain select a, count(distinct b), count(distinct c), sum(d) from abcd group by a
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -287,45 +211,20 @@ STAGE PLANS:
TableScan
alias: abcd
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Select Operator
expressions: a (type: int), b (type: int), c (type: int), d (type: int)
outputColumnNames: a, b, c, d
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: a (type: int), b (type: int), c (type: int)
sort order: +++
Map-reduce partition columns: a (type: int)
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: Uniform Hash IS false, No DISTINCT columns IS false
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
value expressions: d (type: int)
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- notVectorizedReason: GROUPBY operator: DISTINCT not supported
- vectorized: false
Reduce Operator Tree:
Group By Operator
aggregations: count(DISTINCT KEY._col1:0._col0), count(DISTINCT KEY._col1:1._col0), sum(VALUE._col0)
@@ -359,14 +258,10 @@ POSTHOOK: Input: default@abcd
100 1 1 3
12 1 2 9
NULL 1 1 6
-PREHOOK: query: explain vectorization expression select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
+PREHOOK: query: explain select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
PREHOOK: type: QUERY
-POSTHOOK: query: explain vectorization expression select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
+POSTHOOK: query: explain select count(1), count(*), count(a), count(b), count(c), count(d), count(distinct a), count(distinct b), count(distinct c), count(distinct d), count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), count(distinct a,b,c,d) from abcd
POSTHOOK: type: QUERY
-PLAN VECTORIZATION:
- enabled: true
- enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
-
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-0 depends on stages: Stage-1
@@ -384,43 +279,18 @@ STAGE PLANS:
TableScan
alias: abcd
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
- TableScan Vectorization:
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Select Operator
expressions: a (type: int), b (type: int), c (type: int), d (type: int)
outputColumnNames: _col1, _col2, _col3, _col4
- Select Vectorization:
- className: VectorSelectOperator
- native: true
- projectedOutputColumns: [0, 1, 2, 3]
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
Reduce Output Operator
key expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int)
sort order: ++++
- Reduce Sink Vectorization:
- className: VectorReduceSinkOperator
- native: false
- nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
- nativeConditionsNotMet: Uniform Hash IS false, No DISTINCT columns IS false
Statistics: Num rows: 7 Data size: 100 Basic stats: COMPLETE Column stats: NONE
Execution mode: vectorized, llap
LLAP IO: all inputs
- Map Vectorization:
- enabled: true
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
- groupByVectorOutput: true
- inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
- allNative: false
- usesVectorUDFAdaptor: false
- vectorized: true
Reducer 2
Execution mode: llap
- Reduce Vectorization:
- enabled: true
- enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
- notVectorizedReason: GROUPBY operator: DISTINCT not supported
- vectorized: false
Reduce Operator Tree:
Group By Operator
aggregations: count(1), count(), count(KEY._col0:0._col0), count(KEY._col0:1._col0), count(KEY._col0:2._col0), count(KEY._col0:3._col0), count(DISTINCT KEY._col0:0._col0), count(DISTINCT KEY._col0:1._col0), count(DISTINCT KEY._col0:2._col0), count(DISTINCT KEY._col0:3._col0), count(DISTINCT KEY._col0:4._col0, KEY._col0:4._col1), count(DISTINCT KEY._col0:5._col0, KEY._col0:5._col1), count(DISTINCT KEY._col0:6._col0, KEY._col0:6._col1), count(DISTINCT KEY._col0:7._col0, KEY._col0:7._col1), count(DISTINCT KEY._col0:8._col0, KEY._col0:8._col1), count(DISTINCT KEY._col0:9._col0, KEY._col0:9._col1), count(DISTINCT KEY._col0:10._col0, KEY._col0:10._col1, KEY._col0:10._col2), count(DISTINCT KEY._col0:11._col0, KEY._col0:11._col1, KEY._col0:11._col2), count(DISTINCT KEY._col0:12._col0, KEY._col0:12._col1, KEY._col0:12._col2), count(DISTINCT KEY._col0:13._col0, KEY._col0:13._col1, KEY._col0:13._col2), count(DISTINCT KEY._col0:14._col0, KEY._col0:14._col1, KEY._col0:14._col2, KEY._col0:14._col3)