Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2017/02/02 05:48:51 UTC
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

    [ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849472#comment-15849472 ] 

Hive QA commented on HIVE-11394:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12850582/HIVE-11394.098.patch

{color:green}SUCCESS:{color} +1 due to 160 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 10993 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=147)
	[dynamic_semijoin_reduction.q,load_dyn_part5.q,vector_complex_join.q,orc_llap.q,vectorization_7.q,vectorization_pushdown.q,cbo_gby.q,mapjoin3.q,auto_sortmerge_join_1.q,lineage3.q,cross_product_check_1.q,cbo_join.q,vector_struct_in.q,bucketmapjoin3.q,current_date_timestamp.q,orc_ppd_schema_evol_2a.q,groupby2.q,schema_evol_text_vec_table.q,vectorized_join46.q,orc_ppd_date.q,create_merge_compressed.q,multiMapJoin1.q,vector_outer_join1.q,vector_char_simple.q,dynpart_sort_optimization_acid.q,having.q,leftsemijoin.q,special_character_in_tabnames_1.q,cte_mat_2.q,vectorization_8.q]
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_semijoin_reduction] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_inner_join] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join0] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join1] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join2] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join3] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join4] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join5] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_div0] (batchId=94)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit] (batchId=93)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_between_in] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_cast_constant] (batchId=99)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_count_distinct] (batchId=106)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_aggregate] (batchId=103)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_mapjoin] (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=129)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_0] (batchId=130)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_13] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_14] (batchId=101)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_15] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_16] (batchId=113)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_17] (batchId=132)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_9] (batchId=95)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_div0] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_case] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_mapjoin] (batchId=126)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_math_funcs] (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_nested_mapjoin] (batchId=102)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_ptf] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_shufflejoin] (batchId=126)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_string_funcs] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_timestamp_funcs] (batchId=108)
org.apache.hadoop.hive.ql.optimizer.physical.TestVectorizer.testExprNodeBetweenWithDynamicValue (batchId=259)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3317/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3317/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3317/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 39 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12850582 - PreCommit-HIVE-Build

> Enhance EXPLAIN display for vectorization
> -----------------------------------------
>
>                 Key: HIVE-11394
>                 URL: https://issues.apache.org/jira/browse/HIVE-11394
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 2.2.0
>
>         Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, HIVE-11394.093.patch, HIVE-11394.094.patch, HIVE-11394.095.patch, HIVE-11394.096.patch, HIVE-11394.097.patch, HIVE-11394.098.patch, HIVE-11394.09.patch
>
>
> Add detail to the EXPLAIN output showing why Map or Reduce work is not vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. predicateExpression.  It includes all information of SUMMARY and OPERATOR, too.
> DETAIL shows detail-level vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> When the optional clauses are omitted, the defaults are not ONLY and SUMMARY.
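> For illustration, the option combinations above could be written as follows. This is a sketch only: the query is hypothetical (it reuses the alltypesorc table and cint column that appear in the first example plan), and any SELECT could take its place.
> {code}
> -- Defaults: same as EXPLAIN VECTORIZATION SUMMARY, without ONLY
> EXPLAIN VECTORIZATION
> SELECT count(cint) FROM alltypesorc;
>
> -- ONLY suppresses most non-vectorization elements; the level stays at SUMMARY
> EXPLAIN VECTORIZATION ONLY
> SELECT count(cint) FROM alltypesorc;
>
> -- OPERATOR adds per-operator sections on top of SUMMARY
> EXPLAIN VECTORIZATION OPERATOR
> SELECT count(cint) FROM alltypesorc;
>
> -- DETAIL includes SUMMARY, OPERATOR, and EXPRESSION information
> EXPLAIN VECTORIZATION ONLY DETAIL
> SELECT count(cint) FROM alltypesorc;
> {code}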
> ---------------------------------------------------------------------------------------------------
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct<count:bigint,sum:double,input:int> of Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": "false" which says a node has a GROUP BY with an AVG or some other aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at least one vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> ...
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>         Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: alltypesorc
>                   Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: cint (type: int)
>                     outputColumnNames: cint
>                     Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE Column stats: COMPLETE
>                     Group By Operator
>                       keys: cint (type: int)
>                       mode: hash
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 2 
>             Execution mode: vectorized, llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 groupByVectorOutput: false
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>             Reduce Operator Tree:
>               Group By Operator
>                 keys: KEY._col0 (type: int)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column stats: COMPLETE
>                 Group By Operator
>                   aggregations: sum(_col0), count(_col0), avg(_col0), std(_col0)
>                   mode: hash
>                   outputColumnNames: _col0, _col1, _col2, _col3
>                   Statistics: Num rows: 1 Data size: 172 Basic stats: COMPLETE Column stats: COMPLETE
>                   Reduce Output Operator
>                     sort order: 
>                     Statistics: Num rows: 1 Data size: 172 Basic stats: COMPLETE Column stats: COMPLETE
>                     value expressions: _col0 (type: bigint), _col1 (type: bigint), _col2 (type: struct<count:bigint,sum:double,input:int>), _col3 (type: struct<count:bigint,sum:double,variance:double>)
>         Reducer 3 
>             Execution mode: llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct<count:bigint,sum:double,input:int> of Column[VALUE._col2] not supported
>                 vectorized: false
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: sum(VALUE._col0), count(VALUE._col1), avg(VALUE._col2), std(VALUE._col3)
>                 mode: mergepartial
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
>                   table:
>                       input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink 
> {code}
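> A query of roughly the following shape would be expected to yield a plan like the one above. This is a reconstruction from the operator trees, not the statement actually used: the subquery alias t is made up, and the SET lines simply restate the properties named in the plan's enabled-conditions lists, with values assumed.
> {code}
> -- Properties named under enabledConditionsMet / enableConditionsMet (values assumed)
> SET hive.execution.engine=tez;
> SET hive.vectorized.execution.enabled=true;
> SET hive.vectorized.execution.reduce.enabled=true;
> SET hive.vectorized.use.vectorized.input.format=true;
>
> -- Inner GROUP BY feeds an outer aggregation; avg and std carry STRUCT partial results
> EXPLAIN VECTORIZATION
> SELECT sum(cint), count(cint), avg(cint), std(cint)
> FROM (SELECT cint FROM alltypesorc GROUP BY cint) t;
> {code}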
> EXPLAIN VECTORIZATION OPERATOR
> Notice the added TableScan Vectorization, Select Vectorization, Group By Vectorization, Map Join Vectorization, Reduce Sink Vectorization sections in this example.
> Notice the nativeConditionsMet detail on why Reduce Sink Vectorization is native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Map 2 <- Map 1 (BROADCAST_EDGE)
>         Reducer 3 <- Map 2 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                     predicate: c2 is not null (type: boolean)
>                     Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: c1 (type: int), c2 (type: char(10))
>                       outputColumnNames: _col0, _col1
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0, 1]
>                       Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col1 (type: char(20))
>                         sort order: +
>                         Map-reduce partition columns: _col1 (type: char(20))
>                         Reduce Sink Vectorization:
>                             className: VectorReduceSinkStringOperator
>                             native: true
>                             nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, Uniform Hash IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
>                         Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column stats: NONE
>                         value expressions: _col0 (type: int)
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: true
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Map 2 
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   Statistics: Num rows: 3 Data size: 324 Basic stats: COMPLETE Column stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                     predicate: c2 is not null (type: boolean)
>                     Statistics: Num rows: 3 Data size: 324 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: c1 (type: int), c2 (type: char(20))
>                       outputColumnNames: _col0, _col1
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0, 1]
>                       Statistics: Num rows: 3 Data size: 324 Basic stats: COMPLETE Column stats: NONE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col1 (type: char(20))
>                           1 _col1 (type: char(20))
>                         Map Join Vectorization:
>                             className: VectorMapJoinInnerStringOperator
>                             native: true
>                             nativeConditionsMet: hive.vectorized.execution.mapjoin.native.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS true, No nullsafe IS true, Supports Key Types IS true, Not empty key IS true, When Fast Hash Table, then requires no Hybrid Hash Join IS true, Small table vectorizes IS true
>                         outputColumnNames: _col0, _col1, _col2, _col3
>                         input vertices:
>                           0 Map 1
>                         Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE Column stats: NONE
>                         Reduce Output Operator
>                           key expressions: _col0 (type: int)
>                           sort order: +
>                           Reduce Sink Vectorization:
>                               className: VectorReduceSinkOperator
>                               native: false
>                               nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
>                               nativeConditionsNotMet: Uniform Hash IS false
>                           Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE Column stats: NONE
>                           value expressions: _col1 (type: char(10)), _col2 (type: int), _col3 (type: char(20))
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 3 
>             Execution mode: vectorized, llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 groupByVectorOutput: true
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>             Reduce Operator Tree:
>               Select Operator
>                 expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: char(10)), VALUE._col1 (type: int), VALUE._col2 (type: char(20))
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Select Vectorization:
>                     className: VectorSelectOperator
>                     native: true
>                     projectedOutputColumns: [0, 1, 2, 3]
>                 Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE Column stats: NONE
>                 File Output Operator
>                   compressed: false
>                   File Sink Vectorization:
>                       className: VectorFileSinkOperator
>                       native: false
>                   Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE Column stats: NONE
>                   table:
>                       input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
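> The nativeConditionsMet lists above name two configuration properties. A sketch of a session that could produce such a plan follows; the table names tbl_a and tbl_b are hypothetical (the plan shows only the aliases a and b), and the query is inferred from the operator trees rather than taken from the test.
> {code}
> -- Properties named in the nativeConditionsMet lists (values assumed)
> SET hive.execution.engine=tez;
> SET hive.vectorized.execution.mapjoin.native.enabled=true;
> SET hive.vectorized.execution.reducesink.new.enabled=true;
>
> -- Join on the char column c2, then sort on a.c1
> EXPLAIN VECTORIZATION OPERATOR
> SELECT a.c1, a.c2, b.c1, b.c2
> FROM tbl_a a JOIN tbl_b b ON (a.c2 = b.c2)
> ORDER BY a.c1;
> {code}
> When an operator reports native: false, the nativeConditionsNotMet line is the place to look; in the plan above it reads "Uniform Hash IS false".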
> EXPLAIN VECTORIZATION EXPRESSION
> Notice the predicateExpression in this example.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: vector_interval_2
>                   Statistics: Num rows: 2 Data size: 788 Basic stats: COMPLETE Column stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1, 2, 3, 4, 5]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                         predicateExpression: FilterExprAndExpr(children: FilterTimestampScalarEqualTimestampColumn(val 2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampScalarNotEqualTimestampColumn(val 2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampScalarLessEqualTimestampColumn(val 2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampScalarLessTimestampColumn(val 2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampScalarGreaterEqualTimestampColumn(val 2001-01-01 01:02:03.0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampScalarGreaterTimestampColumn(val 2001-01-01 01:02:03.0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColEqualTimestampScalar(col 6, val 2001-01-01 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColNotEqualTimestampScalar(col 6, val 2001-01-01 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColGreaterEqualTimestampScalar(col 6, val 2001-01-01 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColGreaterTimestampScalar(col 6, val 2001-01-01 01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColLessEqualTimestampScalar(col 6, val 2001-01-01 01:02:03.0)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColLessTimestampScalar(col 6, val 2001-01-01 01:02:03.0)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColEqualTimestampColumn(col 0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColNotEqualTimestampColumn(col 0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColLessEqualTimestampColumn(col 0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColLessTimestampColumn(col 0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColGreaterEqualTimestampColumn(col 0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColGreaterTimestampColumn(col 0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean) -> boolean
>                     predicate: ((2001-01-01 01:02:03.0 = (dt + 0 01:02:03.000000000)) and (2001-01-01 01:02:03.0 <> (dt + 0 01:02:04.000000000)) and (2001-01-01 01:02:03.0 <= (dt + 0 01:02:03.000000000)) and (2001-01-01 01:02:03.0 < (dt + 0 01:02:04.000000000)) and (2001-01-01 01:02:03.0 >= (dt - 0 01:02:03.000000000)) and (2001-01-01 01:02:03.0 > (dt - 0 01:02:04.000000000)) and ((dt + 0 01:02:03.000000000) = 2001-01-01 01:02:03.0) and ((dt + 0 01:02:04.000000000) <> 2001-01-01 01:02:03.0) and ((dt + 0 01:02:03.000000000) >= 2001-01-01 01:02:03.0) and ((dt + 0 01:02:04.000000000) > 2001-01-01 01:02:03.0) and ((dt - 0 01:02:03.000000000) <= 2001-01-01 01:02:03.0) and ((dt - 0 01:02:04.000000000) < 2001-01-01 01:02:03.0) and (ts = (dt + 0 01:02:03.000000000)) and (ts <> (dt + 0 01:02:04.000000000)) and (ts <= (dt + 0 01:02:03.000000000)) and (ts < (dt + 0 01:02:04.000000000)) and (ts >= (dt - 0 01:02:03.000000000)) and (ts > (dt - 0 01:02:04.000000000))) (type: boolean)
>                     Statistics: Num rows: 1 Data size: 394 Basic stats: COMPLETE Column stats: NONE
>                     Select Operator
>                       expressions: ts (type: timestamp)
>                       outputColumnNames: _col0
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0]
>                       Statistics: Num rows: 1 Data size: 394 Basic stats: COMPLETE Column stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: timestamp)
>                         sort order: +
>                         Reduce Sink Vectorization:
>                             className: VectorReduceSinkOperator
>                             native: false
>                             nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true
>                             nativeConditionsNotMet: Uniform Hash IS false
>                         Statistics: Num rows: 1 Data size: 394 Basic stats: COMPLETE Column stats: NONE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 2 
> ... 
> {code}
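> A shortened query in the spirit of the example above is sketched below. It keeps only two of the eighteen predicates; the table name and the column names dt and ts come from the plan, while the timestamp and interval literal syntax, and the ORDER BY, are assumptions about how the original test was written.
> {code}
> -- Each comparison against a date-plus-interval expression should show up
> -- as a nested entry under predicateExpression in the Filter Vectorization section
> EXPLAIN VECTORIZATION EXPRESSION
> SELECT ts
> FROM vector_interval_2
> WHERE timestamp '2001-01-01 01:02:03' = dt + interval '0 1:2:3' day to second
>   AND ts > dt - interval '0 1:2:4' day to second
> ORDER BY ts;
> {code}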
> The standard @Explain Annotation Type is used.  A new 'vectorization' annotation marks each new class and method.
> Works for FORMATTED, like other non-vectorization EXPLAIN variations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)