Posted to dev@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2014/08/02 22:41:11 UTC

[jira] [Commented] (HIVE-7405) Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)

    [ https://issues.apache.org/jira/browse/HIVE-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083708#comment-14083708 ] 

Hive QA commented on HIVE-7405:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12659484/HIVE-7405.2.patch

{color:red}ERROR:{color} -1 due to 134 failed/errored test(s), 5859 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_nested_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_casts
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgDecimalNegative
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongNullKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongRepeatConcatValues
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongRepeatNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testAvgLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testBigintKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testBooleanKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongNullKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongRepeatConcatValues
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongRepeatNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountStar
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountStringAllNull
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountStringWithNull
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDecimalKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeAvg
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeAvgOneKey
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeCount
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeMax
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeMaxOneKey
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeMin
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeMinOneKey
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeSum
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeSumOneKey
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeVariance
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testDoubleValueTypeVarianceOneKey
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testFloatKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testIntKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongMaxInt
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongMaxLong
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongNegative
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongNullKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxNullString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMaxString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMemoryPressureFlush
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongConcatRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongGroupBy
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongKeyGroupByCompactBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongKeyGroupByCrossBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongMinInt
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongMinLong
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongNegative
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongNullKeyGroupByCrossBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongNullKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongNullStringKeys
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongRepeatConcatValues
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongRepeatNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinLongStringKeys
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinNullLongNullKeyGroupBy
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMinString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMultiKeyDoubleShortString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMultiKeyDoubleStringInt
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMultiKeyIntStringInt
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMultiKeyIntStringString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMultiKeyStringByteString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testMultiKeyStringIntString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSmallintKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdDevLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdDevLongRepeatNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdDevSampLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdDevSampSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdPopDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testStdSampDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumDecimalHive6508
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumDoubleGroupByString
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumDoubleSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLong2MaxInt
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLong2MaxLong
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLong2MinInt
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLong2MinLong
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongMinMaxLong
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongNullKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongRepeatConcatValues
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongRepeatNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testSumLongZero
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testTimestampKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testTinyintKeyTypeAggregate
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarLongNullKeyGroupBySingleBatch
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarPopLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarPopLongRepeatNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarSampDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarSampLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarSampLongRepeat
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarSampLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarianceDecimal
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarianceLongEmpty
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarianceLongNulls
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarianceLongSimple
org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testVarianceLongSingle
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testBooleanColumnCompareBooleanScalar
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testFilterBooleanColumnCompareBooleanScalar
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testFilterScalarCompareColumn
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testFilterStringColCompareStringColumnExpressions
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testFilterWithNegativeScalar
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testStringFilterExpressions
org.apache.hadoop.hive.ql.optimizer.physical.TestVectorizer.testValidateNestedExpressions
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/150/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/150/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-150/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 134 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12659484

> Vectorize GROUP BY on the Reduce-Side (Part 1 – Basic)
> ------------------------------------------------------
>
>                 Key: HIVE-7405
>                 URL: https://issues.apache.org/jira/browse/HIVE-7405
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>         Attachments: HIVE-7405.1.patch, HIVE-7405.2.patch
>
>
> Vectorize the basic case that does not have any COUNT DISTINCT aggregation.
> Add a 4th processing mode to VectorGroupByOperator for the reduce side, in which each input VectorizedRowBatch holds values for only one key at a time. The values in the batch can therefore be aggregated quickly.
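
The idea in the description above can be illustrated with a minimal sketch. It is not the code from the patch: the Batch class, the sumBatch and aggregate methods, and the SUM aggregate are hypothetical stand-ins, and the sketch assumes batches reach the operator grouped by key. Because every value in a batch belongs to the same key, the inner loop needs no per-row key lookup and can add straight into one running buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingleKeyBatchSum {

    /** Hypothetical stand-in for a row batch: one key and the values that belong to it. */
    public static final class Batch {
        final String key;
        final long[] values;

        public Batch(String key, long... values) {
            this.key = key;
            this.values = values;
        }
    }

    /** Sums one batch in a tight loop; no per-row key lookup because the batch has a single key. */
    public static long sumBatch(Batch batch) {
        long sum = 0;
        for (long v : batch.values) {
            sum += v;
        }
        return sum;
    }

    /**
     * Aggregates a stream of single-key batches.
     * Assumption: batches for the same key arrive consecutively (input grouped by key).
     * A result is emitted each time the key changes and once more at the end.
     */
    public static List<String> aggregate(List<Batch> batches) {
        List<String> results = new ArrayList<>();
        String currentKey = null;
        long runningSum = 0;
        boolean seenAny = false;
        for (Batch batch : batches) {
            if (seenAny && !batch.key.equals(currentKey)) {
                results.add(currentKey + "=" + runningSum); // key changed: flush previous group
                runningSum = 0;
            }
            currentKey = batch.key;
            seenAny = true;
            runningSum += sumBatch(batch);
        }
        if (seenAny) {
            results.add(currentKey + "=" + runningSum); // flush the last group
        }
        return results;
    }

    public static void main(String[] args) {
        List<Batch> input = Arrays.asList(
                new Batch("a", 1, 2),
                new Batch("a", 3),
                new Batch("b", 10));
        System.out.println(aggregate(input)); // prints [a=6, b=10]
    }
}
```

A hash-based mode would have to look up an aggregation buffer for every row; here a single buffer is reused until the key changes, which is why the one-key-per-batch guarantee makes aggregation cheap.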


