Posted to issues@hive.apache.org by "Matt McCline (JIRA)" <ji...@apache.org> on 2018/04/04 20:28:00 UTC
[jira] [Updated] (HIVE-12369) Native Vector GroupBy
[ https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt McCline updated HIVE-12369:
--------------------------------
Description:
Implement Native Vector GroupBy using fast hash table technology developed for Native Vector MapJoin, etc.
The patch is currently limited to a single Long key with a single COUNT aggregation, or a single Long key with no aggregation (also known as duplicate reduction).
Three new classes are introduced that store the count in the slot table and do not allocate hash elements:
{noformat}
COUNT(column)      VectorGroupByHashLongKeyCountColumnOperator
COUNT(key)         VectorGroupByHashLongKeyCountKeyOperator
COUNT(*)           VectorGroupByHashLongKeyCountStarOperator
{noformat}
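To illustrate what storing the count in the slot table could look like, here is a minimal Java sketch of an open-addressing table with parallel key and count arrays. It is not taken from the patch: the class name LongKeyCountTable, its methods, the hash function, and the use of a zero count to mark an empty slot are all assumptions made for illustration. The zero-count marker only fits COUNT(*)-style counting, where every stored key has a count of at least 1; COUNT(column), where a key can legitimately have a count of 0, would need a separate occupancy marker. NULL keys are not handled.

```java
// Hypothetical sketch, not Hive code: a long-keyed counting table whose
// counts live directly in the slot arrays (no per-entry objects are allocated).
final class LongKeyCountTable {
    private long[] keys = new long[16];    // key held by each slot
    private long[] counts = new long[16];  // count held by the same slot; 0 marks an empty slot
    private int size;                      // number of occupied slots

    // Mix the key's bits so that nearby keys spread across slots.
    private static int hash(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32));
    }

    // Count one more row for this key (COUNT(*)-style: every row counts).
    void increment(long key) {
        if (size * 2 >= keys.length) {
            grow();  // keep the table at most half full so probing always terminates
        }
        int mask = keys.length - 1;  // capacity is always a power of two
        int slot = hash(key) & mask;
        while (counts[slot] != 0 && keys[slot] != key) {
            slot = (slot + 1) & mask;  // linear probing
        }
        if (counts[slot] == 0) {
            keys[slot] = key;
            size++;
        }
        counts[slot]++;
    }

    // Returns the number of rows seen for this key, or 0 if it was never added.
    long count(long key) {
        int mask = keys.length - 1;
        int slot = hash(key) & mask;
        while (counts[slot] != 0) {
            if (keys[slot] == key) {
                return counts[slot];
            }
            slot = (slot + 1) & mask;
        }
        return 0;
    }

    // Number of distinct keys currently stored.
    int distinctKeys() {
        return size;
    }

    // Double the capacity and re-insert every occupied slot.
    private void grow() {
        long[] oldKeys = keys;
        long[] oldCounts = counts;
        keys = new long[oldKeys.length * 2];
        counts = new long[oldKeys.length * 2];
        int mask = keys.length - 1;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldCounts[i] != 0) {
                int slot = hash(oldKeys[i]) & mask;
                while (counts[slot] != 0) {
                    slot = (slot + 1) & mask;
                }
                keys[slot] = oldKeys[i];
                counts[slot] = oldCounts[i];
            }
        }
    }
}
```

In this sketch a group's whole state is one long in the key array and one long in the count array, so adding a new group writes into existing arrays instead of allocating a separate hash element.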
The duplicate reduction operator for a single Long key is:
{noformat}
VectorGroupByHashLongKeyDuplicateReductionOperator
{noformat}
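Duplicate reduction (a single Long key with no aggregation) can be pictured as a hash set that passes each distinct key through once. The following Java sketch is illustrative only and is not the patch's implementation: the class name LongKeyDuplicateReducer, the addBatch method, and the choice to emit a key the first time it is seen are assumptions. The real operator may buffer keys and emit them at flush time instead. NULL keys are not handled.

```java
// Hypothetical sketch, not Hive code: batch-wise duplicate reduction on long keys.
final class LongKeyDuplicateReducer {
    private long[] keys = new long[16];        // key held by each slot
    private boolean[] used = new boolean[16];  // whether the slot is occupied
    private int size;                          // number of distinct keys stored

    // Mix the key's bits so that nearby keys spread across slots.
    private static int hash(long key) {
        long h = key * 0x9E3779B97F4A7C15L;
        return (int) (h ^ (h >>> 32));
    }

    // Copies the keys of batch[0..n) that were never seen before into out,
    // in first-seen order, and returns how many were copied.
    // out must have room for at least n keys.
    int addBatch(long[] batch, int n, long[] out) {
        int written = 0;
        for (int i = 0; i < n; i++) {
            if (add(batch[i])) {
                out[written++] = batch[i];
            }
        }
        return written;
    }

    // Returns true if the key was newly inserted, false if it was a duplicate.
    private boolean add(long key) {
        if (size * 2 >= keys.length) {
            grow();  // keep the table at most half full so probing always terminates
        }
        int mask = keys.length - 1;  // capacity is always a power of two
        int slot = hash(key) & mask;
        while (used[slot]) {
            if (keys[slot] == key) {
                return false;
            }
            slot = (slot + 1) & mask;  // linear probing
        }
        used[slot] = true;
        keys[slot] = key;
        size++;
        return true;
    }

    // Double the capacity and re-insert every stored key.
    private void grow() {
        long[] oldKeys = keys;
        boolean[] oldUsed = used;
        keys = new long[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        int mask = keys.length - 1;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                int slot = hash(oldKeys[i]) & mask;
                while (used[slot]) {
                    slot = (slot + 1) & mask;
                }
                used[slot] = true;
                keys[slot] = oldKeys[i];
            }
        }
    }
}
```

The reducer keeps its state across batches, so a key that appeared in an earlier batch is dropped when it shows up again in a later one.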
was:
Implement Native Vector GroupBy using fast hash table technology developed for Native Vector MapJoin, etc.
The patch is currently limited to a single Long key, aggregation on Long columns, and no more than 31 columns.
Three new classes are introduced that store the count in the slot table and do not allocate hash elements:
{noformat}
COUNT(column)      VectorGroupByHashOneLongKeyCountColumnOperator
COUNT(key)         VectorGroupByHashOneLongKeyCountKeyOperator
COUNT(*)           VectorGroupByHashOneLongKeyCountStarOperator
{noformat}
And a new class that aggregates a single Long key:
{noformat}
VectorGroupByHashOneLongKeyOperator
{noformat}
> Native Vector GroupBy
> ---------------------
>
> Key: HIVE-12369
> URL: https://issues.apache.org/jira/browse/HIVE-12369
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, HIVE-12369.05.patch, HIVE-12369.06.patch
>
>
> Implement Native Vector GroupBy using fast hash table technology developed for Native Vector MapJoin, etc.
> The patch is currently limited to a single Long key with a single COUNT aggregation, or a single Long key with no aggregation (also known as duplicate reduction).
> Three new classes are introduced that store the count in the slot table and do not allocate hash elements:
> {noformat}
> COUNT(column)      VectorGroupByHashLongKeyCountColumnOperator
> COUNT(key)         VectorGroupByHashLongKeyCountKeyOperator
> COUNT(*)           VectorGroupByHashLongKeyCountStarOperator
> {noformat}
> The duplicate reduction operator for a single Long key is:
> {noformat}
> VectorGroupByHashLongKeyDuplicateReductionOperator
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)