Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2013/01/07 12:42:14 UTC

[jira] [Commented] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

    [ https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545819#comment-13545819 ] 

Namit Jain commented on HIVE-3852:
----------------------------------

[~navis], I have a higher-level question.
Do we still need this optimization?
That is, is it really needed now that we have map-side aggregates, or can we remove this code completely?
                
> Multi-groupby optimization fails when same distinct column is used twice or more
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-3852
>                 URL: https://issues.apache.org/jira/browse/HIVE-3852
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-3852.D7737.1.patch
>
>
> {code}
> FROM INPUT
> INSERT OVERWRITE TABLE dest1 
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct substr(INPUT.value,5)) GROUP BY INPUT.key
> INSERT OVERWRITE TABLE dest2 
> SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct substr(INPUT.value,5)) GROUP BY INPUT.key;
> {code}
> The query above fails with the exception: FAILED: IndexOutOfBoundsException Index: 0,Size: 0
