Posted to issues@hive.apache.org by "John P. Petrakis (JIRA)" <ji...@apache.org> on 2015/11/19 19:36:11 UTC

[jira] [Commented] (HIVE-2750) Hive multi group by single reducer optimization causes invalid column reference error

    [ https://issues.apache.org/jira/browse/HIVE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014110#comment-15014110 ] 

John P. Petrakis commented on HIVE-2750:
----------------------------------------

Unless we turn off the group-by optimization, the query described in Jira 12412, which worked fine in Hive 1.0, fails in Hive 1.1 and later.
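
A minimal sketch of the workaround mentioned above, assuming the optimization in question is the one controlled by the hive.multigroupby.singlereducer property (the comment does not name the setting, so that is an assumption). The query is the one from the issue description quoted below, and the tables src, dest_g2 and dest_g3 are assumed to already exist:

```sql
-- Assumption: this property controls the multi group by single reducer optimization.
SET hive.multigroupby.singlereducer=false;

-- Query taken from the HIVE-2750 description, reflowed for readability.
FROM src
INSERT OVERWRITE TABLE dest_g2
  SELECT substr(src.key,1,1), count(DISTINCT src.key)
  WHERE substr(src.key,1,1) >= 5
  GROUP BY substr(src.key,1,1)
INSERT OVERWRITE TABLE dest_g3
  SELECT substr(src.key,1,1), count(DISTINCT src.key), count(src.value)
  WHERE substr(src.key,1,1) < 5
  GROUP BY substr(src.key,1,1);
```

Issued via SET, the change applies only to the current session, so it can be limited to the affected queries rather than disabled globally.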

> Hive multi group by single reducer optimization causes invalid column reference error
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-2750
>                 URL: https://issues.apache.org/jira/browse/HIVE-2750
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>             Fix For: 0.9.0
>
>         Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2750.D1455.1.patch
>
>
> After the optimization, if two query blocks have the same distinct clause and the same group by keys, but the first query block does not reference all the columns the second query block does, an invalid column reference error is raised for the columns unreferenced in the first query block.
> E.g.
> FROM src
> INSERT OVERWRITE TABLE dest_g2 SELECT substr(src.key,1,1), count(DISTINCT src.key) WHERE substr(src.key,1,1) >= 5 GROUP BY substr(src.key,1,1)
> INSERT OVERWRITE TABLE dest_g3 SELECT substr(src.key,1,1), count(DISTINCT src.key), count(src.value) WHERE substr(src.key,1,1) < 5 GROUP BY substr(src.key,1,1);
> This results in an invalid column reference error on src.value.


