
Posted to common-dev@hadoop.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2008/09/11 03:32:44 UTC

[jira] Created: (HADOOP-4156) [hive] duplicate expression elimination for group by stage 1

[hive] duplicate expression elimination for group by stage 1
------------------------------------------------------------

                 Key: HADOOP-4156
                 URL: https://issues.apache.org/jira/browse/HADOOP-4156
             Project: Hadoop Core
          Issue Type: Bug
            Reporter: Namit Jain


In the first job we evaluate all the input columns, all the group by clause expressions, and the parameters to all the aggregation functions. Not all duplicates among these are eliminated, because we treat expression resolution and column resolution differently.

Consider the following:

src(key, value)

select src.key, sum(src.value) from src group by src.key;

Both src.key and src.value will be added twice: once from src's row resolver, and once from the group by expression or the aggregation parameter, respectively. Both entries are currently needed because the filter looks at (table, column) in the row resolver, whereas the group by expression looks
at (", COLREF table column)
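
To make the duplication concrete, here is a minimal sketch in Python. It is not Hive's code: the key shapes, the function names (column_key, expression_key, normalize, collect, dedupe), and the idea of normalizing simple column references to (table, column) are all assumptions made for illustration. It only shows why two different key forms hide duplicates, and how a shared canonical key would expose them.

```python
from typing import List, Tuple

Key = Tuple[str, str]


def column_key(table: str, column: str) -> Key:
    # Row-resolver style key, as described above: (table, column).
    return (table, column)


def expression_key(expr_text: str) -> Key:
    # Hypothetical stand-in for the expression-style key: empty table
    # alias plus the expression text. The real key format may differ.
    return ("", expr_text)


def normalize(key: Key) -> Key:
    # Assumption: an expression that is just "table.column" can be
    # rewritten to the row-resolver form. Anything else is left alone.
    table, name = key
    if table == "" and name.count(".") == 1:
        t, c = name.split(".")
        if t.isidentifier() and c.isidentifier():
            return (t, c)
    return key


def collect(input_columns: List[Tuple[str, str]],
            group_by_exprs: List[str],
            agg_params: List[str]) -> List[Key]:
    # Stage 1 gathers input columns, group by expressions, and
    # aggregation parameters, each under its own key form.
    keys = [column_key(t, c) for t, c in input_columns]
    keys += [expression_key(e) for e in group_by_exprs]
    keys += [expression_key(e) for e in agg_params]
    return keys


def dedupe(keys: List[Key]) -> List[Key]:
    # Keep the first occurrence of each normalized key, in order.
    seen = set()
    result = []
    for key in keys:
        canonical = normalize(key)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


# src(key, value); select src.key, sum(src.value) ... group by src.key
keys = collect([("src", "key"), ("src", "value")], ["src.key"], ["src.value"])
print(len(keys), len(set(keys)))  # four entries, none equal as raw keys
print(dedupe(keys))               # two entries after normalization
```

In this sketch the raw key list has four distinct entries even though only two columns are involved; comparing normalized keys reduces it to two. Whether such a normalization is safe for the filter and group by lookups is exactly the open question in this issue.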

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Assigned: (HADOOP-4156) [hive] duplicate expression elimination for group by stage 1

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-4156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HADOOP-4156:
----------------------------------

    Assignee: Namit Jain

