Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/01/06 00:59:44 UTC
[jira] Updated: (PIG-580) PERFORMANCE: Combiner should also be used
when there are distinct aggregates in a foreach following a group provided
there are no non-algebraics in the foreach
[ https://issues.apache.org/jira/browse/PIG-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-580:
-------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Patch committed; thanks, Pradeep.
> PERFORMANCE: Combiner should also be used when there are distinct aggregates in a foreach following a group provided there are no non-algebraics in the foreach
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PIG-580
> URL: https://issues.apache.org/jira/browse/PIG-580
> Project: Pig
> Issue Type: Improvement
> Affects Versions: types_branch
> Reporter: Pradeep Kamath
> Assignee: Pradeep Kamath
> Fix For: types_branch
>
> Attachments: PIG-580-v2.patch, PIG-580.patch
>
>
> Currently Pig uses the combiner only when a foreach follows a group and every element in the foreach's generate is one of the following:
> 1) simple project of the "group" column
> 2) Algebraic UDF
> The above conditions exclude use of the combiner for distinct aggregates, even though the distinct operation itself is combinable (irrespective of whether it feeds an algebraic or non-algebraic UDF). So the following foreach should also be combinable:
> {code}
> ..
> b = group a by $0;
> c = foreach b { x = distinct a; generate group, COUNT(x), SUM(x.$1); };
> {code}
> The combiner optimizer should cause the distinct to be combined and the final combine output should feed the COUNT() and SUM() in the reduce.
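
The requested behaviour can be illustrated outside Pig with a small Python simulation. This is only a sketch under assumptions, not the code in the PIG-580 patch: the function names (combine_distinct, reduce_distinct_aggregates, run_job) are hypothetical, records are modelled as (key, value) tuples with $0 as the group key and $1 numeric, and each input split stands in for one map task.

```python
from collections import defaultdict


def combine_distinct(records):
    """Map-side combine (hypothetical helper): keep one copy of each
    distinct tuple per group key within a single map task's output."""
    partial = defaultdict(set)
    for rec in records:
        partial[rec[0]].add(rec)  # group by $0, deduplicate whole tuples
    return partial


def reduce_distinct_aggregates(partials):
    """Reduce side (hypothetical helper): merge the partial distinct sets,
    then feed the final distinct bag to COUNT(x) and SUM(x.$1)."""
    merged = defaultdict(set)
    for partial in partials:
        for key, tuples in partial.items():
            merged[key] |= tuples  # distinct of a union of distinct sets
    return {
        key: (len(tuples), sum(t[1] for t in tuples))
        for key, tuples in merged.items()
    }


def run_job(splits):
    """Run the combiner once per split, then a single reduce."""
    return reduce_distinct_aggregates(combine_distinct(s) for s in splits)
```

For example, run_job([[("a", 1), ("a", 1), ("b", 2)], [("a", 1), ("a", 3), ("b", 2)]]) yields {"a": (2, 4), "b": (1, 2)}, the same result as processing all six records in one split. The combiner only reduces the number of duplicate tuples shipped to the reducer; it does not change the final aggregates.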