Posted to dev@pig.apache.org by "Mridul Muralidharan (JIRA)" <ji...@apache.org> on 2010/08/01 12:51:18 UTC

[jira] Commented: (PIG-1530) PIG Logical Optimization: Push LOFilter above LOCogroup

    [ https://issues.apache.org/jira/browse/PIG-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894368#action_12894368 ] 

Mridul Muralidharan commented on PIG-1530:
------------------------------------------

This looks more like a developer coding issue: the filter does not depend on the cogroup in any way, and this is a fairly specific pattern, IMO. The filters can simply be written before the cogroup:


A = load '<any file>' USING PigStorage(',') as (a1:int,a2:int,a3:int);
B = load '<any file>' USING PigStorage(',') as (b1:int,b2:int,b3:int);
A1 = Filter A by a1 + 5 > a2;
B1 = Filter B by b1 + 5 > b2;

-- And the rest, including a possible cogroup.
G = COGROUP A1 by (a1,a2) , B1 by (b1,b2);
explain G;
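
As a sanity check on why the rewrite above gives the same result, here is a toy Python model. It is not Pig code and not how Pig implements COGROUP. It assumes non-null integer keys and a plain (non-INNER) cogroup, so it says nothing about the NULL cases raised in the issue description. The names cogroup, key_pred, filter_after and filter_before, and the sample rows, are made up for this sketch.

```python
from collections import defaultdict


def cogroup(a_rows, b_rows):
    """Toy COGROUP on the first two columns: key -> (bag of A rows, bag of B rows)."""
    groups = defaultdict(lambda: ([], []))
    for row in a_rows:
        groups[(row[0], row[1])][0].append(row)
    for row in b_rows:
        groups[(row[0], row[1])][1].append(row)
    return dict(groups)


def key_pred(k0, k1):
    # Same condition as in the scripts: $0 + 5 > $1
    return k0 + 5 > k1


def filter_after(a_rows, b_rows):
    # Plan from the issue description: cogroup first, then filter on the group key.
    return {k: v for k, v in cogroup(a_rows, b_rows).items() if key_pred(*k)}


def filter_before(a_rows, b_rows):
    # Hand-rewritten plan: filter each input on its own key columns, then cogroup.
    a1 = [r for r in a_rows if key_pred(r[0], r[1])]
    b1 = [r for r in b_rows if key_pred(r[0], r[1])]
    return cogroup(a1, b1)


# Hypothetical sample rows: (key0, key1, payload)
A = [(1, 2, 10), (1, 9, 11), (4, 4, 12)]
B = [(1, 2, 20), (0, 7, 21), (3, 3, 22)]

same = filter_after(A, B) == filter_before(A, B)
print(same)
```

Because the predicate only reads the group key, every row in a group passes or fails together, which is why the two plans agree on this input.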


>  PIG Logical Optimization: Push LOFilter above LOCogroup
> --------------------------------------------------------
>
>                 Key: PIG-1530
>                 URL: https://issues.apache.org/jira/browse/PIG-1530
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Swati Jain
>            Assignee: Swati Jain
>            Priority: Minor
>             Fix For: 0.8.0
>
>
> Consider the following:
> {noformat}
> A = load '<any file>' USING PigStorage(',') as (a1:int,a2:int,a3:int);
> B = load '<any file>' USING PigStorage(',') as (b1:int,b2:int,b3:int);
> G = COGROUP A by (a1,a2) , B by (b1,b2);
> D = Filter G by group.$0 + 5 > group.$1;
> explain D;
> {noformat}
> In the above example, LOFilter can be pushed above LOCogroup. Note there are some tricky NULL issues to think about when the Cogroup is not of type INNER (Similar to issues that need to be thought through when pushing LOFilter on the right side of a LeftOuterJoin).
> Also note that typically the LOFilter in user programs will be below a ForEach-Cogroup pair. To make this really useful, we need to also implement LOFilter pushed across ForEach. 
