Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2009/11/12 05:17:39 UTC

[jira] Assigned: (PIG-429) Self join with implicit split has the join output in wrong order

     [ https://issues.apache.org/jira/browse/PIG-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reassigned PIG-429:
------------------------------

    Assignee: Pradeep Kamath

> Self join with implicit split has the join output in wrong order
> ----------------------------------------------------------------
>
>                 Key: PIG-429
>                 URL: https://issues.apache.org/jira/browse/PIG-429
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: 0.2.0
>
>         Attachments: PIG-429.patch
>
>
> Query:
> {code}
> A = load 'st10k' split by 'file';
> B = filter A by $1 > 25;
> D = join A by $0, B by $0;
> dump D;
> {code}
> In the output, the columns from B are projected first and the columns from A next. On closer examination of the code, the ImplicitSplitInserter class adds a split operator and two splitoutput operators to the plan and tries to connect the successors of the load to them. It does this by iterating over the load's successors, disconnecting each one from the load, and connecting the split-splitoutput chain to that successor instead. However, the order in which it gets the successors is NOT the same as the order in which cogroup (join) expects its inputs, hence the discrepancy.
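
The reordering described above can be reproduced with a small toy model. This is only a sketch in Python, not Pig's Java code: the `Plan` class, the node names, and both `insert_split_*` functions are hypothetical, and the sketch assumes that reconnecting an edge appends the new input at the end of the consumer's input list. Under that assumption, rewiring the load's successors one by one moves A's branch behind B in the join's inputs. The second function shows one possible remedy (replacing the input in place), which is not necessarily what PIG-429.patch does.

```python
class Plan:
    """Toy operator graph with ordered successor and predecessor lists.

    Hypothetical stand-in for a logical plan; not Pig's actual classes.
    """

    def __init__(self):
        self.succ = {}
        self.pred = {}

    def add(self, node):
        self.succ.setdefault(node, [])
        self.pred.setdefault(node, [])

    def connect(self, src, dst):
        self.add(src)
        self.add(dst)
        self.succ[src].append(dst)
        # Assumption: a newly connected input always lands at the end.
        self.pred[dst].append(src)

    def disconnect(self, src, dst):
        self.succ[src].remove(dst)
        self.pred[dst].remove(src)


def build_self_join_plan():
    """A = load; B = filter A; D = join A, B (A is input 0, B is input 1)."""
    plan = Plan()
    plan.connect("load_A", "filter_B")
    plan.connect("load_A", "join_D")
    plan.connect("filter_B", "join_D")
    return plan


def insert_split_naive(plan, load):
    """Disconnect every successor of the load, then reconnect via the split."""
    successors = list(plan.succ[load])
    for s in successors:
        plan.disconnect(load, s)
    plan.connect(load, "split")
    for i, s in enumerate(successors):
        out = f"split_output_{i}"
        plan.connect("split", out)
        plan.connect(out, s)  # appended after any inputs s still has


def insert_split_position_preserving(plan, load):
    """Swap the load for a split output at the same input position."""
    successors = list(plan.succ[load])
    plan.succ[load] = []
    plan.connect(load, "split")
    for i, s in enumerate(successors):
        out = f"split_output_{i}"
        plan.connect("split", out)
        plan.succ[out].append(s)
        idx = plan.pred[s].index(load)
        plan.pred[s][idx] = out  # keep the original input slot


if __name__ == "__main__":
    naive = build_self_join_plan()
    insert_split_naive(naive, "load_A")
    print("naive join inputs:     ", naive.pred["join_D"])

    fixed = build_self_join_plan()
    insert_split_position_preserving(fixed, "load_A")
    print("preserving join inputs:", fixed.pred["join_D"])
```

In the naive variant the join ends up with the filter branch (B) as its first input and the split output that stands in for A as its second, which matches the symptom in the report: B's columns come out before A's.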
