Posted to user@pig.apache.org by Dmitriy Ryaboy <dv...@gmail.com> on 2009/04/11 05:32:24 UTC

Tuple ordering after a group-by

Hi,
Is there any contract regarding the ordering of tuples inside a group
after a Group By operation?

Meaning, are both of these outcomes possible:

(foo, {(foo, bar, baz), (foo, fie, foe)})
and
(foo, {(foo, fie, foe), (foo, bar, baz)})

?

Thanks,

-Dmitriy

Re: Tuple ordering after a group-by

Posted by Alan Gates <ga...@yahoo-inc.com>.
No, there is never an ordering guarantee on the tuples in a bag except
immediately after an ORDER BY is done on that bag, so both outcomes are
possible.  If you need the tuples ordered after the group by, order them:

B = GROUP A BY $0;
C = FOREACH B {
        C1 = ORDER A BY $0; -- or whatever column in A you need it ordered by
        GENERATE ...
}
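
For illustration, a complete script along these lines might look like
the sketch below. The input path 'input.txt', the field names f1, f2
and f3, and the choice of f2 as the sort column are assumptions made
for the example, not part of the original thread. It assumes a
tab-delimited input file with three string columns.

```pig
-- Hypothetical input: tab-delimited file with three string columns.
A = LOAD 'input.txt' AS (f1:chararray, f2:chararray, f3:chararray);

-- Group on the first column; the bag named A inside each group is unordered.
B = GROUP A BY f1;

C = FOREACH B {
        -- Sort the tuples of this group's bag by the second column.
        C1 = ORDER A BY f2;
        -- Emit the group key together with the freshly sorted bag.
        GENERATE group, C1;
};

DUMP C;
```

The sorted bag is generated directly from the nested ORDER BY, since
the ordering is only guaranteed immediately after that statement.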

Alan.
