Posted to user@pig.apache.org by Renato Marroquín Mogrovejo <re...@gmail.com> on 2010/07/22 03:33:00 UTC

Sorting a tuple's content

Hey everybody, does anybody know how I can sort a tuple's content?
For example, I have (770001,880001,990001,770001) and I would like to obtain
(770001,770001,880001,990001). I tried grouping by the first field, but
I still get the whole tuple back in the resulting bag.
Thanks in advance.

Renato M.

Re: Sorting a tuple's content

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
No, it will not cause an extra MapReduce job.

On Sun, Jul 25, 2010 at 9:58 PM, Renato Marroquín Mogrovejo <
renatoj.marroquin@gmail.com> wrote:

> Wouldn't that add extra overhead to the process? I mean, won't an extra
> FOREACH ... GENERATE cause an extra MapReduce job to be generated?
>
> Renato M.

Re: Sorting a tuple's content

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Wouldn't that add extra overhead to the process? I mean, won't an extra
FOREACH ... GENERATE cause an extra MapReduce job to be generated?

Renato M.

2010/7/25 Jai Krishna <rj...@yahoo-inc.com>

> Ok. That helps.
> So for this situation, we need not write a new UDF; we can just use
> FOREACH...GENERATE to rearrange the tuple columns.
>
> -RJK

Re: Sorting a tuple's content

Posted by Jai Krishna <rj...@yahoo-inc.com>.
Ok. That helps.
So for this situation, we need not write a new UDF; we can just use FOREACH...GENERATE to rearrange the tuple columns.

-RJK


On 7/23/10 1:13 PM, "Harsh J" <qw...@gmail.com> wrote:

Yes, that _will_ guarantee that the ordering is what you've specified.

Re: Sorting a tuple's content

Posted by Harsh J <qw...@gmail.com>.
Yes, that _will_ guarantee that the ordering is what you've specified.

On Fri, Jul 23, 2010 at 11:33 AM, Jai Krishna <rj...@yahoo-inc.com> wrote:
> So a question on a related note, is there any correlation between the way the tuple was constructed and the order of items in a Tuple?
>
> E.g.
>
> FOREACH foo GENERATE $1, $2, $3, $4
>
> Would that guarantee (or not) that the ordering inside the tuple would also be ($1, $2, $3, $4)?
>
> -RJK
>
> P.S.: I'm new to Pig, so I'm probably missing many things.

-- 
Harsh J
www.harshj.com

Re: Sorting a tuple's content

Posted by Jai Krishna <rj...@yahoo-inc.com>.
So a question on a related note, is there any correlation between the way the tuple was constructed and the order of items in a Tuple?

E.g.

FOREACH foo GENERATE $1, $2, $3, $4

Would that guarantee (or not) that the ordering inside the tuple would also be ($1, $2, $3, $4)?

-RJK

P.S.: I'm new to Pig, so I'm probably missing many things.

On 7/22/10 11:56 PM, "Renato Marroquín Mogrovejo" <re...@gmail.com> wrote:

Thanks, Dmitriy. I will write my own then.

Renato M.

Re: Sorting a tuple's content

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Thanks, Dmitriy. I will write my own then.

Renato M.

2010/7/21 Dmitriy Ryaboy <dv...@gmail.com>

> That has to be a UDF; there is nothing built in for this.

Re: Sorting a tuple's content

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
That has to be a UDF; there is nothing built in for this.

On Wed, Jul 21, 2010 at 6:33 PM, Renato Marroquín Mogrovejo <
renatoj.marroquin@gmail.com> wrote:

> Hey everybody, does anybody know how I can sort a tuple's content?
> For example, I have (770001,880001,990001,770001) and I would like to
> obtain (770001,770001,880001,990001). I tried grouping by the first field,
> but I still get the whole tuple back in the resulting bag.
> Thanks in advance.
>
> Renato M.
>
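
For readers wondering what the core of such a UDF might look like, here is a
minimal sketch in Java. It is not code from this thread. To keep it runnable
without Pig on the classpath, the tuple is modelled as a plain List<Integer>;
the class name TupleFieldSorter and the method sortFields are hypothetical.
In an actual UDF, this logic would presumably sit inside the exec method of a
class extending org.apache.pig.EvalFunc<Tuple>, which is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: stands in for the body of a Pig UDF's exec method.
public class TupleFieldSorter {

    // Returns a new list holding the fields in ascending order.
    // The input list is copied first, so the caller's data is left untouched.
    public static List<Integer> sortFields(List<Integer> fields) {
        List<Integer> sorted = new ArrayList<>(fields);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // The tuple from the original question, duplicates included.
        List<Integer> fields = Arrays.asList(770001, 880001, 990001, 770001);
        System.out.println(sortFields(fields));
        // prints [770001, 770001, 880001, 990001]
    }
}
```

The sketch assumes all fields share one comparable type; a tuple with mixed or
null fields would need an explicit comparator.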