Posted to user@pig.apache.org by Alex McLintock <al...@gmail.com> on 2011/02/07 20:30:57 UTC

Some basic ideas

A)
Am I right in thinking that no UDF can turn

(1, (2,3,4) )

into

(1, 2 )
(1, 3 )
(1, 4 )

because you always get out the same number of tuples as you put in?


B)
Would FLATTEN($1) do that if the (2,3,4) were a bag and not a tuple?

I'm quite confused as to when bags get created and why they don't seem to be
top level data structures...

Alex

Re: Some basic ideas

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Alex,
In your UDF you can turn (1, (2,3,4)) into the bag { (1, 2), (1, 3), (1, 4) },

which you can then FLATTEN to get

(1, 2)
(1, 3)
(1, 4)

Does that make sense?
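
A minimal Pig Latin sketch of the approach described above. The jar name
(myudfs.jar), the UDF name (myudfs.ExpandTuple), the input file, and the
schema are all hypothetical placeholders, not something given in this thread.
The UDF is assumed to be an EvalFunc that returns a bag.

```pig
-- Hypothetical jar and class names; the UDF itself is not shown here.
REGISTER myudfs.jar;

-- Assumed input line (tab-separated): 1	(2,3,4)
A = LOAD 'tuples.txt' AS (id:int, vals:tuple(a:int, b:int, c:int));

-- ExpandTuple is assumed to return the bag {(1,2),(1,3),(1,4)}
-- when given 1 and (2,3,4). FLATTEN then un-nests that bag, so each
-- inner tuple becomes its own output record.
B = FOREACH A GENERATE FLATTEN(myudfs.ExpandTuple(id, vals));

DUMP B;
```

The UDF still returns exactly one value per input tuple (a single bag). It is
the FLATTEN in the FOREACH that turns one input record into several output
records.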

On Mon, Feb 7, 2011 at 12:36 PM, Jonathan Coveney <jc...@gmail.com> wrote:
> I'll let someone else weigh in on B, but I'm pretty sure you can do what you
> want. Just write an EvalFunc that returns a DataBag, and then FLATTEN its result.

Re: Some basic ideas

Posted by Jonathan Coveney <jc...@gmail.com>.
I'll let someone else weigh in on B, but I'm pretty sure you can do what you
want. Just write an EvalFunc that returns a DataBag, and then FLATTEN its result.
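
On B, a short sketch of what FLATTEN does when the second field is already a
bag. The file name and schema are assumptions made for illustration.

```pig
-- Assumed input line (tab-separated): 1	{(2),(3),(4)}
A = LOAD 'bags.txt' AS (id:int, vals:bag{t:tuple(v:int)});

-- FLATTEN on a bag emits one output tuple per element of the bag,
-- each paired with the other generated fields (here, id).
B = FOREACH A GENERATE id, FLATTEN(vals);

DUMP B;
```

With a bag, this should give one (id, v) record per element. With a tuple
instead of a bag, FLATTEN only promotes the tuple's fields to the top level,
so (1, (2,3,4)) would become the single record (1, 2, 3, 4).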
