Posted to user@pig.apache.org by Terry Siu <Te...@datasphere.com> on 2012/09/18 19:11:35 UTC

Replacing elements in a bag via a join

Hi,

I'm having a little difficulty figuring this out. I have the following data in Pig, let's say aliased to 'data':

(a1,b1,{(id1),(id2)})
(a2,b2,{(id3)})
(a3,b3,{(id2),(id4)})

Each tuple above has an associated bag of tuples that are lookup ids. In addition to the 'data' above, I also load the lookup ids and their corresponding values, which I'll alias to 'lookup':

(id1,Hello)
(id2,World)
(id3,foo)
(id4,bar)

Is it possible, from the two inputs above, to create the following:

(a1,b1,{(Hello),(World)})
(a2,b2,{(foo)})
(a3,b3,{(World),(bar)})

Thanks,
-Terry

Re: Replacing elements in a bag via a join

Posted by Ruslan Al-Fakikh <me...@gmail.com>.
Hi Terry,

It looks like you should FLATTEN the data relation first, so that your
ids are no longer nested (or just remove the GROUP statement, if that
is what produced the bags), and then join like this:
joined = JOIN dataFlattened BY id, lookup BY id USING 'replicated';
(the replicated join is recommended if your lookup relation is small)
After that you can add another FOREACH to drop the ids and another
GROUP to rebuild the bags.
Maybe there are other, better approaches.
Let me know if you run into problems with this approach.

Ruslan
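
The steps suggested above can be sketched end to end as follows. This is only a minimal sketch: the load paths, the field names (a, b, ids, id, val) and the intermediate aliases (projected, grouped, result) are assumptions made for illustration, since the thread gives only the sample rows and the aliases 'data', 'lookup', 'dataFlattened' and 'joined'.

```pig
-- Assumed schemas and paths; the original post does not specify them.
data   = LOAD 'data'   AS (a:chararray, b:chararray, ids:bag{t:tuple(id:chararray)});
lookup = LOAD 'lookup' AS (id:chararray, val:chararray);

-- Un-nest the bag: one row per (a, b, id).
dataFlattened = FOREACH data GENERATE a, b, FLATTEN(ids) AS id;

-- Replicated join: the small relation (lookup) is listed last
-- and is held in memory on each map task.
joined = JOIN dataFlattened BY id, lookup BY id USING 'replicated';

-- Drop the ids and keep only the looked-up values.
projected = FOREACH joined GENERATE dataFlattened::a AS a,
                                    dataFlattened::b AS b,
                                    lookup::val      AS val;

-- Regroup on the original key fields to rebuild one bag per (a, b).
grouped = GROUP projected BY (a, b);
result  = FOREACH grouped GENERATE FLATTEN(group) AS (a, b), projected.val AS vals;
```

Two things to keep in mind with this approach: the inner join drops any id that has no match in 'lookup' (and FLATTEN drops rows whose bag is empty), and the order of values inside each rebuilt bag is not guaranteed to match the order of the original ids.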
