Posted to user@pig.apache.org by Richipal Singh <ri...@gmail.com> on 2012/09/27 22:23:18 UTC

Pig split and join

I need to propagate a field value from one row to the other rows that share
the same id, based on the type of record. For example, my raw input is:

1,firefox,p
1,,q
1,,r
1,,s
2,ie,p
2,,s
3,chrome,p
3,,r
3,,s
4,netscape,p

The desired result is:

1,firefox,p
1,firefox,q
1,firefox,r
1,firefox,s
2,ie,p
2,ie,s
3,chrome,p
3,chrome,r
3,chrome,s
4,netscape,p

I tried:

A = LOAD 'file1.txt' using PigStorage(',') AS
(id:int,browser:chararray,type:chararray);
SPLIT A INTO B IF (type =='p'), C IF (type!='p' );
joined =  JOIN B BY id FULL, C BY id;
joinedFields = FOREACH joined GENERATE  B::id,  B::type, B::browser,
C::id, C::type;
dump joinedFields;

The result I got was:

(,,,1,p  )
(,,,1,q)
(,,,1,r)
(,,,1,s)
(2,p,ie,2,s)
(3,p,chrome,3,r)
(3,p,chrome,3,s)
(4,p,netscape,,)

Any help is appreciated. Thanks.

--
Richipal Singh

Re: Pig split and join

Posted by Richipal Singh <ri...@gmail.com>.
Thank you, Cheolsoo. This worked; I really appreciate your time and help.

--
Richipal Singh



On Thu, Sep 27, 2012 at 5:11 PM, Cheolsoo Park <ch...@cloudera.com> wrote:

> Hi Richipal,
>
> Please try this:
>
> a = LOAD '1.txt' USING PigStorage(',') AS
> (id:int,browser:chararray,type:chararray);
> b = FOREACH a GENERATE $0, $1;
> c = FILTER b by ($1 is not null);
> d = FOREACH a GENERATE $0, $2;
> e = JOIN c by id, d by id;
> f = FOREACH e GENERATE $0, $1, $3;
> dump f;
>
> This returns:
>
> (1,firefox,p)
> (1,firefox,q)
> (1,firefox,r)
> (1,firefox,s)
> (2,ie,p)
> (2,ie,s)
> (3,chrome,p)
> (3,chrome,r)
> (3,chrome,s)
> (4,netscape,p)
>
> Thanks,
> Cheolsoo

Re: Pig split and join

Posted by Cheolsoo Park <ch...@cloudera.com>.
Hi Richipal,

Please try this:

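-- Load the raw records; empty browser fields are read as null.
a = LOAD '1.txt' USING PigStorage(',') AS
(id:int,browser:chararray,type:chararray);
-- c: the (id, browser) pairs taken from rows where browser is set.
b = FOREACH a GENERATE $0, $1;
c = FILTER b by ($1 is not null);
-- d: every (id, type) pair from the input.
d = FOREACH a GENERATE $0, $2;
-- Attach the browser to every row that has the same id.
e = JOIN c by id, d by id;
f = FOREACH e GENERATE $0, $1, $3;
dump f;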

This returns:

(1,firefox,p)
(1,firefox,q)
(1,firefox,r)
(1,firefox,s)
(2,ie,p)
(2,ie,s)
(3,chrome,p)
(3,chrome,r)
(3,chrome,s)
(4,netscape,p)
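
For readers who want to see why the filter-and-join above fills in the browser, here is a minimal plain-Python sketch of the same logic. It is only a model of the approach, not Pig itself: the function name propagate_browser and the in-memory rows list are assumptions made for illustration, and None stands in for the null that an empty field becomes when loaded.

```python
# Plain-Python model of the filter-and-join approach (illustrative, not Pig).
# The sample rows mirror the input from the question; None models a null field.
rows = [
    (1, "firefox", "p"), (1, None, "q"), (1, None, "r"), (1, None, "s"),
    (2, "ie", "p"), (2, None, "s"),
    (3, "chrome", "p"), (3, None, "r"), (3, None, "s"),
    (4, "netscape", "p"),
]


def propagate_browser(rows):
    """Return (id, browser, type) rows with the browser filled in per id."""
    # Relation c: (id, browser) pairs where the browser is not null.
    c = [(rid, browser) for rid, browser, _ in rows if browser is not None]
    # Relation d: every (id, type) pair from the input.
    d = [(rid, rtype) for rid, _, rtype in rows]
    # Inner join of c and d on id, projected to (id, browser, type).
    joined = [
        (cid, browser, rtype)
        for cid, browser in c
        for did, rtype in d
        if cid == did
    ]
    # Sorted only to make the printed order deterministic.
    return sorted(joined)


result = propagate_browser(rows)
for row in result:
    print(row)
```

Because this is an inner join, an id that has no row with a browser drops out of the result in this model, and an id with more than one browser row would produce duplicated rows, so the sketch assumes exactly one populated browser per id, as in the sample data.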

Thanks,
Cheolsoo
