Posted to user@pig.apache.org by Ramana Venkata <ra...@ohana-media.com> on 2010/09/09 11:45:39 UTC

How to find the difference between two columns in two relations

Hi,
  I have two files, loaded as two relations A and B, as follows:

File1.txt
--------------
ramana
krishna
siva
venkat

File2.txt
---------------
krishna
venkat
kishore
basha

These two files are loaded into two relations, A and B.
The output should be the difference of the two, i.e. B - A,
meaning the users that were added in B and are not present in A.

output
---------------------
kishore
basha


How can we write the Pig script for this operation?

Re: How to find the difference between two columns in two relations

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
-- B - A: keep the rows of B that have no match in A
C = JOIN B BY name LEFT OUTER, A BY name;
D = FILTER C BY A::name IS NULL;
E = FOREACH D GENERATE B::name;

Re: How to find the difference between two columns in two relations

Posted by "Ankur C. Goel" <ga...@yahoo-inc.com>.
Probably a bit late, but I didn't see any replies to this, so here it is:

A = LOAD 'A' as (name:chararray);
B = LOAD 'B' as (name:chararray);
C = COGROUP B BY name, A BY name;
D = FILTER C BY SIZE(A) == 0;
E = FOREACH D GENERATE FLATTEN(B);
DUMP E;
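
For the sample data in the original question, the COGROUP approach could be traced roughly as in the sketch below. It assumes the two files are saved as File1.txt (relation A) and File2.txt (relation B) in the working directory; those paths are assumptions for illustration, as is the use of the built-in IsEmpty in place of SIZE(A) == 0. The rows shown in the comments are what the sample data should produce, though the order of rows in the dumped output may differ.

```pig
-- Assumed paths: File1.txt holds relation A, File2.txt holds relation B.
A = LOAD 'File1.txt' AS (name:chararray);
B = LOAD 'File2.txt' AS (name:chararray);

-- One row per distinct name: (group, B, A), where B and A are bags
-- holding the matching tuples from each relation.
C = COGROUP B BY name, A BY name;
-- C: (basha,{(basha)},{})
--    (kishore,{(kishore)},{})
--    (krishna,{(krishna)},{(krishna)})
--    (ramana,{},{(ramana)})
--    (siva,{},{(siva)})
--    (venkat,{(venkat)},{(venkat)})

-- Keep only the names that have no matching tuple in A.
D = FILTER C BY IsEmpty(A);
-- D: (basha,{(basha)},{})
--    (kishore,{(kishore)},{})

-- Unnest the B bag to get back one plain tuple per user.
E = FOREACH D GENERATE FLATTEN(B);
-- E: (basha)
--    (kishore)
DUMP E;
```

Because FLATTEN(B) emits every tuple in the bag, a name that appears several times in B and never in A would appear several times in E; a DISTINCT on E would collapse those if only unique names are wanted.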


