Posted to dev@pig.apache.org by Jonathan Coveney <jc...@gmail.com> on 2012/06/21 20:29:16 UTC
Possible bug in replicated join?
I'm posting before making a ticket just to make sure I'm not doing something
stupid or missing something obvious.
$ cat data
1
2
3
4
5
a = load 'data' as (x:int);
b = foreach a generate TOTUPLE(x);
c = load 'data' as (x:int);
d = foreach c generate TOTUPLE(x);
e = join b by $0, d by $0;
dump e;
((1),(1))
((2),(2))
((3),(3))
((4),(4))
((5),(5))
ok....
but
f = join b by $0, d by $0 using 'replicated';
dump f;
(1,1)
(2,2)
(3,3)
(4,4)
(5,5)
!!!!
Re: Possible bug in replicated join?
Posted by Thejas Nair <th...@hortonworks.com>.
That certainly looks like a bug. The replicated join should not flatten
the tuple.
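One way to narrow this down, sketched here with the same relations as the original script, is to ask Pig for the schema of each join result. `describe` reports the schema from the logical plan. If it reports the same schema for both relations while the dumped rows differ, that would suggest the flattening happens when the replicated join executes rather than when the plan is built. This is an untested sketch, not a confirmed diagnosis.

```pig
-- Sketch only: reuses the relations and the 'data' file from the original post.
a = load 'data' as (x:int);
b = foreach a generate TOTUPLE(x);
c = load 'data' as (x:int);
d = foreach c generate TOTUPLE(x);

-- Default join and replicated join on the same tuple-valued key.
e = join b by $0, d by $0;
f = join b by $0, d by $0 using 'replicated';

-- Print the schema Pig has for each result so the two can be compared.
describe e;
describe f;
```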
I didn't actually know that Pig supported doing joins on tuples (I guess
it does not allow that on maps and bags).
-Thejas