Posted to user@pig.apache.org by yonghu <yo...@gmail.com> on 2012/10/07 21:00:16 UTC

A question of pig default load function

Dear all,

When I load data stored in a txt file into two bags, e.g.
{(1),(2),(1)}	{(1),(3)} using :
A = LOAD '/home/hans/test/r1.txt' AS
(b1:bag{t1:tuple(f1:int)},b2:bag{t2:tuple(f2:int)});
everything works fine.

But if I load the data in the format
{(1),(2),(1)},{(1),(3)} using:
A = LOAD '/home/hans/test/r1.txt' AS
(b1:bag{t1:tuple(f1:int)},b2:bag{t2:tuple(f2:int)});
I only get the first part ({(1),(2),(1)},). The second bag is lost. Can
anyone tell me why?

regards!

Yong

Re: A question of pig default load function

Posted by Prashant Kommireddi <pr...@gmail.com>.
The default loader is PigStorage, which uses tab ('\t') as the field
delimiter. Your 2nd example has no tab in the line, so the whole line is
read as a single field. You need to specify the comma as the delimiter
explicitly (load 'foo' using PigStorage(',') as ...)
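
To make the delimiter point concrete, here is a rough sketch in Python. It is only a plain string split, not Pig's actual PigStorage parser, and the helper name split_fields is made up for illustration. It shows how many fields each input line yields for a given delimiter.

```python
# Illustrative only: a plain string split, NOT Pig's PigStorage parser.
# split_fields is a hypothetical helper introduced for this sketch.
def split_fields(line, delimiter="\t"):
    """Split one input line into fields on a single-character delimiter."""
    return line.rstrip("\n").split(delimiter)


tab_line = "{(1),(2),(1)}\t{(1),(3)}"    # first format from the question
comma_line = "{(1),(2),(1)},{(1),(3)}"   # second format from the question

# Tab-delimited input splits into two fields, one per bag.
print(split_fields(tab_line))

# Comma-delimited input contains no tab, so a tab split yields one field.
print(split_fields(comma_line))

# A naive comma split also cuts at the commas inside each bag.
print(split_fields(comma_line, ","))
```

The last call is worth noting: a naive split on ',' breaks the line at the commas inside the bags too. Whether PigStorage(',') handles commas nested inside bags is not covered in this thread, so it is worth checking against your Pig version, or simply keeping a delimiter (such as tab) that does not appear inside the bag syntax.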

