Posted to user@pig.apache.org by Saumitra Shahapure <sa...@gmail.com> on 2011/06/14 21:46:36 UTC
LOAD clause with Bag
Hello,
When a LOAD clause has a bag in its schema, what input file structure is
expected? Can the default PigStorage() function handle that?
e.g. in A = LOAD 'data.txt' AS (B: bag {T: tuple(t1:int, t2:int, t3:int)});
what structure of data.txt is expected? Is it possible to write a StoreFunc in
this case?
Also, if we have multidimensional data (like a 2D n*n matrix, where n varies
with the input), can we expect a bag that contains each row as a nested bag?
--
Saumitra S. Shahapure
Re: LOAD clause with Bag
Posted by Daniel Dai <ji...@yahoo-inc.com>.
I opened a Jira https://issues.apache.org/jira/browse/PIG-2126 to
address it.
Daniel
On 06/14/2011 03:52 PM, Russell Jurney wrote:
> Is this in the docs yet? I've been asked this question a dozen times.
>
> Russ
Re: LOAD clause with Bag
Posted by Russell Jurney <ru...@gmail.com>.
Is this in the docs yet? I've been asked this question a dozen times.
Russ
On Tue, Jun 14, 2011 at 3:02 PM, Daniel Dai <ji...@yahoo-inc.com> wrote:
> data.txt can be:
> ({(1,2,3),(4,5,6)})
> ({(7,8,9),(10,11,12)})
>
> Also bag can be nested, eg:
> A = LOAD 'data.txt' AS (B: bag {t:tuple(BB: bag {tt:tuple(t1:int, t2:int,
> t3:int)})});
>
> data.txt:
> {({(1,2,3),(4,5,6)}),({(7,8,9),(10,11,12)})}
>
> Daniel
Re: LOAD clause with Bag
Posted by Daniel Dai <ji...@yahoo-inc.com>.
data.txt can be:
({(1,2,3),(4,5,6)})
({(7,8,9),(10,11,12)})
Bags can also be nested, e.g.:
A = LOAD 'data.txt' AS (B: bag {t:tuple(BB: bag {tt:tuple(t1:int, t2:int, t3:int)})});
data.txt:
{({(1,2,3),(4,5,6)}),({(7,8,9),(10,11,12)})}
Daniel
On 06/14/2011 12:46 PM, Saumitra Shahapure wrote:
> Hello,
>
> When we have LOAD clause with Bag as its member, what type of input file
> structure is expected? Can default PigStorage() function handle that?
> e.g. in A = LOAD 'data.txt' AS (B: bag {T: tuple(t1:int, t2:int,
> t3:int)});
> What structure of data.txt is expected? Is it possible to write StoreFunc in
> this case?
>
> Also if we have multidimensional data (like 2D n*n matrix, n varies with
> input), can we expect Bag which contains each row as nested Bag?
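
For the matrix case asked about here, the text layouts Daniel describes for data.txt can be generated from in-memory rows. The following is only a minimal Python sketch of that serialization, not something from the thread: the function names (tuple_text, bag_text, nested_bag_text) and the sample matrices are hypothetical, and it only builds strings in the bag and tuple notation shown in this thread. Whether a particular LOAD schema accepts rows of varying width is not covered.

```python
def tuple_text(values):
    """Render one matrix row as tuple text, e.g. (1,2,3)."""
    return "(" + ",".join(str(v) for v in values) + ")"


def bag_text(rows):
    """Render a list of rows as a bag of tuples, e.g. {(1,2,3),(4,5,6)}."""
    return "{" + ",".join(tuple_text(row) for row in rows) + "}"


def nested_bag_text(matrices):
    """Render several matrices as one bag whose tuples each wrap an inner bag."""
    return "{" + ",".join("(" + bag_text(m) + ")" for m in matrices) + "}"


# Hypothetical sample data matching the numbers used in the thread.
matrix_a = [[1, 2, 3], [4, 5, 6]]
matrix_b = [[7, 8, 9], [10, 11, 12]]

# First layout from the thread: one bag per line, wrapped in parentheses.
flat_lines = ["(" + bag_text(m) + ")" for m in (matrix_a, matrix_b)]

# Second layout from the thread: a single line holding a bag of bags.
nested_line = nested_bag_text([matrix_a, matrix_b])

for line in flat_lines:
    print(line)
print(nested_line)
```

Each entry of flat_lines is one line of data.txt for the first LOAD statement, and nested_line is the single line used with the nested schema.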