Posted to user@pig.apache.org by xa...@orange-ftgroup.com on 2009/07/30 18:51:45 UTC

Question about LOAD?

Hi there, 

I'm working with Pig 2.0 and I have the following problem:

1. One Pig script writes a tuple to HDFS using PigDump().
1.1 The tuple has the following schema:
(T1:tuple(t2:tuple(f1:chararray,f2:chararray),c:long))

2. Now I want to read that tuple in another Pig script, so I do the
following:
	->$ S = LOAD '$PATH' as ( T1:tuple ( t2:tuple ( f1:chararray,f2:chararray),c:long));
	->$describe S
	-> ( T1:tuple ( t2:tuple ( f1:chararray,f2:chararray),c:long))
	->$ dump S
	-> result ->$() (there is no data; the tuple is empty)

What am I doing wrong? I tried to store the data with both PigDump() and
PigStorage(), but in neither case was I able to load the tuples.

I would really appreciate any help.

Xavier
	



Re: Question about LOAD?

Posted by Dmitriy Ryaboy <dv...@cloudera.com>.
I think you are specifying your schema incorrectly -- you don't need
the outer Tuple.
Try saving using PigStorage() and loading as follows:
Test = load '/user/test/output' as ( t1:tuple (F1:chararray,f2:chararray),c:long);
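
To see why the outer tuple is unnecessary, it can help to count the top-level fields in one stored line. The following Python sketch is not Pig code: it only imitates splitting a line on a tab delimiter. The sample line, its URL, and the helper name top_level_fields are assumptions made for illustration.

```python
# Illustrative sketch only: this imitates splitting one stored line on a
# delimiter; it is not Pig's PigStorage implementation.

def top_level_fields(line, delimiter="\t"):
    """Split one stored line into its top-level fields."""
    return line.rstrip("\n").split(delimiter)


# Hypothetical line, assuming the nested tuple and the count were written
# as two tab-separated columns.
sample = "(1233028744862,https://example.invalid/api)\t1\n"

fields = top_level_fields(sample)
print(len(fields))  # number of top-level fields in the line
print(fields)
```

Under that assumption the line has two top-level fields, which corresponds to a schema with two entries, (t1:tuple(...), c:long), rather than a single tuple wrapping both.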



RE: Question about LOAD?

Posted by xa...@orange-ftgroup.com.
Sure, here is my data:
Script1.pig:
My tuple schema is the following:
namekey: {group: (timestamp: chararray,endpoint: chararray),long}
store namekey into '/user/test/output' using PigStorage();
Stored data:
((1233028744862,https://XX.XX.XX.XX:XX/api/XXXXXXX),1L)
Script 2:
Test = load '/user/test/output' as ( T1:tuple ( t2:tuple (F1:chararray,f2:chararray),c:long));

Dump Test -> (,)

I hope this helps,

Xavier




Re: Question about LOAD?

Posted by Dmitriy Ryaboy <dv...@cloudera.com>.
Can you send your actual STORE and LOAD statements, including the
values of variables like $PATH?


Re: Question about LOAD?

Posted by Turner Kunkel <th...@gmail.com>.
Try
S = LOAD '$PATH' USING PigStorage('[some delimiter that is definitely in the file]') AS ( T1:tuple ( t2:tuple ( f1:chararray,f2:chararray),c:long));

I *think* without the PigStorage call it looks for tab-delimited items by
default, so maybe that will work?
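
A quick way to check which delimiter is actually in the file is to split one line on each candidate and look at how many pieces come back. This Python sketch is not Pig code; it uses the stored line quoted in this thread, and the helper name count_fields and the list of candidates are assumptions.

```python
# Illustrative sketch only, not Pig code: split one line on several
# candidate delimiters to see which one is really present.

def count_fields(line, delimiter):
    """Return how many pieces the line splits into on the delimiter."""
    return len(line.split(delimiter))


# The stored line as quoted in this thread (addresses already masked).
line = "((1233028744862,https://XX.XX.XX.XX:XX/api/XXXXXXX),1L)"

# Hypothetical candidates: tab (the default mentioned above) and comma.
for name, delimiter in [("tab", "\t"), ("comma", ",")]:
    print(name, count_fields(line, delimiter))
```

The line contains no tab, so a tab split returns the whole line as one piece. A comma split also cuts inside the nested tuple, so a plain delimiter split does not recover the nesting by itself.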



-- 

-Turner Kunkel