Posted to user@pig.apache.org by Craig Macdonald <cr...@dcs.gla.ac.uk> on 2008/12/02 17:53:35 UTC

Re: Invalid size 0 for a tuple - is this a bug?

Sorry, I should have said: this is the types branch.

C

Olga Natkovich wrote:
> Hi Craig,
>
> I assume you are running with code on the trunk or official release,
> right? Can you try the same with code on the types branch?
>
> Thanks,
>
> Olga 
>
>   
>> -----Original Message-----
>> From: Craig Macdonald [mailto:craigm@dcs.gla.ac.uk] 
>> Sent: Tuesday, December 02, 2008 7:53 AM
>> To: pig-user@hadoop.apache.org
>> Subject: Invalid size 0 for a tuple - is this a bug?
>>
>> I am using Pig to join two files, and I am hitting the following 
>> problem (I have now attempted it in two different ways):
>>
>> ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200812021505_0002_m_000017
>> java.io.IOException: Invalid size 0 for a tuple
>>         at org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:57)
>>         at org.apache.pig.data.DataReaderWriter.readDatum(DataReaderWriter.java:62)
>>         at org.apache.pig.builtin.BinStorage.getNext(BinStorage.java:90)
>>         at org.apache.pig.backend.executionengine.PigSlice.next(PigSlice.java:101)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:157)
>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SliceWrapper$1.next(SliceWrapper.java:133)
>>         at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
>>
>> My script has the following form:
>>
>> QsCounts = load 'fileI1' as (q, qC);
>> QCs = load 'fileI2' using PigStorage('\t') as (q, d, cc);
>> QCsGROUP = GROUP QCs by q;
>> GQs = FOREACH QCsGROUP GENERATE flatten($0) as q;
>> GQsRecords = JOIN GQs by q, QsCounts by q;
>> GQsRecordGROUP = GROUP GQsRecords by query;
>> GQC = FOREACH GQsRecordGROUP GENERATE flatten($0) as q, $1.cc as count;
>> store GQC into 'fileO' using PigStorage('\t');
>>
>> Is this a bug or a user error? Do you need to know 
>> anything else about the data?
>>
>> Craig
>>
>>
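For reference, a hypothetical cleanup of the script quoted above, with the apparent inconsistencies addressed. This sketch assumes the second GROUP's `by query` was meant to be the join key `q` (no `query` alias is defined anywhere in the script), which after the JOIN must be disambiguated with the relation name (`GQs::q`), and that the final projection was meant to be `qC` from QsCounts rather than `cc` (which is dropped by the earlier FOREACH and so is not in the joined schema). Untested; aliases and field choices are assumptions, not the poster's confirmed intent.

```pig
-- Hypothetical rework of the posted script; an untested sketch.
QsCounts = LOAD 'fileI1' AS (q, qC);
QCs = LOAD 'fileI2' USING PigStorage('\t') AS (q, d, cc);

-- Reduce QCs to its distinct keys; 'group' is the grouping-key alias.
QCsGROUP = GROUP QCs BY q;
GQs = FOREACH QCsGROUP GENERATE FLATTEN(group) AS q;

-- Join the distinct keys against the counts file.
GQsRecords = JOIN GQs BY q, QsCounts BY q;

-- After the JOIN, 'q' is ambiguous; qualify it with the relation name.
GQsRecordGROUP = GROUP GQsRecords BY GQs::q;

-- Project qC (assumed, since cc is not present in the joined schema).
GQC = FOREACH GQsRecordGROUP GENERATE FLATTEN(group) AS q,
                                      GQsRecords.QsCounts::qC AS qC;

STORE GQC INTO 'fileO' USING PigStorage('\t');
```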