Posted to user@pig.apache.org by Bill Graham <bi...@gmail.com> on 2011/07/25 23:11:25 UTC

AvroStorage fails to store when reading via PigStorage

Hi,

I'm trying to run a simple AvroStorage example that reads a TSV file via
PigStorage and writes it out as Avro, but the job fails with the following
exception:

java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
        at org.apache.avro.generic.GenericData.getField(GenericData.java:470)
        at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:102)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
        at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)
        at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
        at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:244)
        at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.write(PigAvroRecordWriter.java:49)
        at org.apache.pig.piggybank.storage.avro.AvroStorage.putNext(AvroStorage.java:580)
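
For anyone reading the trace, here is a minimal sketch of the kind of failure
it suggests: a generic record writer casts its input to an indexed-record
interface, and a Pig tuple does not implement that interface. Every name below
(CastSketch, IndexedRecordLike, TupleLike, getFieldGeneric, getFieldTupleAware)
is a hypothetical stand-in, not the real Avro or Pig classes. The tuple-aware
branch is only one possible direction and is not a description of how
AvroStorage works or how it might be patched.

```java
import java.util.Arrays;
import java.util.List;

public class CastSketch {
    // Hypothetical stand-in for an indexed-record interface such as
    // org.apache.avro.generic.IndexedRecord.
    interface IndexedRecordLike {
        Object get(int position);
    }

    // Hypothetical stand-in for a Pig tuple: it holds fields by position
    // but does NOT implement IndexedRecordLike.
    static final class TupleLike {
        private final List<Object> fields;

        TupleLike(Object... fields) {
            this.fields = Arrays.asList(fields);
        }

        Object get(int position) {
            return fields.get(position);
        }
    }

    // Unconditional cast, as a generic writer might do. Passing a TupleLike
    // here throws ClassCastException at runtime.
    static Object getFieldGeneric(Object record, int position) {
        return ((IndexedRecordLike) record).get(position);
    }

    // One possible remedy (an assumption for illustration): check the runtime
    // type first and read tuple fields directly, falling back to the generic
    // path for everything else.
    static Object getFieldTupleAware(Object record, int position) {
        if (record instanceof TupleLike) {
            return ((TupleLike) record).get(position);
        }
        return getFieldGeneric(record, position);
    }

    public static void main(String[] args) {
        TupleLike row = new TupleLike("alice", 42);
        try {
            getFieldGeneric(row, 0);
        } catch (ClassCastException e) {
            System.out.println("generic path failed with ClassCastException");
        }
        System.out.println("tuple-aware path read: " + getFieldTupleAware(row, 1));
    }
}
```

The point of the sketch is only that the cast fails because of the runtime
type of the object handed to the writer, regardless of how the data was loaded.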

The sample script is taken from this wiki section:
http://linkedin.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data#AvroStorage-PigsupportforAvrodata-A.Howtostoredataindifferentways

I'm using Pig trunk and Avro 1.6.0.

Has anyone encountered this, or does anyone know what the issue is? This use
case doesn't appear to be supported in the current version of AvroStorage, so
it's a bug in either the code or the documentation. The unit tests only verify
that Avro data read via AvroStorage can be written back out as Avro; there is
no test that goes from PigStorage to AvroStorage.

thanks,
Bill

Re: AvroStorage fails to store when reading via PigStorage

Posted by Bill Graham <bi...@gmail.com>.
This does in fact seem like a bug. FYI, I've created a JIRA for this and I'm
working on a patch:

https://issues.apache.org/jira/browse/PIG-2195

