Posted to user@pig.apache.org by Jeremy Hanna <je...@gmail.com> on 2011/04/08 18:30:31 UTC

Getting errors with BinSedesTuple in my storefunc

I run my data through a lot of processing and then reformat it to go back into my data store using a storefunc.  When I store it out to HDFS instead, it looks fine visually.  However, when I try to persist it through the storefunc, I get an exception saying that one of the values can't be cast from org.apache.pig.data.BinSedesTuple to org.apache.pig.data.DataByteArray.  In my storefunc I had been assuming that the value would be a DataByteArray, and from the javadocs BinSedesTuple looks like a type used only for intermediate processing.  So I'm wondering: should I be converting this manually somehow, or is something wrong?

Thanks,

Jeremy

Re: Getting errors with BinSedesTuple in my storefunc

Posted by Jeremy Hanna <je...@gmail.com>.
Thejas,

I was being dumb about it: when my UDF returned the set of data, I had neglected to define the outgoing schema before sending it to the storefunc.  Consequently the data had no schema, so Pig was doing the best it could with it.  (Thanks, Dmitriy, for pointing that out to me.)

Thanks for the help!

Jeremy

On Apr 8, 2011, at 3:35 PM, Thejas M Nair wrote:

> The bytearray datatype also represents the ‘unknown’ type, i.e. if Pig does not know the type of a field, it uses the bytearray type. In such cases the actual object is not necessarily an instance of DataByteArray.
> I am wondering whether, in the storefunc, you are casting an ‘unknown’ type (which happens to be a tuple) to DataByteArray.  Can you check whether Pig is doing the right thing by returning a Tuple in this case? (BinSedesTuple implements the Tuple interface.)
> 
> 
> Thanks,
> Thejas
> 
> 
> 


Re: Getting errors with BinSedesTuple in my storefunc

Posted by Thejas M Nair <te...@yahoo-inc.com>.
The bytearray datatype also represents the 'unknown' type, i.e. if Pig does not know the type of a field, it uses the bytearray type. In such cases the actual object is not necessarily an instance of DataByteArray.
I am wondering whether, in the storefunc, you are casting an 'unknown' type (which happens to be a tuple) to DataByteArray.  Can you check whether Pig is doing the right thing by returning a Tuple in this case? (BinSedesTuple implements the Tuple interface.)


Thanks,
Thejas
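
For anyone hitting the same cast error, here is a rough sketch of the kind of runtime check described above: look at what the field actually is before casting it. It does not use the Pig API. Bytes and Row are hypothetical stand-ins for DataByteArray and Tuple, and toBytes is an invented helper, so treat this as an illustration of the idea rather than real storefunc code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class StoreFieldSketch {

    // Hypothetical stand-in for a raw byte value (NOT Pig's DataByteArray).
    static final class Bytes {
        final byte[] data;
        Bytes(byte[] data) { this.data = data; }
    }

    // Hypothetical stand-in for a nested row (NOT Pig's Tuple).
    static final class Row {
        final List<Object> fields;
        Row(Object... fields) { this.fields = Arrays.asList(fields); }
    }

    // Invented helper: return the bytes to write for one field, checking the
    // runtime type first instead of casting blindly.
    static byte[] toBytes(Object field) {
        if (field instanceof Bytes) {
            return ((Bytes) field).data;
        }
        if (field instanceof Row) {
            // A field of "unknown" type can turn out to be a nested row.
            // Fail with a descriptive message instead of a ClassCastException.
            int n = ((Row) field).fields.size();
            throw new IllegalArgumentException(
                "expected raw bytes but got a nested row with " + n
                + " field(s); the upstream schema is probably missing");
        }
        throw new IllegalArgumentException(
            "unsupported field type: "
            + (field == null ? "null" : field.getClass().getName()));
    }

    public static void main(String[] args) {
        Object ok = new Bytes("value".getBytes(StandardCharsets.UTF_8));
        System.out.println(toBytes(ok).length + " bytes accepted");

        Object nested = new Row(ok);
        try {
            toBytes(nested);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this only makes the failure easier to read. If a tuple shows up where bytes were expected, the more likely fix is upstream, in whatever step produced the field without a declared type.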


