Posted to user@pig.apache.org by Xavier Stevens <xs...@mozilla.com> on 2012/09/04 17:52:33 UTC

UNION with bytearray

I'm trying to do a UNION on two datasets with identical schemas 
(k:bytearray, v:chararray). When using the UNION operator like so:

combined_data = UNION dataset1, dataset2;

I get the following error:

java.lang.RuntimeException: Unexpected data type java.util.ArrayList found in stream. Note only standard Pig type is supported when you output from UDF/LoadFunc

Everything works fine if I store the two datasets separately without the 
union.

This feels like a bug, but am I doing something wrong here?

Cheers,


-Xavier

Re: UNION with bytearray

Posted by Xavier Stevens <xs...@mozilla.com>.
Hey Thejas,

After the union I just try to store using Elephant Bird's sequence file 
storage with a BytesWritable key and Text value.

I'll open up a JIRA ticket with the details.

Cheers,

-Xavier

On 9/11/12 9:37 AM, Thejas Nair wrote:
> This sounds like a bug.
>
> What do you have after the union?
> Can you try to reproduce this with a script and data that you can share?
> If you can open a JIRA with details, that would be even better.
>
> Thanks,
> Thejas


Re: UNION with bytearray

Posted by Thejas Nair <th...@hortonworks.com>.
This sounds like a bug.

What do you have after the union?
Can you try to reproduce this with a script and data that you can share?
If you can open a JIRA with details, that would be even better.

Thanks,
Thejas
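
A shareable reproduction of the kind requested here might look like the sketch below. It is only an illustration: the input and output paths, the tab-delimited layout, and the use of PigStorage are assumptions, since the thread does not show how dataset1 and dataset2 were loaded. The idea is to confirm the schemas on both sides of the UNION and to store the result with a built-in storer first, which may help show whether the error follows the UNION itself or the storage function used afterwards.

```
-- Hypothetical paths and loader; substitute the real LOAD statements.
dataset1 = LOAD 'input/part1' USING PigStorage('\t') AS (k:bytearray, v:chararray);
dataset2 = LOAD 'input/part2' USING PigStorage('\t') AS (k:bytearray, v:chararray);

-- Check that both relations really report (k:bytearray, v:chararray).
DESCRIBE dataset1;
DESCRIBE dataset2;

combined_data = UNION dataset1, dataset2;

-- Check the schema Pig infers for the result of the UNION.
DESCRIBE combined_data;

-- Store with a built-in storer before trying a custom one.
STORE combined_data INTO 'output/combined' USING PigStorage('\t');
```

If this version runs cleanly, swapping the last statement for the original storage function would be the next thing to try.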

