Posted to user@pig.apache.org by Alex Rovner <al...@gmail.com> on 2012/04/11 17:29:29 UTC

Union causes ClassCastException

I am facing an issue similar to the one described in
https://issues.apache.org/jira/browse/PIG-2493

I am running on the latest trunk, where the fix for PIG-2493 was committed,
yet I am still getting the following exception when attempting to union two
relations.

Schemas

a: {timestamp: chararray,date_time: chararray,conversion_type:
chararray,channel_type: chararray,campaign_id: int,adgroup_id:
int,order_id: chararray,order_sales: double,delta: long,uuid:
chararray,ctid: long,advertiser_id: int,client_tid:
int,conversion_provenance: chararray}

b: {timestamp: chararray,date_time:
chararray,iponweb_conversions::conversion_type: chararray,channel_type:
chararray,iponweb_conversions::campaign_id:
int,iponweb_conversions::adgroup_id: int,order_id: chararray,order_sales:
double,delta: long,iponweb_conversions::uuid: chararray,ctid:
long,iponweb_conversions::advertiser_id: int,client_tid:
long,conversion_provenance: chararray}

Resulting union schema:

c: {timestamp: chararray,date_time: chararray,conversion_type:
chararray,channel_type: chararray,campaign_id: int,adgroup_id:
int,order_id: chararray,order_sales: double,delta: long,uuid:
chararray,ctid: long,advertiser_id: int,client_tid:
long,conversion_provenance: chararray}

Relevant script portion:


describe a;
describe b;
c = UNION a, b;

describe c;

-------------------------

The job works without any issues if I store "a" and "b" without performing
the union. What's even more interesting is that if I save "a" and "b" to a
file using PigStorage, then read them back in another script and union the
data, it works as expected. To me this sounds like the plan is at fault.
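
The schemas above differ in one type: client_tid is int in "a" and long in
"b", so the union has to cast one branch. A minimal sketch of a possible
workaround, not verified against this bug, is to project both relations to
identical field names and types before the union. The aliases a_norm, b_norm
and c_norm are hypothetical; the field names are taken from the describe
output above.

```pig
-- Assumption: a and b are the relations described above.
-- Widen client_tid to long explicitly so no implicit cast is needed.
a_norm = FOREACH a GENERATE
    timestamp, date_time, conversion_type, channel_type,
    campaign_id, adgroup_id, order_id, order_sales, delta,
    uuid, ctid, advertiser_id,
    (long) client_tid AS client_tid,
    conversion_provenance;

-- Strip the iponweb_conversions:: prefixes so both schemas match by name.
b_norm = FOREACH b GENERATE
    timestamp, date_time,
    iponweb_conversions::conversion_type AS conversion_type,
    channel_type,
    iponweb_conversions::campaign_id AS campaign_id,
    iponweb_conversions::adgroup_id AS adgroup_id,
    order_id, order_sales, delta,
    iponweb_conversions::uuid AS uuid,
    ctid,
    iponweb_conversions::advertiser_id AS advertiser_id,
    client_tid,
    conversion_provenance;

c_norm = UNION a_norm, b_norm;
describe c_norm;
```

If the exception persists with identical schemas on both branches, that
would point further toward the plan rather than the type difference.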

Exception:

java.lang.ClassCastException: java.lang.String cannot be cast to
java.lang.Integer
        at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:432)
        at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:330)
        at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
        at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
        at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:165)
        at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
        at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
        at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)

I cannot attach the script since it contains proprietary UDFs, etc.

I might be able to share it with an individual developer.

Thanks
Alex Rovner

Re: Union causes ClassCastException

Posted by Alex Rovner <al...@gmail.com>.
Any takers?
