You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2011/06/27 18:28:47 UTC

[jira] [Assigned] (PIG-2144) ClassCastException when using IsEmpty(DIFF())

     [ https://issues.apache.org/jira/browse/PIG-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-2144:
-----------------------------------

    Assignee: Thejas M Nair

> ClassCastException when using IsEmpty(DIFF()) 
> ----------------------------------------------
>
>                 Key: PIG-2144
>                 URL: https://issues.apache.org/jira/browse/PIG-2144
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Mitesh Singh Jat
>            Assignee: Thejas M Nair
>             Fix For: 0.9.0
>
>
> I have following input <name>:<nickname>, for which I want to find records where name is different from nickname.
> {code:title=input/name_nickname.txt}
> Bharat:Bharat
> Amita:Amita
> Mitesh:Mitesh
> Reenu:Anshu
> Shikha:Shikhu
> Shilpa:Shilpi
> {code}
> I have following script to find records where name is different from nickname.
> {code:title=isEmpty_diff.pig}
> A = LOAD 'input/name_nickname.txt' using PigStorage(':');
> B = FILTER A BY NOT IsEmpty(DIFF($0, $1));
> DUMP B;
> {code}
> The above pig script works with older pig versions (e.g. 0.8.0 (r1043805)) and gives following output
> {code:title=output of isEmpty_diff.pig}
> (Reenu,Anshu)
> (Shikha,Shikhu)
> (Shilpa,Shilpi)
> {code}
> However, the above pig script (isEmpty_diff.pig) fails on Pig 0.9 (e.g. 0.9.0.1105251322 (r1127671)) and newer version of Pig 0.8 (e.g. version 0.8.0.1105131316 (r1102885)) , with ClassCastException
> {code:title=ClassCastException}
> java.lang.ClassCastException: org.apache.pig.data.DefaultDataBag cannot be cast to java.lang.Boolean
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:75)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:318)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.processInput(POUserFunc.java:159)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:184)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:269)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:58)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:676)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:336)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:210)
> {code}
> As a workaround, I used the following pig script.
> {code:titlee=isEmpty_diff2.pig}
> A = LOAD 'input/name_nickname.txt' using PigStorage(':');
> --B = FILTER A BY NOT IsEmpty(DIFF($0, $1));
> B1 = FOREACH A GENERATE $0, $1, DIFF($0, $1);
> B2 = FILTER B1 BY NOT IsEmpty($2);
> B = FOREACH B2 GENERATE $0, $1;
> DUMP B;
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira