Posted to dev@pig.apache.org by Dmitriy Lyubimov <dl...@gmail.com> on 2010/07/20 00:23:05 UTC

Re: Pig filter by fails at backend , what am i doing wrong?

Seems like a resurrected PIG-550 <https://issues.apache.org/jira/browse/PIG-550>.
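
A possible workaround, sketched here and not tested against 0.7.0: the ClassCastException (Integer cannot be cast to Tuple, raised from POProject) suggests that at runtime the flattened fields arrive as top-level columns rather than as a nested tuple, so contentRating.vendorId has nothing to dereference. Naming the flattened fields individually and filtering on the top-level field may sidestep this. The load path, the relation names RAW/FLAT/KEPT, the id column, and the score field are assumptions made up for illustration; only contentRatings and vendorId come from the script quoted below.

```pig
-- Hypothetical input: the path, relation names, id and score are
-- assumptions for illustration only.
RAW = load 'impressions' as (id:int, contentRatings:bag{t:tuple(vendorId:int, score:int)});

-- Name each flattened field instead of giving the whole flattened
-- tuple a single alias.
FLAT = foreach RAW generate id, FLATTEN(contentRatings) as (vendorId, score);

-- Filter on the top-level field, so no tuple dereference is needed.
KEPT = filter FLAT by vendorId is not null and vendorId == 1;
```

If the real tuple has more fields, each one would need its own name in the "as (...)" list.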


On Mon, Jul 19, 2010 at 3:04 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

>
> PPS. The Pig version is 0.7.0.
>
> Thanks.
>
> On Mon, Jul 19, 2010 at 3:02 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>
>> I guess I need to add that contentRatings, as used in the IMP_F2 statement,
>> is a bag of tuples (mapped that way by the load function).
>>
>> On Mon, Jul 19, 2010 at 3:00 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I would greatly appreciate somebody's help with the following Pig error,
>>> which occurs during the MapReduce job.
>>>
>>> All mappers fail with the following stack trace:
>>>
>>> java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.pig.data.Tuple
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POIsNull.getNext(POIsNull.java:152)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.PONot.getNext(PONot.java:71)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POAnd.getNext(POAnd.java:67)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNext(POLimit.java:85)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:272)
>>> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:255)
>>> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:232)
>>> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:227)
>>> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:52)
>>> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>> 	at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>
>>> The Pig script fragment causing this is as follows:
>>>
>>> IMP_F2 = foreach IMP_F1 generate ... , FLATTEN(contentRatings) as contentRating;
>>> IMP_F3 = filter IMP_F2 by contentRating is not null and contentRating.vendorId == 1;
>>>
>>> If I remove the IMP_F3 line, the job goes through, but adding the IMP_F3 filter causes this error.
>>>
>>> describe IMP_F2 produces:
>>>
>>> IMP_F2: {... ,contentRating: (vendorId: int, ... ), ... }
>>>
>>>
>>> I also tried casts like 'filter by ... (int)(contentRating.vendorId)==1', which did not change anything.
>>>
>>> Any ideas for a workaround are appreciated.
>>>
>>>
>>>
>>> Thanks in advance.
>>> -Dmitriy
>>
>