Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/01/06 23:56:47 UTC

[jira] Commented: (PIG-1787) Error in logical plan generated

    [ https://issues.apache.org/jira/browse/PIG-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978554#action_12978554 ] 

Daniel Dai commented on PIG-1787:
---------------------------------

Simplified test case:
{code}
a = load '1.txt' as (a0, a1);
b = group a by a0;
c = foreach b generate group as c0, COUNT(a) as c1;
d = order c by c1 parallel 2;
e = limit d 10;
f = join e by c0, a by a0;
dump f;
{code}
1.txt:
1       1
1       2

Error message:
Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Long
        at org.apache.pig.backend.hadoop.HDataType.getWritableComparableTypes(HDataType.java:84)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:113)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:262)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:255)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:58)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)

> Error in logical plan generated
> -------------------------------
>
>                 Key: PIG-1787
>                 URL: https://issues.apache.org/jira/browse/PIG-1787
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Anitha Raju
>            Assignee: Daniel Dai
>         Attachments: PIG-1787-1.patch
>
>
> Here is a sample pig script:
> set default_parallel 2
> ALLDATA = load 'sample.txt' using PigStorage() as (id, spaceid, type, pcid);
> C1 = filter ALLDATA by (type == 'p' and
>                    (spaceid == '1196250013'
>                     or spaceid == '1196250024'
>                     or spaceid == '1196250011'));
> C2 = group C1 by pcid;
> C3 = foreach C2 generate flatten(group) as (pc_id), COUNT(C1) as tot;
> C4 = order C3 by tot desc;
> C5 = limit C4 3;
> C6 = join C5 by pc_id, C1 by pcid;
> dump C6;
> sample.txt:
> 1       1196250013      p       1234
> 2       1196250024      p       2314
> 3       1196250011      t       1111
> 4       1111111111      p       1231
> 5       1196250013      p       1254
> 6       1196250024      p       9007
> This fails with the error 
> java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableLongWritable, recieved
> org.apache.pig.impl.io.NullableBytesWritable
> when both pc_id and pcid are of type bytearray.
> The script seems to work when:
> 	a) a replicated join is substituted for the regular join
> 	b) pcid is cast to long in the loader
> 	c) any statement before C6 is dumped
> 	d) default_parallel is set to 1 or removed.
> 	
> One possible cause seems to be the logical plan generated for the projection in C4, as can be observed from the describe statement.
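
For reference, workarounds (a) and (b) from the description might look like the following when applied to the simplified test case above. This is an untested sketch, not a confirmed fix: the a0:long declaration and the placement of the describe statement are assumptions made for illustration, and the relation order in the replicated variant is a choice made here, not something the report specifies.
{code}
-- Workaround (b): declare the group/join key as long in the load schema,
-- so the join key is no longer a bytearray.
a = load '1.txt' as (a0:long, a1);
b = group a by a0;
c = foreach b generate group as c0, COUNT(a) as c1;
d = order c by c1 parallel 2;
e = limit d 10;
-- Print the schema Pig infers for c0 and c1 before the join.
describe e;
f = join e by c0, a by a0;
dump f;

-- Workaround (a), as an alternative with a0 left untyped: a replicated join.
-- The last relation listed is the one held in memory, so the small relation e
-- goes last; this also changes the column order of f.
-- f = join a by a0, e by c0 using 'replicated';
{code}
Neither variant addresses the underlying plan generation problem; they only avoid the key type mismatch reported at the map stage of the join.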
