Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/01/06 23:56:47 UTC
[jira] Commented: (PIG-1787) Error in logical plan generated
[ https://issues.apache.org/jira/browse/PIG-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978554#action_12978554 ]
Daniel Dai commented on PIG-1787:
---------------------------------
Simplified test case:
{code}
a = load '1.txt' as (a0, a1);
b = group a by a0;
c = foreach b generate group as c0, COUNT(a) as c1;
d = order c by c1 parallel 2;
e = limit d 10;
f = join e by c0, a by a0;
dump f;
{code}
1.txt:
1 1
1 2
Error message:
Caused by: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Long
at org.apache.pig.backend.hadoop.HDataType.getWritableComparableTypes(HDataType.java:84)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:113)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:262)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:255)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:58)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
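The issue description quoted below reports that the original script runs once the join key is cast to long in the loader. Applied to the simplified test case, that workaround might look like the sketch below. It is untested here, and the `:long` declaration on a0 is an assumption carried over from the report, not a confirmed fix for this case:
{code}
-- Hypothetical variant of the simplified test case: a0 is declared as long
-- in the load schema, so the group key, c0, and the join key are all typed
-- as long instead of defaulting to bytearray.
a = load '1.txt' as (a0:long, a1);
b = group a by a0;
c = foreach b generate group as c0, COUNT(a) as c1;
d = order c by c1 parallel 2;
e = limit d 10;
f = join e by c0, a by a0;
dump f;
{code}
This only sidesteps the mismatch between the bytearray join key and the long sort key; the underlying problem in plan generation would remain.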
> Error in logical plan generated
> -------------------------------
>
> Key: PIG-1787
> URL: https://issues.apache.org/jira/browse/PIG-1787
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Anitha Raju
> Assignee: Daniel Dai
> Attachments: PIG-1787-1.patch
>
>
> Here is a sample Pig script:
> set default_parallel 2
> ALLDATA = load 'sample.txt' using PigStorage() as (id, spaceid, type, pcid);
> C1 = filter ALLDATA by (type == 'p' and
> (spaceid == '1196250013'
> or spaceid == '1196250024'
> or spaceid == '1196250011'));
> C2 = group C1 by pcid;
> C3 = foreach C2 generate flatten(group) as (pc_id), COUNT(C1) as tot;
> C4 = order C3 by tot desc;
> C5 = limit C4 3;
> C6 = join C5 by pc_id, C1 by pcid;
> dump C6;
> sample.txt:
> 1 1196250013 p 1234
> 2 1196250024 p 2314
> 3 1196250011 t 1111
> 4 1111111111 p 1231
> 5 1196250013 p 1254
> 6 1196250024 p 9007
> This fails with the error
> java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableLongWritable, recieved
> org.apache.pig.impl.io.NullableBytesWritable
> when both pc_id and pcid are of type bytearray.
> The script seems to work after any one of the following changes:
> a) substituting a replicated join for the regular join
> b) casting pcid to long in the loader
> c) dumping any statement before C6
> d) setting default_parallel to 1 or removing it.
>
> One possible cause is the logical plan generated for the projection in C4, as the output of the describe statement suggests.