Posted to user@pig.apache.org by "Habermaas, William" <Wi...@fatwire.com> on 2009/04/09 18:06:51 UTC
Question about M/R Key/Values
Hi,
I am just starting to use Pig and am having problems with my mappers
failing with:
2009-04-09 11:55:40,415 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200903301141_0015_m_000000java.lang.ClassCastException: java.lang.Long
Even though I am using a chararray for everything, the mapper seems to
want a long.
The following is a fragment of my script.
A = FOREACH impressions GENERATE $11;
DESCRIBE A;
A: {screensize: chararray}
STORE A INTO 'results' USING PigStorage();
What am I doing wrong?
Thanks for your help.
Bill
RE: Question about M/R Key/Values
Posted by Bill Habermaas <bi...@habermaas.us>.
Here is the script. Input is supplied from a custom load class which passes
tuples containing chararray items. The input file is very large, about 150K
tuples so I cannot provide it. I'm sure there is a simple thing I am doing
wrong but it is not obvious to me.
register C:\workspaceTestprograms\PigAnalyticsLoad\myPig.jar
register C:\workspaceTestprograms\PigAnalyticsLoad\myAnalytics.jar
A = LOAD 'hdfs://linux:9000/testdata/data.txt'
    USING PigAnalyticsLoad()
    AS (agent:chararray, colors:chararray, host:chararray,
        imagereferer:chararray, ip:chararray, js:chararray, newvisitor:chararray,
        objectid:chararray, objectname:chararray, objecttype:chararray,
        referer:chararray, screensize:chararray,
        sessioncreationtime:chararray, sessionid:chararray, sitename:chararray,
        timestamp:chararray, timezone:chararray,
        userid:chararray, visitorid:chararray);
B = FOREACH A GENERATE $11;
C = FILTER B BY screensize is not null;
D = GROUP C BY screensize;
DESCRIBE D;
E = LIMIT D 13;
DUMP E;
The console log follows:
Launching the job!
Using the configuration from C:\Analytics2.5\workspaceTestprograms\pig\conf
2009-04-09 14:50:09,762 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.120.12.103:9000
2009-04-09 14:50:10,668 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hdfs://10.120.12.103:9001
D: {group: chararray,C: {screensize: chararray}}
2009-04-09 14:50:12,965 [Thread-7] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2009-04-09 14:50:17,965 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2009-04-09 14:50:52,965 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed
2009-04-09 14:50:52,965 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Job failed!
2009-04-09 14:50:52,965 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200903301141_0025_m_000000java.lang.ClassCastException: java.lang.Long
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:641)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:169)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:192)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:81)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)
2009-04-09 14:50:52,965 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200903301141_0025_m_000000java.lang.ClassCastException: java.lang.Long
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:641)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:169)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:192)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:81)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)
2009-04-09 14:50:53,012 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200903301141_0025_m_000000java.lang.ClassCastException: java.lang.Long
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:641)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:169)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:192)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:81)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)
2009-04-09 14:50:53,074 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error message from task (map) task_200903301141_0025_m_000000java.lang.ClassCastException: java.lang.Long
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:641)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:180)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:169)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:226)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:192)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:81)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2198)
java.io.IOException: Unable to open iterator for alias: E [Job terminated with anomalous status FAILED]
    at org.apache.pig.PigServer.openIterator(PigServer.java:389)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
    at org.apache.pig.Main.main(Main.java:242)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
    ... 6 more
-----Original Message-----
From: Olga Natkovich [mailto:olgan@yahoo-inc.com]
Sent: Thursday, April 09, 2009 2:17 PM
To: pig-user@hadoop.apache.org
Subject: RE: Question about M/R Key/Values
Please provide the full script as well as the full stack trace. From the
fragment you provided, it is hard to tell what is going on.
Olga
RE: Question about M/R Key/Values
Posted by Olga Natkovich <ol...@yahoo-inc.com>.
Please provide the full script as well as the full stack trace. From the
fragment you provided, it is hard to tell what is going on.
Olga