Posted to user@pig.apache.org by Doug Daniels <dd...@mortardata.com> on 2011/07/26 19:02:11 UTC
Exception running penny data sampler
Hi,
I'm trying to run the data sampler tool from the penny library, and am getting a ClassNotFoundException for a netty class. I'm using the trunk version of pig, with the patch from PIG-2013 applied.
I'm running a simple script that uses pig test data from test/org/apache/pig/test/data/InputFiles/jsTst1.txt :
x = LOAD 'jsTst1.txt' USING PigStorage('\t');
x_filtered = FILTER x BY (int)$1 > 100;
STORE x_filtered INTO 'jsTst1Filtered';
To run it, I tried the syntax from https://cwiki.apache.org/confluence/display/PIG/PennyToolLibrary, but I was getting a ClassNotFoundException on org.jboss.netty.channel.ChannelFactory before the job even started running. I added the netty-3.2.2.Final.jar from pig's ivy libs to the -cp list, which fixed that ClassNotFoundException, but left me with a new one after the job started:
11/07/26 16:44:13 WARN mapReduceLayer.Launcher: There is no log file to write to.
11/07/26 16:44:13 ERROR mapReduceLayer.Launcher: Backend error message
Error: java.lang.ClassNotFoundException: org.jboss.netty.channel.SimpleChannelHandler
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.apache.pig.penny.impl.harnesses.MonitorAgentHarness.initialize(MonitorAgentHarness.java:229)
at org.apache.pig.penny.impl.pig.MonitorAgentUDF.init(MonitorAgentUDF.java:61)
at org.apache.pig.penny.impl.pig.MonitorAgentUDF.exec(MonitorAgentUDF.java:72)
at org.apache.pig.penny.impl.pig.MonitorAgentUDF.exec(MonitorAgentUDF.java:37)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:258)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Should I be running penny in a different way?
Thanks,
Doug
Re: Exception running penny data sampler
Posted by Daniel Dai <da...@hortonworks.com>.
This is tracked as PIG-2199. The patch has already been committed.
Daniel
On Wed, Aug 10, 2011 at 11:50 AM, Alan Gates <ga...@hortonworks.com> wrote:
>
> On Jul 30, 2011, at 7:18 AM, Doug Daniels wrote:
>
>> I added the one liner to build.xml to include netty and that fixed the
>> problem. Should I create a JIRA for that?
>
> Yes, please.
>
> Alan.
Re: Exception running penny data sampler
Posted by Alan Gates <ga...@hortonworks.com>.
On Jul 30, 2011, at 7:18 AM, Doug Daniels wrote:
> I added the one liner to build.xml to include netty and that fixed the
> problem. Should I create a JIRA for that?
Yes, please.
Alan.
Re: Exception running penny data sampler
Posted by Doug Daniels <dd...@mortardata.com>.
I added the one-liner to build.xml to include netty, and that fixed the
problem. Should I create a JIRA for that?
That got me through printing out one sample row, but then I got another
exception:
org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to int.
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:164)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.getNext(GreaterThanExpr.java:72)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Does anyone know what might cause this?
Thanks,
Doug
On 7/28/11 2:15 AM, "Benjamin Reed" <br...@apache.org> wrote:
>the problem is that the netty classes need to be accessible to the
>tasks running in hadoop. i think the netty classes should be jarred
>into the penny.jar, so that they are distributed properly. unless
>someone else has a better idea.
>
>ben
Re: Exception running penny data sampler
Posted by Benjamin Reed <br...@apache.org>.
The problem is that the netty classes need to be accessible to the
tasks running in Hadoop, not only to the client that launches the job.
I think the netty classes should be jarred into the penny.jar, so that
they are distributed properly, unless someone else has a better idea.
ben
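
One way to see which JVM can and cannot load the netty classes is a small probe like the sketch below. It is only an illustration: `ClassProbe` and `locate` are hypothetical names and are not part of Pig, Penny, or Hadoop. It uses standard JDK calls to report where a class was loaded from, or that it could not be found on the classpath of the JVM that runs it.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Pig or Penny.
final class ClassProbe {
    private ClassProbe() {}

    // Returns the location (usually a jar URL) the class was loaded from,
    // a placeholder label for classes supplied by the JVM itself,
    // or null when the class cannot be loaded by this JVM.
    static String locate(String className) {
        try {
            // initialize=false: resolve the class without running static initializers.
            Class<?> c = Class.forName(className, false, ClassProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(JVM runtime)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the backend error from this thread.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.jboss.netty.channel.SimpleChannelHandler"};
        for (String name : names) {
            String where = locate(name);
            System.out.println(name + " -> " + (where == null ? "NOT FOUND" : where));
        }
    }
}
```

Run on the client with the same -cp list, this only says whether the front end can see the class. The backend error above comes from the task JVMs, which have their own classpath, so a class can be found on the client and still be missing there unless its jar is shipped with the job.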