Posted to user@pig.apache.org by Doug Daniels <dd...@mortardata.com> on 2011/07/26 19:02:11 UTC

Exception running penny data sampler

Hi,

I'm trying to run the data sampler tool from the penny library, and am getting a ClassNotFoundException for a netty class.  I'm using the trunk version of pig, with the patch from PIG-2013 applied.

I'm running a simple script that uses pig test data from test/org/apache/pig/test/data/InputFiles/jsTst1.txt :

    x = LOAD 'jsTst1.txt' USING PigStorage('\t');
    x_filtered = FILTER x BY (int)$1 > 100;
    STORE x_filtered INTO 'jsTst1Filtered';

To run it, I followed the syntax from https://cwiki.apache.org/confluence/display/PIG/PennyToolLibrary, but got a ClassNotFoundException on org.jboss.netty.channel.ChannelFactory before the job even started running.  Adding netty-3.2.2.Final.jar from pig's ivy libs to the -cp list fixed that ClassNotFoundException, but a new one appeared after the job started:


11/07/26 16:44:13 WARN mapReduceLayer.Launcher: There is no log file to write to.
11/07/26 16:44:13 ERROR mapReduceLayer.Launcher: Backend error message
Error: java.lang.ClassNotFoundException: org.jboss.netty.channel.SimpleChannelHandler
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
	at org.apache.pig.penny.impl.harnesses.MonitorAgentHarness.initialize(MonitorAgentHarness.java:229)
	at org.apache.pig.penny.impl.pig.MonitorAgentUDF.init(MonitorAgentUDF.java:61)
	at org.apache.pig.penny.impl.pig.MonitorAgentUDF.exec(MonitorAgentUDF.java:72)
	at org.apache.pig.penny.impl.pig.MonitorAgentUDF.exec(MonitorAgentUDF.java:37)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:258)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)

Should I be running penny in a different way?

Thanks,
Doug




Re: Exception running penny data sampler

Posted by Daniel Dai <da...@hortonworks.com>.
It is PIG-2199. Patch already committed.

Daniel

On Wed, Aug 10, 2011 at 11:50 AM, Alan Gates <ga...@hortonworks.com> wrote:
>
> On Jul 30, 2011, at 7:18 AM, Doug Daniels wrote:
>
>> I added the one liner to build.xml to include netty and that fixed the
>> problem. Should I create a JIRA for that?
>
> Yes, please.
>
> Alan.
>

Re: Exception running penny data sampler

Posted by Alan Gates <ga...@hortonworks.com>.
On Jul 30, 2011, at 7:18 AM, Doug Daniels wrote:

> I added the one liner to build.xml to include netty and that fixed the
> problem. Should I create a JIRA for that?

Yes, please.

Alan.



Re: Exception running penny data sampler

Posted by Doug Daniels <dd...@mortardata.com>.
I added the one liner to build.xml to include netty and that fixed the
problem. Should I create a JIRA for that?

That got me through printing out one sample row, but then I got another
exception:

org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a bytearray from the UDF. Cannot determine how to convert the bytearray to int.
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:164)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.GreaterThanExpr.getNext(GreaterThanExpr.java:72)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:170)


Does anyone know what might cause this?

Thanks,
Doug
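
[Editor's sketch] The thread does not confirm a cause for ERROR 1075. One possible reading, offered only as an assumption: Pig normally casts a bytearray using the load function's converter, and once the value has passed through a UDF (here presumably Penny's MonitorAgentUDF, which appears in the earlier trace) that converter is no longer known to the cast. The Java below is not Pig code; it only illustrates what an explicit text-to-int conversion of such a field involves. The names BytesToInt and toInt are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class BytesToInt {
    // Hypothetical helper (not a Pig API): treat the raw bytes as UTF-8 text
    // and parse them as an int. Returns null if the bytes are absent or the
    // text is not a valid integer.
    static Integer toInt(byte[] raw) {
        if (raw == null) {
            return null;
        }
        String text = new String(raw, StandardCharsets.UTF_8).trim();
        try {
            return Integer.valueOf(text);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Mirrors the filter in the script: (int)$1 > 100
        byte[] field = "150".getBytes(StandardCharsets.UTF_8);
        Integer value = toInt(field);
        System.out.println(value != null && value > 100);
    }
}
```

The point of the sketch is that the conversion needs to know how the bytes were encoded, which is the information the error message says is missing.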

On 7/28/11 2:15 AM, "Benjamin Reed" <br...@apache.org> wrote:

>The problem is that the netty classes need to be accessible to the
>tasks running in Hadoop. I think the netty classes should be jarred
>into penny.jar so that they are distributed properly, unless
>someone else has a better idea.
>
>ben


Re: Exception running penny data sampler

Posted by Benjamin Reed <br...@apache.org>.
The problem is that the netty classes need to be accessible to the
tasks running in Hadoop. I think the netty classes should be jarred
into penny.jar so that they are distributed properly, unless
someone else has a better idea.

ben
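
[Editor's sketch] The distinction described above (classes visible to the client JVM versus classes visible to the Hadoop task JVM) can be probed with a small Java check. This is a minimal sketch, not part of Pig or Penny; the class name ClasspathCheck and its method are hypothetical. It reports only on the classpath of the JVM it runs in, so a "found" result on the client says nothing about the map tasks, which is consistent with the -cp fix in the original message helping only before the job started.

```java
final class ClasspathCheck {
    // Returns true if the named class can be located by this class's loader.
    // The class is not initialized; we only test whether it can be found.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are the two netty classes named in the thread's errors.
        String[] names = args.length > 0 ? args : new String[] {
            "org.jboss.netty.channel.ChannelFactory",
            "org.jboss.netty.channel.SimpleChannelHandler"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Running it with and without the netty jar on -cp shows the client-side difference; the task-side classpath still depends on what is shipped with the job, which is what bundling netty into penny.jar addresses.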
