Posted to user@mahout.apache.org by Mark <st...@gmail.com> on 2010/09/11 03:44:26 UTC

Classpath question

Perhaps this is a better place to post this (originally posted to Hadoop).


If I submit a jar that has a lib directory that contains a bunch of 
jars, shouldn't those jars be in the classpath and available to all nodes?

The reason I ask this is because I am trying to submit a jar myjar.jar 
that has the following structure

--src
  \.... (My source classes)
-- lib
   \
    -- mahout-collections-0.3.jar
    -- mahout-core-0.3.jar
    -- mahout-math-0.3.jar
    -- hbase-0.20.0.jar
    -- commons-cli-2.0-mahout.jar

Now the job I am trying to run is actually part of mahout-core-0.3... 
not src. The job starts but then fails with the following error

10/09/10 03:15:13 INFO mapred.JobClient: Task Id : attempt_201009100306_0003_r_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:115)
     at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:550)
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
     at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
     at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:113)
     ... 3 more
Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/math/map/OpenObjectIntHashMap
     at org.apache.mahout.fpm.pfpgrowth.ParallelFPGrowthReducer.<init>(ParallelFPGrowthReducer.java:56)
     ... 8 more


Apparently it can't find
org/apache/mahout/math/map/OpenObjectIntHashMap.class, although that
class is definitely in mahout-collections-0.3.jar. Is this a problem
because mahout-core-0.3.jar doesn't have a lib directory?
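
One way to double-check which bundled jar actually carries the missing class is to scan the nested jars under lib/ directly. The following is a minimal Java sketch, not part of Hadoop or Mahout. The class name NestedJarClassFinder and the method findInLib are made up for illustration, and it assumes the job jar keeps its dependencies as lib/*.jar entries, as in the layout above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;

// Hypothetical helper: lists the lib/*.jar entries of a job jar that contain a given class file.
final class NestedJarClassFinder {

    // classResource is a path such as "org/apache/mahout/math/map/OpenObjectIntHashMap.class".
    static List<String> findInLib(Path jobJar, String classResource) throws IOException {
        List<String> hits = new ArrayList<>();
        try (InputStream in = Files.newInputStream(jobJar);
             JarInputStream outer = new JarInputStream(in)) {
            JarEntry entry;
            while ((entry = outer.getNextJarEntry()) != null) {
                String name = entry.getName();
                if (name.startsWith("lib/") && name.endsWith(".jar")
                        && containsEntry(outer, classResource)) {
                    hits.add(name);
                }
            }
        }
        return hits;
    }

    // Treats the current outer entry as a jar and scans its entry names.
    // The inner stream is deliberately not closed, since that would close the outer jar too.
    private static boolean containsEntry(InputStream nestedJar, String wanted) throws IOException {
        JarInputStream inner = new JarInputStream(nestedJar);
        JarEntry e;
        while ((e = inner.getNextJarEntry()) != null) {
            if (e.getName().equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: NestedJarClassFinder <job-jar> <class-resource>");
            return;
        }
        for (String hit : findInLib(Paths.get(args[0]), args[1])) {
            System.out.println(hit);
        }
    }
}
```

Invoked as `java NestedJarClassFinder myjar.jar org/apache/mahout/math/map/OpenObjectIntHashMap.class`, it prints the lib/ jars that hold that entry. This only confirms that the class is packaged; it says nothing about whether the task JVM's classloader can see it.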

What is an easy way around this?

Thanks





Re: Classpath question

Posted by Drew Farris <dr...@apache.org>.
Hi Mark,

I've found that jobs loaded as classes contained in jar files within
the lib directory of a job jar have issues loading classes from jars
also in the lib directory. For example, I create a job jar that
includes all of the mahout (and dependency) jars in lib and execute
org.apache.mahout.classifier.bayes.TrainClassifier (from mahout-core
jar) and run into ClassNotFound exceptions for TokenStream, which
happens to be contained within the lucene jars in the job jar's lib
directory.

(command-line: hadoop jar target/mahout-drew-1.0-SNAPSHOT-job.jar
org.apache.mahout.classifier.bayes.TrainClassifier ..related args..)

I've worked around the problem by unrolling the mahout-core jar and
placing those classes at the top level of my hand-rolled job jar.
You'll see that the mahout-examples-VERSION.job provided with Mahout
does the same. It is generally easier to use the mahout-examples job
file or the mahout command-line utility to run Mahout jobs.
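
As a rough illustration of the "unrolling" described above, here is a minimal Java sketch that copies the entries of several jars into the top level of a single output jar. It is not taken from Mahout's build; the class name FlatJobJarBuilder and the method merge are hypothetical. It makes two simplifying assumptions: everything under META-INF/ is dropped (so service files and signatures are lost), and when two jars contain the same entry name the first one wins.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper: flattens several jars into one jar with all classes at the top level.
final class FlatJobJarBuilder {

    static void merge(List<Path> inputJars, Path outputJar) throws IOException {
        Set<String> seen = new HashSet<>();
        try (OutputStream out = Files.newOutputStream(outputJar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            for (Path jar : inputJars) {
                try (InputStream in = Files.newInputStream(jar);
                     JarInputStream jarIn = new JarInputStream(in)) {
                    JarEntry entry;
                    while ((entry = jarIn.getNextJarEntry()) != null) {
                        String name = entry.getName();
                        // Skip directories, per-jar metadata, and names already copied.
                        if (entry.isDirectory() || name.startsWith("META-INF/") || !seen.add(name)) {
                            continue;
                        }
                        jarOut.putNextEntry(new JarEntry(name));
                        copy(jarIn, jarOut);
                        jarOut.closeEntry();
                    }
                }
            }
        }
    }

    // Copies the remaining bytes of the current input entry to the current output entry.
    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }
}
```

Passing the jar that holds your own classes first, followed by the Mahout and dependency jars, would yield one jar whose package tree sits at the top level, which is the layout described for the mahout-examples job file.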

In your example, I also noticed you include your classes within a
subdirectory named 'src' which may also cause problems unless that
indeed is the name of your top level package directory. You can see
that the mahout-examples-VERSION.job file includes the class package
tree at the top level of the job file.

Although your specific problem doesn't seem to be related to 0.3, you
should consider checking out the Mahout sources from trunk and building
from there. There have been a number of fixes and improvements since
0.3. See https://cwiki.apache.org/confluence/display/MAHOUT/BuildingMahout
for instructions.

HTH,

Drew


Re: Classpath question

Posted by Sean Owen <sr...@gmail.com>.
IIRC this packaging mechanism only works if you add "Class-Path" entries to
META-INF/MANIFEST.MF that specify the location of the .jar within the
archive.
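
For reference, a manifest carrying a "Class-Path" attribute can be produced with the standard java.util.jar API. The sketch below is only an illustration of what such entries look like: the class name ClassPathManifest and its methods are hypothetical, the lib/ paths are assumed from the layout in the original post, and nothing in this thread confirms that Hadoop's task classloader honors these entries for jars nested inside the job jar.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper: builds a manifest whose main section lists jars in Class-Path.
final class ClassPathManifest {

    static Manifest withClassPath(List<String> jarPaths) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // The JAR specification requires Manifest-Version in the main section.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        // Class-Path entries are separated by single spaces.
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", jarPaths));
        return manifest;
    }

    // Renders the manifest as the text that would go into META-INF/MANIFEST.MF.
    static String render(Manifest manifest) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumed paths, mirroring the lib directory from the original post.
        Manifest m = withClassPath(Arrays.asList(
                "lib/mahout-collections-0.3.jar",
                "lib/mahout-math-0.3.jar"));
        System.out.print(render(m));
    }
}
```

The resulting Manifest can be passed to the JarOutputStream(OutputStream, Manifest) constructor when writing the job jar.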

I prefer to simply re-package all dependencies together, like Mahout does.


Re: Classpath question

Posted by Ted Dunning <te...@gmail.com>.
Should?

or

Is?

The answer to the "should" question is possibly.

The answer to the "is" question is no.

This behavior is the reason for the jar-with-dependencies maven assembly
that is built in.  Very handy for this problem.

On Fri, Sep 10, 2010 at 6:44 PM, Mark <st...@gmail.com> wrote:

> If I submit a jar that has a lib directory that contains a bunch of jars,
> shouldn't those jars be in the classpath and available to all nodes?
>
> The reason I ask this is because I am trying to submit a jar myjar.jar that
> has the following structure
>