Posted to mapreduce-user@hadoop.apache.org by Zhengguo 'Mike' SUN <zh...@yahoo.com> on 2011/02/22 20:51:30 UTC

Fw: ClassNotFoundException when using a class as the value of a SequenceFile

 
----- Forwarded Message -----
From: Zhengguo 'Mike' SUN <zh...@yahoo.com>
To: mahout-user <us...@mahout.apache.org>
Cc: mahout-dev <ma...@lucene.apache.org>
Sent: Monday, February 21, 2011 11:20 AM
Subject: LanczosSolver and ClassNotFoundException

Hi All,

I was playing with the LanczosSolver class in Mahout. What I did was copy the code from TestDistributedLanczosSolver.java and try to run it on a shared cluster. I also packaged five jars (core, core-test, math, math-test, and mahout-collection) under the lib/ directory of my own jar. This new jar worked correctly on my local machine under Hadoop's local mode. When I submitted it to the cluster, I got a ClassNotFoundException while running the TimesSquaredJob. The stack trace is as follows:

Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:866)
at org.apache.hadoop.io.WritableName.getClass(WritableName.java:71)
at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:1613)
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1555)
at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:1428)
at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:1417)
at org.apache.hadoop.io.SequenceFile$Reader.&lt;init&gt;(SequenceFile.java:1412)
at org.apache.hadoop.mapred.SequenceFileRecordReader.&lt;init&gt;(SequenceFileRecordReader.java:43)
at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
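From the trace, the failure happens inside Configuration.getClassByName, which ultimately calls Class.forName against the task's classloader, so the reader cannot resolve the value class name recorded in the SequenceFile header. The situation boils down to a bare classloading check like the sketch below (the class name is the one from the error; the wrapper class is just for illustration):

```java
// Illustrative sketch: reproduces the lookup that SequenceFile$Reader
// performs when it resolves the value class from the file header.
public class LoadCheck {
    public static void main(String[] args) {
        // The class name stored in the SequenceFile header
        String name = "org.apache.mahout.math.Vector";
        try {
            Class<?> c = Class.forName(name, true,
                    Thread.currentThread().getContextClassLoader());
            System.out.println("loaded: " + c.getName());
        } catch (ClassNotFoundException e) {
            // This is what the task attempt surfaces as the error
            System.out.println("not found: " + name);
        }
    }
}
```

On a JVM whose classpath does not include the Mahout math jar, this prints the "not found" branch, which matches the task failure above.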

I also wrote a simple MapReduce job to test whether I could access the Vector class, with some naive code like the following:

import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.Vector;

// Allocate a 100-dimensional dense vector and fill it with a constant
Vector v = new DenseVector(100);
v.assign(3.14);

This job worked fine on the cluster. Thus, it seems that referencing the Vector class is not the problem by itself. What could be wrong if it is not a dependency problem?