Posted to user@mahout.apache.org by Mark <st...@gmail.com> on 2011/06/09 02:47:05 UTC

Problems running distributed seq2sparse

Hello all,

I am trying to run seq2sparse as follows:

bin/mahout seq2sparse \
	-i clustering/items-seq \
	-o clustering/items-vectors \
	-wt tfidf \
	-nr 3 \
	-ng 3 \
	-s 5 \
	-md 3 \
	-x  90 \
	-ml 50 \
	-ow

The first task is failing with the following error:

Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop

HADOOP_CONF_DIR=/etc/hadoop/conf
11/06/08 17:39:13 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 3
11/06/08 17:39:13 INFO common.HadoopUtil: Deleting clustering/items-vectors
11/06/08 17:39:13 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 50.0
11/06/08 17:39:13 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 3
11/06/08 17:39:13 INFO input.FileInputFormat: Total input paths to process : 1
11/06/08 17:39:13 INFO mapred.JobClient: Running job: job_201106061352_0055
11/06/08 17:39:14 INFO mapred.JobClient:  map 0% reduce 0%
11/06/08 17:39:18 INFO mapred.JobClient: Task Id : attempt_201106061352_0055_m_000000_0, Status : FAILED
Error: Cannot inherit from final class
11/06/08 17:39:23 INFO mapred.JobClient: Task Id : attempt_201106061352_0055_m_000000_1, Status : FAILED
Error: Cannot inherit from final class
11/06/08 17:39:26 INFO mapred.JobClient: Task Id : attempt_201106061352_0055_m_000000_2, Status : FAILED
Error: Cannot inherit from final class
11/06/08 17:39:31 INFO mapred.JobClient: Job complete: job_201


The logs show:

*_syslog logs_*

2011-06-08 17:39:16,900 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2011-06-08 17:39:17,097 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2011-06-08 17:39:17,372 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2011-06-08 17:39:17,380 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.VerifyError: Cannot inherit from final class
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
	at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.setup(SequenceFileTokenizerMapper.java:57)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


I am running mahout-0.5 from source: I just downloaded a fresh copy and
ran mvn package.

I tried the same using the distribution package, but when I run that,
Hadoop complains about missing jars, i.e. Lucene and Google Preconditions
(wtf?)

Is there something I am doing wrong or is this a possible bug?

Here are my system stats... notice I am running Cloudera 0.20.2

Fedora Core 9
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)

Hadoop 0.20.2-cdh3u0
Subversion  -r 81256ad0f2e4ab2bd34b04f53d25a6c23686dd14
Compiled by root on Fri Mar 25 20:07:24 EDT 2011
From source with checksum 6c1f62dddc4eac69b6b973c18bbc0f55



Re: Problems running distributed seq2sparse

Posted by Mark <st...@gmail.com>.
Ahhh, conflicting jars. That may be the case, as I just checked my nodes
and noticed they have mahout-math and mahout-examples snapshot job jars in
their lib directory.

I'll remove those and see if the problem still exists.

Thanks
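
For anyone hitting the same VerifyError, here is a minimal sketch of the check described above: list any Mahout jars sitting in a node's lib directory. The function name find_mahout_jars is made up for illustration, and using $HADOOP_HOME/lib as the location is an assumption based on the HADOOP_HOME shown in the log; adjust it to wherever your nodes keep extra jars.

```shell
#!/bin/sh
# find_mahout_jars is a hypothetical helper, not part of Mahout or Hadoop.
# It prints every file named mahout-*.jar under the given directory, sorted.
find_mahout_jars() {
    dir="$1"
    # Errors (e.g. a missing directory) are silenced so the output stays clean.
    find "$dir" -type f -name 'mahout-*.jar' 2>/dev/null | sort
}

# Example call (assumed location, run on each node):
#   find_mahout_jars "$HADOOP_HOME/lib"
```

Any output is worth a second look: it may mean an older copy of Mahout is being loaded alongside the classes in the job jar.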

On 6/9/11 12:21 AM, Sean Owen wrote:
> VerifyError indicates that compiled code from two different builds is being
> mixed somehow. It's saying that a situation which wouldn't compile exists in
> the compiled code.
>
> Is your CLASSPATH set to anything including Mahout jars? That's the first
> possibility that comes to mind.
>
> Also, are you using the "job" jar files? These include all dependencies. The
> plain "jar" files just have Mahout classes and won't work as Hadoop jar
> files.

Re: Problems running distributed seq2sparse

Posted by Sean Owen <sr...@gmail.com>.
VerifyError indicates that compiled code from two different builds is being
mixed somehow. It's saying that a situation which wouldn't compile exists in
the compiled code.

Is your CLASSPATH set to anything including Mahout jars? That's the first
possibility that comes to mind.

Also, are you using the "job" jar files? These include all dependencies. The
plain "jar" files just have Mahout classes and won't work as Hadoop jar
files.
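
The CLASSPATH question above can be checked with a few lines of shell. This is only a sketch: classpath_mahout_entries is a hypothetical helper name, and it simply splits a colon-separated classpath string and prints the entries that mention "mahout" (case-insensitively).

```shell
#!/bin/sh
# classpath_mahout_entries is a hypothetical helper, not part of Mahout or Hadoop.
# It takes a colon-separated classpath string and prints, one per line,
# each entry containing "mahout". It returns non-zero when nothing matches.
classpath_mahout_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -i 'mahout'
}

# Example call:
#   classpath_mahout_entries "$CLASSPATH"
```

No output (and a non-zero exit status) means the variable does not reference any Mahout jar. It may also be worth running the same check against HADOOP_CLASSPATH if that is set in your environment.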
