Posted to user@mahout.apache.org by DIPESH KUMAR SINGH <di...@gmail.com> on 2011/11/19 04:54:11 UTC

Error in executing mahout kmeans

Hi,

I was trying to execute the sample kmeans in Mahout on the Reuters dataset to
get myself started with Mahout. After creating the sequence files, I got the
following error.

I am able to execute other map-reduce programs, like wordcount, on my Hadoop
cluster.

I am unable to figure out how to include the missing classes indicated in the
exceptions. Please help.



$ bin/mahout seqdirectory \
        -i mahout-work/reuters-out \
        -o mahout-work/reuters-out-seqdir \
        -c UTF-8 -chunk 5

[hadoop@01HW394491 mahout-distribution-0.5]$ bin/mahout seq2sparse -i
mahout-work/reuters-out-seqdir/ -o
mahout-work/reuters-out-seqdir-sparse-kmeans
Running on hadoop, using
HADOOP_HOME=/home/hadoop/Desktop/Cloudera/hadoop-0.20.2-cdh3u1



HADOOP_CONF_DIR=/home/hadoop/Desktop/Cloudera/hadoop-0.20.2-cdh3u1/conf
11/11/17 19:04:55 WARN driver.MahoutDriver: No seq2sparse.props found
on classpath, will use command-line arguments only
11/11/17 19:04:55 INFO vectorizer.SparseVectorsFromSequenceFiles:
Maximum n-gram size is: 1
11/11/17 19:04:55 INFO vectorizer.SparseVectorsFromSequenceFiles:
Minimum LLR value: 1.0
11/11/17 19:04:55 INFO vectorizer.SparseVectorsFromSequenceFiles:
Number of reduce tasks: 1
11/11/17 19:04:55 INFO common.HadoopUtil: Deleting
mahout-work/reuters-out-seqdir-sparse-kmeans
11/11/17 19:04:55 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:04:55 INFO hdfs.DFSClient: Abandoning block
blk_-6964269590814813644_1827
11/11/17 19:04:55 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:04:56 INFO input.FileInputFormat: Total input paths to process : 1
11/11/17 19:04:56 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:04:56 INFO hdfs.DFSClient: Abandoning block
blk_-4087868027442600259_1828
11/11/17 19:04:56 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:04:56 INFO mapred.JobClient: Running job: job_201111161101_0055
11/11/17 19:04:57 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:05:02 INFO mapred.JobClient: Task Id :
attempt_201111161101_0055_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.lucene.analysis.Analyzer
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:637)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.setup(SequenceFileTokenizerMapper.java:58)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:02 WARN mapred.JobClient: Error reading task outputNo
route to host
11/11/17 19:05:02 WARN mapred.JobClient: Error reading task outputNo
route to host
11/11/17 19:05:04 INFO mapred.JobClient: Task Id :
attempt_201111161101_0055_m_000000_1, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.lucene.analysis.Analyzer
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:637)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.setup(SequenceFileTokenizerMapper.java:58)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

attempt_201111161101_0055_m_000000_1: log4j:WARN No appenders could be
found for logger (org.apache.hadoop.hdfs.DFSClient).
attempt_201111161101_0055_m_000000_1: log4j:WARN Please initialize the
log4j system properly.
11/11/17 19:05:07 INFO mapred.JobClient: Task Id :
attempt_201111161101_0055_m_000000_2, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.lucene.analysis.Analyzer
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:637)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
	at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.setup(SequenceFileTokenizerMapper.java:58)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

attempt_201111161101_0055_m_000000_2: log4j:WARN No appenders could be
found for logger (org.apache.hadoop.hdfs.DFSClient).
attempt_201111161101_0055_m_000000_2: log4j:WARN Please initialize the
log4j system properly.
11/11/17 19:05:11 INFO mapred.JobClient: Job complete: job_201111161101_0055
11/11/17 19:05:11 INFO mapred.JobClient: Counters: 8
11/11/17 19:05:11 INFO mapred.JobClient:   Job Counters
11/11/17 19:05:11 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6453
11/11/17 19:05:11 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:05:11 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:05:11 INFO mapred.JobClient:     Rack-local map tasks=3
11/11/17 19:05:11 INFO mapred.JobClient:     Launched map tasks=4
11/11/17 19:05:11 INFO mapred.JobClient:     Data-local map tasks=1
11/11/17 19:05:11 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
11/11/17 19:05:11 INFO mapred.JobClient:     Failed map tasks=1
11/11/17 19:05:11 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.net.NoRouteToHostException: No route to
host
11/11/17 19:05:11 INFO hdfs.DFSClient: Abandoning block
blk_6421670782557855121_1838
11/11/17 19:05:11 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:11 INFO input.FileInputFormat: Total input paths to process : 0
11/11/17 19:05:11 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:11 INFO hdfs.DFSClient: Abandoning block
blk_7456944308367440338_1839
11/11/17 19:05:11 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:11 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:11 INFO hdfs.DFSClient: Abandoning block
blk_-153916846612015236_1840
11/11/17 19:05:11 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:11 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:11 INFO hdfs.DFSClient: Abandoning block
blk_-7727883400960382563_1841
11/11/17 19:05:11 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:11 INFO mapred.JobClient: Running job: job_201111161101_0056
11/11/17 19:05:12 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:05:17 INFO mapred.JobClient:  map 0% reduce 100%
11/11/17 19:05:17 INFO mapred.JobClient: Job complete: job_201111161101_0056
11/11/17 19:05:17 INFO mapred.JobClient: Counters: 14
11/11/17 19:05:17 INFO mapred.JobClient:   Job Counters
11/11/17 19:05:17 INFO mapred.JobClient:     Launched reduce tasks=1
11/11/17 19:05:17 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2188
11/11/17 19:05:17 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:05:17 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:05:17 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3008
11/11/17 19:05:17 INFO mapred.JobClient:   FileSystemCounters
11/11/17 19:05:17 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=48252
11/11/17 19:05:17 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=86
11/11/17 19:05:17 INFO mapred.JobClient:   Map-Reduce Framework
11/11/17 19:05:17 INFO mapred.JobClient:     Reduce input groups=0
11/11/17 19:05:17 INFO mapred.JobClient:     Combine output records=0
11/11/17 19:05:17 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/11/17 19:05:17 INFO mapred.JobClient:     Reduce output records=0
11/11/17 19:05:17 INFO mapred.JobClient:     Spilled Records=0
11/11/17 19:05:17 INFO mapred.JobClient:     Combine input records=0
11/11/17 19:05:17 INFO mapred.JobClient:     Reduce input records=0
11/11/17 19:05:18 INFO input.FileInputFormat: Total input paths to process : 0
11/11/17 19:05:18 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:18 INFO hdfs.DFSClient: Abandoning block
blk_6058238216343180279_1849
11/11/17 19:05:18 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:18 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.net.NoRouteToHostException: No route to
host
11/11/17 19:05:18 INFO hdfs.DFSClient: Abandoning block
blk_-4652905215564225407_1850
11/11/17 19:05:18 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:18 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:18 INFO hdfs.DFSClient: Abandoning block
blk_8404535034292914296_1851
11/11/17 19:05:18 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:18 INFO mapred.JobClient: Running job: job_201111161101_0057
11/11/17 19:05:19 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:05:24 INFO mapred.JobClient: Task Id :
attempt_201111161101_0057_r_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.mapreduce.JobContext.getReducerClass(JobContext.java:236)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:556)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:27 INFO mapred.JobClient: Task Id :
attempt_201111161101_0057_r_000000_1, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.mapreduce.JobContext.getReducerClass(JobContext.java:236)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:556)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:31 INFO mapred.JobClient: Task Id :
attempt_201111161101_0057_r_000000_2, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.mapreduce.JobContext.getReducerClass(JobContext.java:236)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:556)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:36 INFO mapred.JobClient: Job complete: job_201111161101_0057
11/11/17 19:05:36 INFO mapred.JobClient: Counters: 6
11/11/17 19:05:36 INFO mapred.JobClient:   Job Counters
11/11/17 19:05:36 INFO mapred.JobClient:     Launched reduce tasks=4
11/11/17 19:05:36 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=1970
11/11/17 19:05:36 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:05:36 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:05:36 INFO mapred.JobClient:     Failed reduce tasks=1
11/11/17 19:05:36 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=4455
11/11/17 19:05:36 INFO input.FileInputFormat: Total input paths to process : 0
11/11/17 19:05:36 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:36 INFO hdfs.DFSClient: Abandoning block
blk_1521784863216133803_1856
11/11/17 19:05:36 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:36 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:36 INFO hdfs.DFSClient: Abandoning block
blk_2966156400158038888_1857
11/11/17 19:05:36 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:36 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.net.NoRouteToHostException: No route to
host
11/11/17 19:05:36 INFO hdfs.DFSClient: Abandoning block
blk_-1789415449512488017_1858
11/11/17 19:05:36 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:36 INFO mapred.JobClient: Running job: job_201111161101_0058
11/11/17 19:05:37 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:05:42 INFO mapred.JobClient: Task Id :
attempt_201111161101_0058_r_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:47 INFO mapred.JobClient: Task Id :
attempt_201111161101_0058_r_000000_1, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:47 WARN mapred.JobClient: Error reading task outputNo
route to host
11/11/17 19:05:47 WARN mapred.JobClient: Error reading task outputNo
route to host
11/11/17 19:05:51 INFO mapred.JobClient: Task Id :
attempt_201111161101_0058_r_000000_2, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:05:55 INFO mapred.JobClient: Job complete: job_201111161101_0058
11/11/17 19:05:55 INFO mapred.JobClient: Counters: 6
11/11/17 19:05:55 INFO mapred.JobClient:   Job Counters
11/11/17 19:05:55 INFO mapred.JobClient:     Launched reduce tasks=4
11/11/17 19:05:55 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2307
11/11/17 19:05:55 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:05:55 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:05:55 INFO mapred.JobClient:     Failed reduce tasks=1
11/11/17 19:05:55 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=4748
11/11/17 19:05:55 INFO common.HadoopUtil: Deleting
mahout-work/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
11/11/17 19:05:55 INFO input.FileInputFormat: Total input paths to process : 0
11/11/17 19:05:55 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:05:55 INFO hdfs.DFSClient: Abandoning block
blk_5223676092912060078_1863
11/11/17 19:05:55 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:05:55 INFO mapred.JobClient: Running job: job_201111161101_0059
11/11/17 19:05:56 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:06:01 INFO mapred.JobClient:  map 0% reduce 100%
11/11/17 19:06:01 INFO mapred.JobClient: Job complete: job_201111161101_0059
11/11/17 19:06:01 INFO mapred.JobClient: Counters: 14
11/11/17 19:06:01 INFO mapred.JobClient:   Job Counters
11/11/17 19:06:01 INFO mapred.JobClient:     Launched reduce tasks=1
11/11/17 19:06:01 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2172
11/11/17 19:06:01 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:06:01 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:06:01 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=3226
11/11/17 19:06:01 INFO mapred.JobClient:   FileSystemCounters
11/11/17 19:06:01 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=48107
11/11/17 19:06:01 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=93
11/11/17 19:06:01 INFO mapred.JobClient:   Map-Reduce Framework
11/11/17 19:06:01 INFO mapred.JobClient:     Reduce input groups=0
11/11/17 19:06:01 INFO mapred.JobClient:     Combine output records=0
11/11/17 19:06:01 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/11/17 19:06:01 INFO mapred.JobClient:     Reduce output records=0
11/11/17 19:06:01 INFO mapred.JobClient:     Spilled Records=0
11/11/17 19:06:01 INFO mapred.JobClient:     Combine input records=0
11/11/17 19:06:01 INFO mapred.JobClient:     Reduce input records=0
11/11/17 19:06:01 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.net.NoRouteToHostException: No route to
host
11/11/17 19:06:01 INFO hdfs.DFSClient: Abandoning block
blk_-285444083030904878_1872
11/11/17 19:06:01 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:06:01 INFO input.FileInputFormat: Total input paths to process : 0
11/11/17 19:06:01 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:06:01 INFO hdfs.DFSClient: Abandoning block
blk_6357897003952259140_1873
11/11/17 19:06:01 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:06:01 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:06:01 INFO hdfs.DFSClient: Abandoning block
blk_-5863602409852431971_1874
11/11/17 19:06:01 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:06:01 INFO mapred.JobClient: Running job: job_201111161101_0060
11/11/17 19:06:02 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:06:06 INFO mapred.JobClient: Task Id :
attempt_201111161101_0060_r_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:06:10 INFO mapred.JobClient: Task Id :
attempt_201111161101_0060_r_000000_1, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:06:14 INFO mapred.JobClient: Task Id :
attempt_201111161101_0060_r_000000_2, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:06:14 WARN mapred.JobClient: Error reading task outputNo
route to host
11/11/17 19:06:14 WARN mapred.JobClient: Error reading task outputNo
route to host
11/11/17 19:06:19 INFO mapred.JobClient: Job complete: job_201111161101_0060
11/11/17 19:06:19 INFO mapred.JobClient: Counters: 6
11/11/17 19:06:19 INFO mapred.JobClient:   Job Counters
11/11/17 19:06:19 INFO mapred.JobClient:     Launched reduce tasks=4
11/11/17 19:06:19 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=1953
11/11/17 19:06:19 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:06:19 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:06:19 INFO mapred.JobClient:     Failed reduce tasks=1
11/11/17 19:06:19 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=4537
11/11/17 19:06:20 INFO input.FileInputFormat: Total input paths to process : 0
11/11/17 19:06:20 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.io.IOException: Bad connect ack with
firstBadLink as 172.29.178.105:50010
11/11/17 19:06:20 INFO hdfs.DFSClient: Abandoning block
blk_2087335504677600310_1880
11/11/17 19:06:20 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:06:20 INFO hdfs.DFSClient: Exception in
createBlockOutputStream java.net.NoRouteToHostException: No route to
host
11/11/17 19:06:20 INFO hdfs.DFSClient: Abandoning block
blk_-5955264813512981601_1882
11/11/17 19:06:20 INFO hdfs.DFSClient: Excluding datanode 172.29.178.105:50010
11/11/17 19:06:20 INFO mapred.JobClient: Running job: job_201111161101_0061
11/11/17 19:06:21 INFO mapred.JobClient:  map 0% reduce 0%
11/11/17 19:06:26 INFO mapred.JobClient: Task Id :
attempt_201111161101_0061_r_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:06:30 INFO mapred.JobClient: Task Id :
attempt_201111161101_0061_r_000000_1, Status : FAILED
11/11/17 19:06:34 INFO mapred.JobClient: Task Id :
attempt_201111161101_0061_r_000000_2, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
	at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1020)
	at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:950)
	at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:750)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2360)
	at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:582)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:395)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
	at org.apache.hadoop.mapred.Child.main(Child.java:264)

11/11/17 19:06:38 INFO mapred.JobClient: Job complete: job_201111161101_0061
11/11/17 19:06:38 INFO mapred.JobClient: Counters: 6
11/11/17 19:06:38 INFO mapred.JobClient:   Job Counters
11/11/17 19:06:38 INFO mapred.JobClient:     Launched reduce tasks=4
11/11/17 19:06:38 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2193
11/11/17 19:06:38 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
11/11/17 19:06:38 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
11/11/17 19:06:38 INFO mapred.JobClient:     Failed reduce tasks=1
11/11/17 19:06:38 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=4302
11/11/17 19:06:38 INFO common.HadoopUtil: Deleting
mahout-work/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
11/11/17 19:06:38 INFO driver.MahoutDriver: Program took 103338 ms
[hadoop@01HW394491 mahout-distribution-0.5]$





Thanks and Regards,

Dipesh

-- 
Dipesh Kr. Singh

Re: Error in executing mahout kmeans

Posted by Lance Norskog <go...@gmail.com>.
Is it possible that the Hadoop job jar mechanism is broken? Try disabling
the distributed Hadoop feature, and run "pseudo-distributed":
unset HADOOP_HOME
sh examples/bin/build-reuters.sh
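
If the local run works, the next thing to check is whether the driver can find
a job jar to ship to the cluster at all. A minimal sanity check, assuming the
layout of the 0.5 binary distribution used in the log above (adjust the path
to your install):

    # The job jar bundles Mahout and its dependencies (Lucene, mahout-math, ...).
    # bin/mahout should submit it via "hadoop jar" when HADOOP_HOME is set.
    ls -l mahout-examples-0.5-job.jar

If that jar is missing or truncated, the task JVMs have nothing to load the
missing classes from, which matches the ClassNotFoundExceptions in the log.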

On Tue, Nov 22, 2011 at 8:25 PM, Lance Norskog <go...@gmail.com> wrote:

> There is something wrong with how you are building the Mahout source. This
> is the sequence that should work:
>
> First, remove your Maven module download directory. Usually this is
> /home/dipesh/.m2. Your build will now download all of the dependencies. (It
> is not usually the problem, but it helps to do everything from the
> beginning.)
>
> export MAHOUT_HOME=/your/path/of/source/code
> cd $MAHOUT_HOME
> mvn clean install
> bin/mahout
>
> This should give you a list of the commands.
>
> Now run the reuters script.
>
>
> On Tue, Nov 22, 2011 at 1:36 PM, Isabel Drost <is...@apache.org> wrote:
>
>> On 22.11.2011 DIPESH KUMAR SINGH wrote:
>> > I ran the script and i was getting error regarding missing libraries. The
>> > error which i got is attached.
>> > Then i tried executing the commands in the script, command by command, and
>> > i figured out that error was coming
>> > in the seq2sparse step. (Prior to this step all the conversions are working
>> > fine)
>>
>> There seem to be problems resolving some of the dependencies used - not
>> sure why though. You did compile the project and in that process created a
>> job jar?
>>
>> > What i exactly want to try is document clustering, i thought it is better
>> > to try first with Reuters dataset to get started.
>> > Are the source files of kmeans (mapper and reducer etc) are there in
>> > mahout source folder?
>>
>> Sure, look in the maven module core in the o.a.m.clustering package - all
>> kmeans related code is in there.
>>
>> Isabel
>>
>>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
>


-- 
Lance Norskog
goksron@gmail.com

Re: Error in executing mahout kmeans

Posted by Lance Norskog <go...@gmail.com>.
There is something wrong with how you are building the Mahout source. This
is the sequence that should work:

First, remove your Maven module download directory. Usually this is
/home/dipesh/.m2. Your build will now download all of the dependencies. (It
is not usually the problem, but it helps to do everything from the
beginning.)

export MAHOUT_HOME=/your/path/of/source/code
cd $MAHOUT_HOME
mvn clean install
bin/mahout

This should give you a list of the commands.

Now run the reuters script.
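
A hedged way to verify that the build actually produced the job jars that
bin/mahout relies on (module layout assumed for a 0.5 source tree; adjust the
paths if yours differ):

    ls core/target/mahout-core-*-job.jar examples/target/mahout-examples-*-job.jar

If those jars are missing, the seq2sparse tasks have nothing to load
org.apache.lucene.analysis.Analyzer or org.apache.mahout.math.Vector from,
which matches the two ClassNotFoundExceptions in the log.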

On Tue, Nov 22, 2011 at 1:36 PM, Isabel Drost <is...@apache.org> wrote:

> On 22.11.2011 DIPESH KUMAR SINGH wrote:
> > I ran the script and i was getting error regarding missing libraries. The
> > error which i got is attached.
> > Then i tried executing the commands in the script, command by command, and
> > i figured out that error was coming
> > in the seq2sparse step. (Prior to this step all the conversions are working
> > fine)
>
> There seem to be problems resolving some of the dependencies used - not
> sure why though. You did compile the project and in that process created a
> job jar?
>
> > What i exactly want to try is document clustering, i thought it is better
> > to try first with Reuters dataset to get started.
> > Are the source files of kmeans (mapper and reducer etc) are there in
> > mahout source folder?
>
> Sure, look in the maven module core in the o.a.m.clustering package - all
> kmeans related code is in there.
>
> Isabel
>
>



-- 
Lance Norskog
goksron@gmail.com

Re: Error in executing mahout kmeans

Posted by Isabel Drost <is...@apache.org>.
On 22.11.2011 DIPESH KUMAR SINGH wrote:
> I ran the script and i was getting error regarding missing libraries. The
> error which i got is attached.
> Then i tried executing the commands in the script, command by command, and
> i figured out that error was coming
> in the seq2sparse step. (Prior to this step all the conversions are working
> fine)

There seem to be problems resolving some of the dependencies used - not sure why 
though. You did compile the project and in that process created a job jar?


> What i exactly want to try is document clustering, i thought it is better
> to try first with Reuters dataset to get started.
> Are the source files of kmeans (mapper and reducer etc) are there in mahout
> source folder?

Sure, look in the maven module core in the o.a.m.clustering package - all kmeans 
related code is in there.
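
For orientation, a hedged way to list those sources in a checkout (standard
Maven directory layout assumed):

    find core/src/main/java/org/apache/mahout/clustering/kmeans -name '*.java'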

Isabel

Re: Error in executing mahout kmeans

Posted by DIPESH KUMAR SINGH <di...@gmail.com>.
Lance -

I ran the script and I got an error about missing libraries. The error I got
is attached.

Then I tried executing the commands in the script, command by command, and I
figured out that the error comes from the seq2sparse step. (Prior to this
step, all the conversions work fine.)

What I actually want to do is document clustering; I thought it would be best
to start with the Reuters dataset. Are the source files for kmeans (the
mapper, reducer, etc.) in the Mahout source folder?

Thanks,

Dipesh


On Tue, Nov 22, 2011 at 8:50 AM, Lance Norskog <go...@gmail.com> wrote:

> Dipesh-
>
> To run the Reuters dataset, use examples/bin/build-reuters.sh. There are a
> lot of options and it is easier to see how it all works.
>
> DisplayKMeans is a standalone Swing program that shows a small fabricated
> set of points as an educational tool. It does not show your data.
>
> If you want to do that, there is an option to export clusters into a
> displayable format called 'graphml'. When you have your clusters created,
> run 'mahout clusterdump'. Use 'output format' of GML. There is a separate
> app called 'Gephi' that can read files in this format.
>
> On Mon, Nov 21, 2011 at 7:10 PM, DIPESH KUMAR SINGH
> <di...@gmail.com>wrote:
>
> > Mahout is installed as i can get some output on executing $mahout
> >
> > I am not sure all the libraries are build or not.
> >
> > Just want to know presence of which all files would let me know that
> mahout
> > is build? (or any folder etc)
> >
> >
> >
> >
> >
> > On Tue, Nov 22, 2011 at 1:16 AM, Dan Beaulieu
> > <da...@gmail.com>wrote:
> >
> > > Have you built mahout? You'll need to do that via:
> > > $mvn install
> > >
> > >
> > >
> > > On Monday, November 21, 2011, DIPESH KUMAR SINGH <
> dipesh.tech@gmail.com>
> > > wrote:
> > > > I am unable to figure out how to use .job file.
> > > >
> > > > Do i need to build the DisplayKmeans.java file, by compiling (javac)
> > and
> > > > making jars etc.?
> > > >
> > > > To get started, i was trying to just run kmeans example in mahout
> from
> > > CLI.
> > > >
> > > > I could make the hadoop sequence files in hdfs, but on running
> > > seq2sparse,
> > > > i am getting following 2 errors.
> > > >
> > > > (I was following this ppt :
> > > > Link<
> > >
> > >
> >
> http://assets.en.oreilly.com/1/event/61/Hands%20On%20Mahout%20-%20Mammoth%20Scale%20Machine%20Learning%20Presentation.ppt
> > > >)
> > > >
> > > > Error: java.lang.ClassNotFoundException:
> > > > org.apache.lucene.analysis.Analyzer
> > > > Error: java.lang.ClassNotFoundException:
> org.apache.mahout.math.Vector
> > > >
> > > > It would be great, if someone can guide me through the specific steps
> > and
> > > > help me get started.
> > > >
> > > > Forgive me for my basic questions, i am new to mahout.
> > > >
> > > > Thanks & Regards,
> > > >
> > > > Dipesh
> > > >
> > > > On Sat, Nov 19, 2011 at 2:48 PM, Sean Owen <sr...@gmail.com> wrote:
> > > >
> > > >> You are not using the .job file, which has all the dependencies that
> > you
> > > >> need to send to Hadoop. I think you need to build the project.
> > > >>
> > > >> On Sat, Nov 19, 2011 at 3:54 AM, DIPESH KUMAR SINGH
> > > >> <di...@gmail.com>wrote:
> > > >>
> > > >> > Hi,
> > > >> >
> > > >> > I was trying to execute sample kmeans in mahout on reuters dataset
> > to
> > > get
> > > >> > myself started with mahout. After creating the sequence files, i
> got
> > > the
> > > >> > following error.
> > > >> >
> > > >> > I am able to execute other map-reduce programs like wordcount on
> my
> > > >> hadoop
> > > >> > cluster.
> > > >> >
> > > >> > I am unable to figure how to include these missing classes which
> are
> > > >> > indicated in exception. Please help.
> > > >>  >
> > > >>
> > > >
> > > >
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Dipesh Kr. Singh
> > > >
> > >
> >
> >
> >
> > --
> > Dipesh Kr. Singh
> >
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>



-- 
Dipesh Kr. Singh

Re: Error in executing mahout kmeans

Posted by Lance Norskog <go...@gmail.com>.
Dipesh-

To run the Reuters dataset, use examples/bin/build-reuters.sh. There are a
lot of options and it is easier to see how it all works.

DisplayKMeans is a standalone Swing program that shows a small fabricated
set of points as an educational tool. It does not show your data.
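
If you still want to launch it, one hedged way (the class name and job jar
path are assumptions based on a 0.5 build, and it needs a graphical display):

    java -cp examples/target/mahout-examples-0.5-job.jar \
        org.apache.mahout.clustering.display.DisplayKMeans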

If you want to do that, there is an option to export clusters into a
displayable format called 'graphml'. When you have your clusters created,
run 'mahout clusterdump'. Use 'output format' of GML. There is a separate
app called 'Gephi' that can read files in this format.
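
A hedged sketch of that step; the input paths are placeholders and the exact
option names (including the one that selects GraphML/GML output) vary between
releases, so check bin/mahout clusterdump --help before running it:

    # Placeholder paths from the Reuters example; adjust to your clustering output.
    bin/mahout clusterdump \
        -s mahout-work/reuters-kmeans/clusters-10 \
        -d mahout-work/reuters-out-seqdir-sparse-kmeans/dictionary.file-0 \
        -dt sequencefile \
        -o reuters-clusters.graphml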

On Mon, Nov 21, 2011 at 7:10 PM, DIPESH KUMAR SINGH
<di...@gmail.com>wrote:

> Mahout is installed as i can get some output on executing $mahout
>
> I am not sure all the libraries are build or not.
>
> Just want to know presence of which all files would let me know that mahout
> is build? (or any folder etc)
>
>
>
>
>
> On Tue, Nov 22, 2011 at 1:16 AM, Dan Beaulieu
> <da...@gmail.com>wrote:
>
> > Have you built mahout? You'll need to do that via:
> > $mvn install
> >
> >
> >
> > On Monday, November 21, 2011, DIPESH KUMAR SINGH <di...@gmail.com>
> > wrote:
> > > I am unable to figure out how to use .job file.
> > >
> > > Do i need to build the DisplayKmeans.java file, by compiling (javac)
> and
> > > making jars etc.?
> > >
> > > To get started, i was trying to just run kmeans example in mahout from
> > CLI.
> > >
> > > I could make the hadoop sequence files in hdfs, but on running
> > seq2sparse,
> > > i am getting following 2 errors.
> > >
> > > (I was following this ppt :
> > > Link<
> >
> >
> http://assets.en.oreilly.com/1/event/61/Hands%20On%20Mahout%20-%20Mammoth%20Scale%20Machine%20Learning%20Presentation.ppt
> > >)
> > >
> > > Error: java.lang.ClassNotFoundException:
> > > org.apache.lucene.analysis.Analyzer
> > > Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
> > >
> > > It would be great, if someone can guide me through the specific steps
> and
> > > help me get started.
> > >
> > > Forgive me for my basic questions, i am new to mahout.
> > >
> > > Thanks & Regards,
> > >
> > > Dipesh
> > >
> > > On Sat, Nov 19, 2011 at 2:48 PM, Sean Owen <sr...@gmail.com> wrote:
> > >
> > >> You are not using the .job file, which has all the dependencies that
> you
> > >> need to send to Hadoop. I think you need to build the project.
> > >>
> > >> On Sat, Nov 19, 2011 at 3:54 AM, DIPESH KUMAR SINGH
> > >> <di...@gmail.com>wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I was trying to execute sample kmeans in mahout on reuters dataset
> to
> > get
> > >> > myself started with mahout. After creating the sequence files, i got
> > the
> > >> > following error.
> > >> >
> > >> > I am able to execute other map-reduce programs like wordcount on my
> > >> hadoop
> > >> > cluster.
> > >> >
> > >> > I am unable to figure how to include these missing classes which are
> > >> > indicated in exception. Please help.
> > >>  >
> > >>
> > >
> > >
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Dipesh Kr. Singh
> > >
> >
>
>
>
> --
> Dipesh Kr. Singh
>



-- 
Lance Norskog
goksron@gmail.com

Re: Error in executing mahout kmeans

Posted by DIPESH KUMAR SINGH <di...@gmail.com>.
Mahout is installed, as I get some output when executing $ mahout.

I am not sure whether all the libraries are built or not.

I just want to know which files (or folders) should be present to tell me
that Mahout is built.





On Tue, Nov 22, 2011 at 1:16 AM, Dan Beaulieu
<da...@gmail.com>wrote:

> Have you built mahout? You'll need to do that via:
> $mvn install
>
>
>
> On Monday, November 21, 2011, DIPESH KUMAR SINGH <di...@gmail.com>
> wrote:
> > I am unable to figure out how to use .job file.
> >
> > Do i need to build the DisplayKmeans.java file, by compiling (javac) and
> > making jars etc.?
> >
> > To get started, i was trying to just run kmeans example in mahout from
> CLI.
> >
> > I could make the hadoop sequence files in hdfs, but on running
> seq2sparse,
> > i am getting following 2 errors.
> >
> > (I was following this ppt :
> > Link<
>
> http://assets.en.oreilly.com/1/event/61/Hands%20On%20Mahout%20-%20Mammoth%20Scale%20Machine%20Learning%20Presentation.ppt
> >)
> >
> > Error: java.lang.ClassNotFoundException:
> > org.apache.lucene.analysis.Analyzer
> > Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
> >
> > It would be great, if someone can guide me through the specific steps and
> > help me get started.
> >
> > Forgive me for my basic questions, i am new to mahout.
> >
> > Thanks & Regards,
> >
> > Dipesh
> >
> > On Sat, Nov 19, 2011 at 2:48 PM, Sean Owen <sr...@gmail.com> wrote:
> >
> >> You are not using the .job file, which has all the dependencies that you
> >> need to send to Hadoop. I think you need to build the project.
> >>
> >> On Sat, Nov 19, 2011 at 3:54 AM, DIPESH KUMAR SINGH
> >> <di...@gmail.com>wrote:
> >>
> >> > Hi,
> >> >
> >> > I was trying to execute sample kmeans in mahout on reuters dataset to
> get
> >> > myself started with mahout. After creating the sequence files, i got
> the
> >> > following error.
> >> >
> >> > I am able to execute other map-reduce programs like wordcount on my
> >> hadoop
> >> > cluster.
> >> >
> >> > I am unable to figure how to include these missing classes which are
> >> > indicated in exception. Please help.
> >>  >
> >>
> >
> >
> >> >
> >>
> >
> >
> >
> > --
> > Dipesh Kr. Singh
> >
>



-- 
Dipesh Kr. Singh

Re: Error in executing mahout kmeans

Posted by Dan Beaulieu <da...@gmail.com>.
Have you built Mahout? You'll need to do that via:
$mvn install



On Monday, November 21, 2011, DIPESH KUMAR SINGH <di...@gmail.com>
wrote:
> I am unable to figure out how to use .job file.
>
> Do i need to build the DisplayKmeans.java file, by compiling (javac) and
> making jars etc.?
>
> To get started, i was trying to just run kmeans example in mahout from
CLI.
>
> I could make the hadoop sequence files in hdfs, but on running seq2sparse,
> i am getting following 2 errors.
>
> (I was following this ppt :
> Link<
http://assets.en.oreilly.com/1/event/61/Hands%20On%20Mahout%20-%20Mammoth%20Scale%20Machine%20Learning%20Presentation.ppt
>)
>
> Error: java.lang.ClassNotFoundException:
> org.apache.lucene.analysis.Analyzer
> Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
>
> It would be great, if someone can guide me through the specific steps and
> help me get started.
>
> Forgive me for my basic questions, i am new to mahout.
>
> Thanks & Regards,
>
> Dipesh
>
> On Sat, Nov 19, 2011 at 2:48 PM, Sean Owen <sr...@gmail.com> wrote:
>
>> You are not using the .job file, which has all the dependencies that you
>> need to send to Hadoop. I think you need to build the project.
>>
>> On Sat, Nov 19, 2011 at 3:54 AM, DIPESH KUMAR SINGH
>> <di...@gmail.com>wrote:
>>
>> > Hi,
>> >
>> > I was trying to execute sample kmeans in mahout on reuters dataset to
get
>> > myself started with mahout. After creating the sequence files, i got
the
>> > following error.
>> >
>> > I am able to execute other map-reduce programs like wordcount on my
>> hadoop
>> > cluster.
>> >
>> > I am unable to figure how to include these missing classes which are
>> > indicated in exception. Please help.
>>  >
>>
>
>
>> >
>>
>
>
>
> --
> Dipesh Kr. Singh
>

Re: Error in executing mahout kmeans

Posted by DIPESH KUMAR SINGH <di...@gmail.com>.
I am unable to figure out how to use the .job file.

Do I need to build the DisplayKMeans.java file by compiling it (javac) and
making jars, etc.?

To get started, I was trying to just run the kmeans example in Mahout from
the CLI.

I could make the Hadoop sequence files in HDFS, but on running seq2sparse I
am getting the following 2 errors.

(I was following this ppt:
Link<http://assets.en.oreilly.com/1/event/61/Hands%20On%20Mahout%20-%20Mammoth%20Scale%20Machine%20Learning%20Presentation.ppt>)

Error: java.lang.ClassNotFoundException:
org.apache.lucene.analysis.Analyzer
Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector

It would be great if someone could guide me through the specific steps and
help me get started.

Forgive me for my basic questions; I am new to Mahout.

Thanks & Regards,

Dipesh

On Sat, Nov 19, 2011 at 2:48 PM, Sean Owen <sr...@gmail.com> wrote:

> You are not using the .job file, which has all the dependencies that you
> need to send to Hadoop. I think you need to build the project.
>
> On Sat, Nov 19, 2011 at 3:54 AM, DIPESH KUMAR SINGH
> <di...@gmail.com>wrote:
>
> > Hi,
> >
> > I was trying to execute sample kmeans in mahout on reuters dataset to get
> > myself started with mahout. After creating the sequence files, i got the
> > following error.
> >
> > I am able to execute other map-reduce programs like wordcount on my
> hadoop
> > cluster.
> >
> > I am unable to figure how to include these missing classes which are
> > indicated in exception. Please help.
>  >
>


> >
>



-- 
Dipesh Kr. Singh

Re: Error in executing mahout kmeans

Posted by Sean Owen <sr...@gmail.com>.
You are not using the .job file, which has all the dependencies that you
need to send to Hadoop. I think you need to build the project.
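
For illustration, once a job jar exists it can also be submitted to Hadoop
directly; the jar name and driver class below are assumptions based on the 0.5
layout and the seq2sparse step in the log, not the only way to run it:

    hadoop jar mahout-examples-0.5-job.jar \
        org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles \
        -i mahout-work/reuters-out-seqdir/ \
        -o mahout-work/reuters-out-seqdir-sparse-kmeans

bin/mahout does essentially this for you once it finds the job jar.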

On Sat, Nov 19, 2011 at 3:54 AM, DIPESH KUMAR SINGH
<di...@gmail.com>wrote:

> Hi,
>
> I was trying to execute sample kmeans in mahout on reuters dataset to get
> myself started with mahout. After creating the sequence files, i got the
> following error.
>
> I am able to execute other map-reduce programs like wordcount on my hadoop
> cluster.
>
> I am unable to figure how to include these missing classes which are
> indicated in exception. Please help.
>
>