Posted to dev@mahout.apache.org by Apache Jenkins Server <je...@builds.apache.org> on 2012/03/16 20:32:44 UTC

Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #73

See <https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/73/changes>

Changes:

[pranjan] MAHOUT-981, MAHOUT-983. Refactored K-Means Clustering and Dirichlet Clustering to use ClusterClassificationDriver. 
Using cluster.getModel().configure() in ClusterClassificationDriver in order to configure DirichletCluster for MahalanobisDistanceMeasure. 
Added/fixed test cases by:
Using separate directories in test cases for supplying initial clusters and to store buildClusters to prevent two cluster-*-final files in the same directory.
Writing IntWritable in test cases instead of LongWritable ( As the ClusterClassificationDriver clusters records with IntWritable keys).

------------------------------------------
[...truncated 6371 lines...]
12/03/16 19:32:11 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/16 19:32:11 INFO mapred.JobClient:     Reduce output records=21578
12/03/16 19:32:11 INFO mapred.JobClient:     Spilled Records=43156
12/03/16 19:32:11 INFO mapred.JobClient:     Map output bytes=17337483
12/03/16 19:32:11 INFO mapred.JobClient:     Combine input records=0
12/03/16 19:32:11 INFO mapred.JobClient:     Map output records=21578
12/03/16 19:32:11 INFO mapred.JobClient:     SPLIT_RAW_BYTES=151
12/03/16 19:32:11 INFO mapred.JobClient:     Reduce input records=21578
12/03/16 19:32:11 INFO common.HadoopUtil: Deleting /tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
12/03/16 19:32:12 INFO input.FileInputFormat: Total input paths to process : 1
12/03/16 19:32:12 INFO mapred.JobClient: Running job: job_local_0007
12/03/16 19:32:12 INFO mapred.MapTask: io.sort.mb = 100
12/03/16 19:32:12 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/16 19:32:12 INFO mapred.MapTask: record buffer = 262144/327680
12/03/16 19:32:12 INFO mapred.MapTask: Starting flush of map output
12/03/16 19:32:12 INFO mapred.MapTask: Finished spill 0
12/03/16 19:32:12 INFO mapred.Task: Task:attempt_local_0007_m_000000_0 is done. And is in the process of commiting
12/03/16 19:32:13 INFO mapred.JobClient:  map 0% reduce 0%
12/03/16 19:32:15 INFO mapred.LocalJobRunner: 
12/03/16 19:32:15 INFO mapred.Task: Task 'attempt_local_0007_m_000000_0' done.
12/03/16 19:32:15 INFO mapred.LocalJobRunner: 
12/03/16 19:32:15 INFO mapred.Merger: Merging 1 sorted segments
12/03/16 19:32:15 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 4719200 bytes
12/03/16 19:32:15 INFO mapred.LocalJobRunner: 
12/03/16 19:32:15 INFO mapred.Task: Task:attempt_local_0007_r_000000_0 is done. And is in the process of commiting
12/03/16 19:32:15 INFO mapred.LocalJobRunner: 
12/03/16 19:32:15 INFO mapred.Task: Task attempt_local_0007_r_000000_0 is allowed to commit now
12/03/16 19:32:15 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0007_r_000000_0' to /tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
12/03/16 19:32:16 INFO mapred.JobClient:  map 100% reduce 0%
12/03/16 19:32:18 INFO mapred.LocalJobRunner: reduce > reduce
12/03/16 19:32:18 INFO mapred.Task: Task 'attempt_local_0007_r_000000_0' done.
12/03/16 19:32:19 INFO mapred.JobClient:  map 100% reduce 100%
12/03/16 19:32:19 INFO mapred.JobClient: Job complete: job_local_0007
12/03/16 19:32:19 INFO mapred.JobClient: Counters: 16
12/03/16 19:32:19 INFO mapred.JobClient:   File Output Format Counters 
12/03/16 19:32:19 INFO mapred.JobClient:     Bytes Written=4914503
12/03/16 19:32:19 INFO mapred.JobClient:   FileSystemCounters
12/03/16 19:32:19 INFO mapred.JobClient:     FILE_BYTES_READ=685573082
12/03/16 19:32:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=595257121
12/03/16 19:32:19 INFO mapred.JobClient:   File Input Format Counters 
12/03/16 19:32:19 INFO mapred.JobClient:     Bytes Read=4914503
12/03/16 19:32:19 INFO mapred.JobClient:   Map-Reduce Framework
12/03/16 19:32:19 INFO mapred.JobClient:     Reduce input groups=21578
12/03/16 19:32:19 INFO mapred.JobClient:     Map output materialized bytes=4719204
12/03/16 19:32:19 INFO mapred.JobClient:     Combine output records=0
12/03/16 19:32:19 INFO mapred.JobClient:     Map input records=21578
12/03/16 19:32:19 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/16 19:32:19 INFO mapred.JobClient:     Reduce output records=21578
12/03/16 19:32:19 INFO mapred.JobClient:     Spilled Records=43156
12/03/16 19:32:19 INFO mapred.JobClient:     Map output bytes=4659281
12/03/16 19:32:19 INFO mapred.JobClient:     Combine input records=0
12/03/16 19:32:19 INFO mapred.JobClient:     Map output records=21578
12/03/16 19:32:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=158
12/03/16 19:32:19 INFO mapred.JobClient:     Reduce input records=21578
12/03/16 19:32:19 INFO common.HadoopUtil: Deleting /tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
12/03/16 19:32:19 INFO driver.MahoutDriver: Program took 75870 ms (Minutes: 1.2645)
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
no HADOOP_HOME set, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/03/16 19:32:19 INFO common.AbstractJob: Command line arguments: {--clustering=null, --clusters=[/tmp/mahout-work-jenkins/reuters-kmeans-clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.CosineDistanceMeasure], --endPhase=[2147483647], --input=[/tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/], --maxIter=[10], --method=[mapreduce], --numClusters=[20], --output=[/tmp/mahout-work-jenkins/reuters-kmeans], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
12/03/16 19:32:20 INFO common.HadoopUtil: Deleting /tmp/mahout-work-jenkins/reuters-kmeans
12/03/16 19:32:20 INFO common.HadoopUtil: Deleting /tmp/mahout-work-jenkins/reuters-kmeans-clusters
12/03/16 19:32:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/03/16 19:32:20 INFO compress.CodecPool: Got brand-new compressor
12/03/16 19:32:21 INFO kmeans.RandomSeedGenerator: Wrote 20 vectors to /tmp/mahout-work-jenkins/reuters-kmeans-clusters/part-randomSeed
12/03/16 19:32:21 INFO kmeans.KMeansDriver: Input: /tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters In: /tmp/mahout-work-jenkins/reuters-kmeans-clusters/part-randomSeed Out: /tmp/mahout-work-jenkins/reuters-kmeans Distance: org.apache.mahout.common.distance.CosineDistanceMeasure
12/03/16 19:32:21 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
12/03/16 19:32:21 INFO kmeans.KMeansDriver: K-Means Iteration 1
12/03/16 19:32:21 INFO input.FileInputFormat: Total input paths to process : 1
12/03/16 19:32:21 INFO mapred.JobClient: Running job: job_local_0001
12/03/16 19:32:21 INFO mapred.MapTask: io.sort.mb = 100
12/03/16 19:32:21 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/16 19:32:21 INFO mapred.MapTask: record buffer = 262144/327680
12/03/16 19:32:21 INFO compress.CodecPool: Got brand-new decompressor
12/03/16 19:32:22 INFO mapred.JobClient:  map 0% reduce 0%
12/03/16 19:32:23 INFO mapred.MapTask: Starting flush of map output
12/03/16 19:32:23 INFO mapred.MapTask: Finished spill 0
12/03/16 19:32:23 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/03/16 19:32:24 INFO mapred.LocalJobRunner: 
12/03/16 19:32:24 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/03/16 19:32:24 INFO mapred.LocalJobRunner: 
12/03/16 19:32:24 INFO mapred.Merger: Merging 1 sorted segments
12/03/16 19:32:24 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1860291 bytes
12/03/16 19:32:24 INFO mapred.LocalJobRunner: 
12/03/16 19:32:24 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/03/16 19:32:24 INFO mapred.LocalJobRunner: 
12/03/16 19:32:24 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/03/16 19:32:24 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to /tmp/mahout-work-jenkins/reuters-kmeans/clusters-1
12/03/16 19:32:25 INFO mapred.JobClient:  map 100% reduce 0%
12/03/16 19:32:27 INFO mapred.LocalJobRunner: reduce > reduce
12/03/16 19:32:27 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
12/03/16 19:32:28 INFO mapred.JobClient:  map 100% reduce 100%
12/03/16 19:32:28 INFO mapred.JobClient: Job complete: job_local_0001
12/03/16 19:32:28 INFO mapred.JobClient: Counters: 17
12/03/16 19:32:28 INFO mapred.JobClient:   File Output Format Counters 
12/03/16 19:32:28 INFO mapred.JobClient:     Bytes Written=1876461
12/03/16 19:32:28 INFO mapred.JobClient:   Clustering
12/03/16 19:32:28 INFO mapred.JobClient:     Converged Clusters=9
12/03/16 19:32:28 INFO mapred.JobClient:   FileSystemCounters
12/03/16 19:32:28 INFO mapred.JobClient:     FILE_BYTES_READ=69854141
12/03/16 19:32:28 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=54372783
12/03/16 19:32:28 INFO mapred.JobClient:   File Input Format Counters 
12/03/16 19:32:28 INFO mapred.JobClient:     Bytes Read=4914503
12/03/16 19:32:28 INFO mapred.JobClient:   Map-Reduce Framework
12/03/16 19:32:28 INFO mapred.JobClient:     Reduce input groups=19
12/03/16 19:32:28 INFO mapred.JobClient:     Map output materialized bytes=1860295
12/03/16 19:32:28 INFO mapred.JobClient:     Combine output records=19
12/03/16 19:32:28 INFO mapred.JobClient:     Map input records=21578
12/03/16 19:32:28 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/16 19:32:28 INFO mapred.JobClient:     Reduce output records=19
12/03/16 19:32:28 INFO mapred.JobClient:     Spilled Records=38
12/03/16 19:32:28 INFO mapred.JobClient:     Map output bytes=8268500
12/03/16 19:32:28 INFO mapred.JobClient:     Combine input records=21578
12/03/16 19:32:28 INFO mapred.JobClient:     Map output records=21578
12/03/16 19:32:28 INFO mapred.JobClient:     SPLIT_RAW_BYTES=154
12/03/16 19:32:28 INFO mapred.JobClient:     Reduce input records=19
12/03/16 19:32:28 INFO kmeans.KMeansDriver: K-Means Iteration 2
12/03/16 19:32:28 INFO input.FileInputFormat: Total input paths to process : 1
12/03/16 19:32:28 INFO mapred.JobClient: Running job: job_local_0002
12/03/16 19:32:28 INFO mapred.MapTask: io.sort.mb = 100
12/03/16 19:32:28 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/16 19:32:28 INFO mapred.MapTask: record buffer = 262144/327680
12/03/16 19:32:29 INFO mapred.JobClient:  map 0% reduce 0%
12/03/16 19:32:30 INFO mapred.MapTask: Starting flush of map output
12/03/16 19:32:30 INFO mapred.MapTask: Finished spill 0
12/03/16 19:32:30 INFO mapred.Task: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
12/03/16 19:32:31 INFO mapred.LocalJobRunner: 
12/03/16 19:32:31 INFO mapred.Task: Task 'attempt_local_0002_m_000000_0' done.
12/03/16 19:32:31 INFO mapred.LocalJobRunner: 
12/03/16 19:32:31 INFO mapred.Merger: Merging 1 sorted segments
12/03/16 19:32:31 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 2234659 bytes
12/03/16 19:32:31 INFO mapred.LocalJobRunner: 
12/03/16 19:32:32 INFO mapred.Task: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
12/03/16 19:32:32 INFO mapred.LocalJobRunner: 
12/03/16 19:32:32 INFO mapred.Task: Task attempt_local_0002_r_000000_0 is allowed to commit now
12/03/16 19:32:32 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to /tmp/mahout-work-jenkins/reuters-kmeans/clusters-2
12/03/16 19:32:32 INFO mapred.JobClient:  map 100% reduce 0%
12/03/16 19:32:34 INFO mapred.LocalJobRunner: reduce > reduce
12/03/16 19:32:34 INFO mapred.Task: Task 'attempt_local_0002_r_000000_0' done.
12/03/16 19:32:35 INFO mapred.JobClient:  map 100% reduce 100%
12/03/16 19:32:35 INFO mapred.JobClient: Job complete: job_local_0002
12/03/16 19:32:35 INFO mapred.JobClient: Counters: 17
12/03/16 19:32:35 INFO mapred.JobClient:   File Output Format Counters 
12/03/16 19:32:35 INFO mapred.JobClient:     Bytes Written=2253801
12/03/16 19:32:35 INFO mapred.JobClient:   Clustering
12/03/16 19:32:35 INFO mapred.JobClient:     Converged Clusters=16
12/03/16 19:32:35 INFO mapred.JobClient:   FileSystemCounters
12/03/16 19:32:35 INFO mapred.JobClient:     FILE_BYTES_READ=139338762
12/03/16 19:32:35 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=111737799
12/03/16 19:32:35 INFO mapred.JobClient:   File Input Format Counters 
12/03/16 19:32:35 INFO mapred.JobClient:     Bytes Read=4914503
12/03/16 19:32:35 INFO mapred.JobClient:   Map-Reduce Framework
12/03/16 19:32:35 INFO mapred.JobClient:     Reduce input groups=19
12/03/16 19:32:35 INFO mapred.JobClient:     Map output materialized bytes=2234663
12/03/16 19:32:35 INFO mapred.JobClient:     Combine output records=19
12/03/16 19:32:35 INFO mapred.JobClient:     Map input records=21578
12/03/16 19:32:35 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/16 19:32:35 INFO mapred.JobClient:     Reduce output records=19
12/03/16 19:32:35 INFO mapred.JobClient:     Spilled Records=38
12/03/16 19:32:35 INFO mapred.JobClient:     Map output bytes=8268500
12/03/16 19:32:35 INFO mapred.JobClient:     Combine input records=21578
12/03/16 19:32:35 INFO mapred.JobClient:     Map output records=21578
12/03/16 19:32:35 INFO mapred.JobClient:     SPLIT_RAW_BYTES=154
12/03/16 19:32:35 INFO mapred.JobClient:     Reduce input records=19
12/03/16 19:32:35 INFO kmeans.KMeansDriver: K-Means Iteration 3
12/03/16 19:32:36 INFO input.FileInputFormat: Total input paths to process : 1
12/03/16 19:32:36 INFO mapred.JobClient: Running job: job_local_0003
12/03/16 19:32:36 INFO mapred.MapTask: io.sort.mb = 100
12/03/16 19:32:36 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/16 19:32:36 INFO mapred.MapTask: record buffer = 262144/327680
12/03/16 19:32:37 INFO mapred.JobClient:  map 0% reduce 0%
12/03/16 19:32:37 INFO mapred.MapTask: Starting flush of map output
12/03/16 19:32:37 INFO mapred.MapTask: Finished spill 0
12/03/16 19:32:37 INFO mapred.Task: Task:attempt_local_0003_m_000000_0 is done. And is in the process of commiting
12/03/16 19:32:39 INFO mapred.LocalJobRunner: 
12/03/16 19:32:39 INFO mapred.Task: Task 'attempt_local_0003_m_000000_0' done.
12/03/16 19:32:39 INFO mapred.LocalJobRunner: 
12/03/16 19:32:39 INFO mapred.Merger: Merging 1 sorted segments
12/03/16 19:32:39 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 2625154 bytes
12/03/16 19:32:39 INFO mapred.LocalJobRunner: 
12/03/16 19:32:39 INFO mapred.Task: Task:attempt_local_0003_r_000000_0 is done. And is in the process of commiting
12/03/16 19:32:39 INFO mapred.LocalJobRunner: 
12/03/16 19:32:39 INFO mapred.Task: Task attempt_local_0003_r_000000_0 is allowed to commit now
12/03/16 19:32:39 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0003_r_000000_0' to /tmp/mahout-work-jenkins/reuters-kmeans/clusters-3
12/03/16 19:32:40 INFO mapred.JobClient:  map 100% reduce 0%
12/03/16 19:32:42 INFO mapred.LocalJobRunner: reduce > reduce
12/03/16 19:32:42 INFO mapred.Task: Task 'attempt_local_0003_r_000000_0' done.
12/03/16 19:32:43 INFO mapred.JobClient:  map 100% reduce 100%
12/03/16 19:32:43 INFO mapred.JobClient: Job complete: job_local_0003
12/03/16 19:32:43 INFO mapred.JobClient: Counters: 17
12/03/16 19:32:43 INFO mapred.JobClient:   File Output Format Counters 
12/03/16 19:32:43 INFO mapred.JobClient:     Bytes Written=2647343
12/03/16 19:32:43 INFO mapred.JobClient:   Clustering
12/03/16 19:32:43 INFO mapred.JobClient:     Converged Clusters=19
12/03/16 19:32:43 INFO mapred.JobClient:   FileSystemCounters
12/03/16 19:32:43 INFO mapred.JobClient:     FILE_BYTES_READ=214762511
12/03/16 19:32:43 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=170654687
12/03/16 19:32:43 INFO mapred.JobClient:   File Input Format Counters 
12/03/16 19:32:43 INFO mapred.JobClient:     Bytes Read=4914503
12/03/16 19:32:43 INFO mapred.JobClient:   Map-Reduce Framework
12/03/16 19:32:43 INFO mapred.JobClient:     Reduce input groups=19
12/03/16 19:32:43 INFO mapred.JobClient:     Map output materialized bytes=2625158
12/03/16 19:32:43 INFO mapred.JobClient:     Combine output records=19
12/03/16 19:32:43 INFO mapred.JobClient:     Map input records=21578
12/03/16 19:32:43 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/16 19:32:43 INFO mapred.JobClient:     Reduce output records=19
12/03/16 19:32:43 INFO mapred.JobClient:     Spilled Records=38
12/03/16 19:32:43 INFO mapred.JobClient:     Map output bytes=8268500
12/03/16 19:32:43 INFO mapred.JobClient:     Combine input records=21578
12/03/16 19:32:43 INFO mapred.JobClient:     Map output records=21578
12/03/16 19:32:43 INFO mapred.JobClient:     SPLIT_RAW_BYTES=154
12/03/16 19:32:43 INFO mapred.JobClient:     Reduce input records=19
12/03/16 19:32:43 INFO kmeans.KMeansDriver: Clustering data
12/03/16 19:32:43 INFO kmeans.KMeansDriver: Running Clustering
12/03/16 19:32:43 INFO kmeans.KMeansDriver: Input: /tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters In: /tmp/mahout-work-jenkins/reuters-kmeans/clusters-3-final Out: /tmp/mahout-work-jenkins/reuters-kmeans Distance: org.apache.mahout.common.distance.CosineDistanceMeasure@24de7d
12/03/16 19:32:43 INFO kmeans.KMeansDriver: convergence: 0.5 Input Vectors: org.apache.mahout.math.VectorWritable
12/03/16 19:32:43 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/16 19:32:43 INFO input.FileInputFormat: Total input paths to process : 1
12/03/16 19:32:43 INFO mapred.JobClient: Running job: job_local_0004
12/03/16 19:32:43 WARN mapred.LocalJobRunner: job_local_0004
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
	at org.apache.mahout.clustering.classify.ClusterClassificationMapper.map(ClusterClassificationMapper.java:50)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
12/03/16 19:32:44 INFO mapred.JobClient:  map 0% reduce 0%
12/03/16 19:32:44 INFO mapred.JobClient: Job complete: job_local_0004
12/03/16 19:32:44 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.lang.InterruptedException: Cluster Classification Driver Job failed processing /tmp/mahout-work-jenkins/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
	at org.apache.mahout.clustering.classify.ClusterClassificationDriver.classifyClusterMR(ClusterClassificationDriver.java:307)
	at org.apache.mahout.clustering.classify.ClusterClassificationDriver.run(ClusterClassificationDriver.java:141)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.clusterData(KMeansDriver.java:435)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:159)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:114)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:63)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
Build step 'Execute shell' marked build as failure

Re: Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Isabel Drost <is...@apache.org>.

On Tue, Mar 20, 2012 at 8:20 AM, Isabel Drost <is...@apache.org> wrote:
> On 20.03.2012 Paritosh Ranjan wrote:
>> Is this info about the builds documented somewhere? If the information
>> about the builds is not documented anywhere, then I would like to
>> add it to the mahout wiki/site/somewhere, i.e. which builds we
>> have, which one is needed for what purpose, and any other rules for them.
>> If it is already documented, can you please share that link?
>
> I know of no such link - having that documentation, including links on how to
> get access to Jenkins, would be useful.

Sometimes fresh air helps with remembering:
https://cwiki.apache.org/confluence/display/MAHOUT/Developer+Resources
is the hub page to developer documentation. Also check
http://tinyurl.com/7uh6tcd for some more information relevant to
committers only.

Isabel

Re: Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Isabel Drost <is...@apache.org>.

On 20.03.2012 Paritosh Ranjan wrote:
> I don't have the privilege to view/edit the Jenkins job configuration,
> so I cannot see the command from there.

http://wiki.apache.org/general/Jenkins <- explains how to get access as a 
committer.

> I am not able to figure out the command from the console. Can you help
> me by telling me the command (for running the
> Mahout-Examples-Cluster-Reuters build)?

On the failed job, click on "Console Output", then click on "Full log" at the
top. There I find

 ./examples/bin/cluster-reuters.sh 1
 ./examples/bin/cluster-reuters.sh 2

> Can you also help by clarifying the protocols on the builds, i.e. which
> builds to test before committing? How much time is allowed to fix
> failing builds (how much time for which build), or is there any rule
> like this?

There are no hard and fast rules for that. General rule of thumb: Don't break 
the build.

> Is it necessary to run/build Mahout-Examples-Cluster-Reuters (along
> with Mahout Quality, i.e. mvn clean install on trunk) before
> committing, and if yes, is it common practice?

Personally, I at least run mvn clean install; everything else depends on which
parts of the project I changed.

> Is this info about the builds documented somewhere? If the information
> about the builds is not documented anywhere, then I would like to
> add it to the mahout wiki/site/somewhere, i.e. which builds we
> have, which one is needed for what purpose, and any other rules for them.
> If it is already documented, can you please share that link?

I know of no such link - having that documentation, including links on how to
get access to Jenkins, would be useful.

We have gone over build issues multiple times in the past - it might make sense 
for you to also check the mailing list archives for more information on the 
history of what you find in place right now.

Isabel

Re: Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Paritosh Ranjan <pr...@xebia.com>.

I don't have the privilege to view/edit the Jenkins job configuration,
so I cannot see the command from there.
I am not able to figure out the command from the console. Can you help
me by telling me the command (for running the
Mahout-Examples-Cluster-Reuters build)?

Can you also help by clarifying the protocols on the builds, i.e. which
builds to test before committing? How much time is allowed to fix
failing builds (how much time for which build), or is there any rule
like this?

Is it necessary to run/build Mahout-Examples-Cluster-Reuters (along
with Mahout Quality, i.e. mvn clean install on trunk) before
committing, and if yes, is it common practice?

Is this info about the builds documented somewhere? If the information
about the builds is not documented anywhere, then I would like to
add it to the mahout wiki/site/somewhere, i.e. which builds we
have, which one is needed for what purpose, and any other rules for them.
If it is already documented, can you please share that link?

I don't think Jenkins is behaving differently than my local box.

Paritosh

On 20-03-2012 02:51, Isabel Drost wrote:
> On 17.03.2012 Paritosh Ranjan wrote:
>> Is there any way to test this build before commit? The trunk is building
>> successfully and till now, that's all I check before commit. How do I
>> test this build before commit?
> As far as I know "all" you have to do is to use the same commands that Jenkins
> uses to trigger the build*. I don't know off the top of my head which options it
> uses; you should be able to find out by going to Jenkins, selecting our build
> and clicking configure (you need to be logged into Jenkins for that) or
> selecting the last build that failed and looking at its console output.
>
> From just briefly scanning the output it looks like after successfully
> triggering the maven build it fails when executing the cluster-reuters.sh
> script.
>
>
> Isabel
>
> * Obvious other culprits for Jenkins behaving differently than your local box are
> different network settings, different maven versions, different settings.xml,
> different java version, screwed up local maven repositories on either side and
> such - however I don't think any of those is particularly likely in this
> case.

Re: Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Isabel Drost <is...@apache.org>.

On 17.03.2012 Paritosh Ranjan wrote:
> Is there any way to test this build before commit? The trunk is building
> successfully and till now, that's all I check before commit. How do I
> test this build before commit?

As far as I know "all" you have to do is to use the same commands that Jenkins
uses to trigger the build*. I don't know off the top of my head which options it
uses; you should be able to find out by going to Jenkins, selecting our build
and clicking configure (you need to be logged into Jenkins for that) or
selecting the last build that failed and looking at its console output.

From just briefly scanning the output it looks like after successfully
triggering the maven build it fails when executing the cluster-reuters.sh
script.

Isabel

* Obvious other culprits for Jenkins behaving differently than your local box are
different network settings, different maven versions, different settings.xml,
different java version, screwed up local maven repositories on either side and
such - however I don't think any of those is particularly likely in this
case.

Re: Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Paritosh Ranjan <pr...@xebia.com>.

I have found the error.

It should be
ClusterClassificationMapper extends Mapper<WritableComparable<?>

instead of
ClusterClassificationMapper extends Mapper<IntWritable>

I will run the tests and commit it as soon as I get time.
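
To illustrate why the declared key type matters, here is a minimal, self-contained sketch. It does not use the real Hadoop or Mahout classes: Mapper, IntWritable, Text and WritableComparable below are simplified stand-ins, and StrictMapper, LenientMapper and run are hypothetical names invented for this example. It only mimics the situation in the stack trace above, where keys of one type reach a mapper that was declared for another.

```java
public class KeyTypeSketch {

    // Simplified stand-ins for the Hadoop key types (NOT the real org.apache.hadoop.io classes).
    interface WritableComparable<T> extends Comparable<T> {}

    static final class IntWritable implements WritableComparable<IntWritable> {
        final int value;
        IntWritable(int value) { this.value = value; }
        @Override public int compareTo(IntWritable other) { return Integer.compare(value, other.value); }
    }

    static final class Text implements WritableComparable<Text> {
        final String value;
        Text(String value) { this.value = value; }
        @Override public int compareTo(Text other) { return value.compareTo(other.value); }
    }

    // Stand-in for a mapper base class that is generic in its input key type.
    abstract static class Mapper<K> {
        abstract String map(K key);
    }

    // Hypothetical mapper that insists on IntWritable keys.
    static final class StrictMapper extends Mapper<IntWritable> {
        @Override String map(IntWritable key) { return "int key " + key.value; }
    }

    // Hypothetical mapper that accepts any comparable key type.
    static final class LenientMapper extends Mapper<WritableComparable<?>> {
        @Override String map(WritableComparable<?> key) {
            return "key of type " + key.getClass().getSimpleName();
        }
    }

    // Like a framework, this calls the mapper through its raw type, so a key
    // type mismatch is only detected at run time, not at compile time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String run(Mapper mapper, Object key) {
        return mapper.map(key);
    }

    public static void main(String[] args) {
        // Works: the lenient mapper accepts Text keys.
        System.out.println(run(new LenientMapper(), new Text("doc-1")));
        // Works: the strict mapper gets the key type it was declared for.
        System.out.println(run(new StrictMapper(), new IntWritable(7)));
        try {
            // Fails: the strict mapper casts the Text key to IntWritable.
            run(new StrictMapper(), new Text("doc-1"));
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: Text key passed to a mapper typed on IntWritable");
        }
    }
}
```

Because generics are erased at run time, the mismatch compiles without error and only shows up as a ClassCastException when the first record is mapped, which would explain why a plain build of trunk can pass while the Reuters example job fails.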

On 18-03-2012 01:11, Paritosh Ranjan wrote:
> Is there any way to test this build before commit? The trunk is 
> building successfully and till now, that's all I check before commit. 
> How do I test this build before commit?
>
> PS: I have to attend a conference for the next two days, so it is very
> difficult for me to look into this issue during that time. I will
> work on fixing it as soon as I get time. Sorry for the inconvenience.
> If someone else wants to look into it, then the hint would be: the
> input vectors for clustering have Text keys, while the
> ClusterClassificationDriver expects IntWritable.
>
> Paritosh
>
> On 18-03-2012 00:50, Apache Jenkins Server wrote:
>> See <https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/74/changes>
>>
>>
>> Changes:
>>
>> [pranjan] MAHOUT-981, Added outlier removal option in method and CLI 
>> for KMeansDriver.
>>
>> [pranjan] MAHOUT-981, MAHOUT-983. Fixing test cases which fail 
>> intermittently.
>> Build is passing on my machine ( even for the last commit ).
>> Tried to identify all test cases, which can fail intermittently and 
>> fixed them.
>>
>> ------------------------------------------
>> [...truncated 6380 lines...]
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Map input records=21578
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Reduce shuffle bytes=0
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Reduce output records=21578
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Spilled Records=43156
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Map output bytes=17337483
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Combine input records=0
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Map output records=21578
>> 12/03/17 19:19:55 INFO mapred.JobClient:     SPLIT_RAW_BYTES=150
>> 12/03/17 19:19:55 INFO mapred.JobClient:     Reduce input records=21578
>> 12/03/17 19:19:55 INFO common.HadoopUtil: Deleting 
>> /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
>> 12/03/17 19:19:55 INFO input.FileInputFormat: Total input paths to 
>> process : 1
>> 12/03/17 19:19:55 INFO mapred.JobClient: Running job: job_local_0007
>> 12/03/17 19:19:55 INFO mapred.MapTask: io.sort.mb = 100
>> 12/03/17 19:19:55 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 12/03/17 19:19:55 INFO mapred.MapTask: record buffer = 262144/327680
>> 12/03/17 19:19:56 INFO mapred.MapTask: Starting flush of map output
>> 12/03/17 19:19:56 INFO mapred.MapTask: Finished spill 0
>> 12/03/17 19:19:56 INFO mapred.Task: 
>> Task:attempt_local_0007_m_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:19:56 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/17 19:19:58 INFO mapred.LocalJobRunner:
>> 12/03/17 19:19:58 INFO mapred.Task: Task 
>> 'attempt_local_0007_m_000000_0' done.
>> 12/03/17 19:19:58 INFO mapred.LocalJobRunner:
>> 12/03/17 19:19:58 INFO mapred.Merger: Merging 1 sorted segments
>> 12/03/17 19:19:58 INFO mapred.Merger: Down to the last merge-pass, 
>> with 1 segments left of total size: 4719200 bytes
>> 12/03/17 19:19:58 INFO mapred.LocalJobRunner:
>> 12/03/17 19:19:59 INFO mapred.Task: 
>> Task:attempt_local_0007_r_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:19:59 INFO mapred.LocalJobRunner:
>> 12/03/17 19:19:59 INFO mapred.Task: Task 
>> attempt_local_0007_r_000000_0 is allowed to commit now
>> 12/03/17 19:19:59 INFO output.FileOutputCommitter: Saved output of 
>> task 'attempt_local_0007_r_000000_0' to 
>> /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
>> 12/03/17 19:19:59 INFO mapred.JobClient:  map 100% reduce 0%
>> 12/03/17 19:20:01 INFO mapred.LocalJobRunner: reduce > reduce
>> 12/03/17 19:20:01 INFO mapred.Task: Task 
>> 'attempt_local_0007_r_000000_0' done.
>> 12/03/17 19:20:02 INFO mapred.JobClient:  map 100% reduce 100%
>> 12/03/17 19:20:02 INFO mapred.JobClient: Job complete: job_local_0007
>> 12/03/17 19:20:02 INFO mapred.JobClient: Counters: 16
>> 12/03/17 19:20:02 INFO mapred.JobClient:   File Output Format Counters
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Bytes Written=4914503
>> 12/03/17 19:20:02 INFO mapred.JobClient:   FileSystemCounters
>> 12/03/17 19:20:02 INFO mapred.JobClient:     FILE_BYTES_READ=685549490
>> 12/03/17 19:20:02 INFO mapred.JobClient:     
>> FILE_BYTES_WRITTEN=595234075
>> 12/03/17 19:20:02 INFO mapred.JobClient:   File Input Format Counters
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Bytes Read=4914503
>> 12/03/17 19:20:02 INFO mapred.JobClient:   Map-Reduce Framework
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce input groups=21578
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Map output materialized 
>> bytes=4719204
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Combine output records=0
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Map input records=21578
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce shuffle bytes=0
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce output records=21578
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Spilled Records=43156
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Map output bytes=4659281
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Combine input records=0
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Map output records=21578
>> 12/03/17 19:20:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=157
>> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce input records=21578
>> 12/03/17 19:20:02 INFO common.HadoopUtil: Deleting 
>> /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
>> 12/03/17 19:20:02 INFO driver.MahoutDriver: Program took 78814 ms 
>> (Minutes: 1.3135666666666668)
>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>> no HADOOP_HOME set, running locally
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in 
>> [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
>> SLF4J: Found binding in 
>> [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
>> SLF4J: Found binding in 
>> [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
>> explanation.
>> 12/03/17 19:20:03 INFO common.AbstractJob: Command line arguments: 
>> {--clustering=null, 
>> --clusters=[/tmp/mahout-work-hudson/reuters-kmeans-clusters], 
>> --convergenceDelta=[0.5], 
>> --distanceMeasure=[org.apache.mahout.common.distance.CosineDistanceMeasure], 
>> --endPhase=[2147483647], 
>> --input=[/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/], 
>> --maxIter=[10], --method=[mapreduce], --numClusters=[20], 
>> --output=[/tmp/mahout-work-hudson/reuters-kmeans], --overwrite=null, 
>> --startPhase=[0], --tempDir=[temp]}
>> 12/03/17 19:20:03 INFO common.HadoopUtil: Deleting 
>> /tmp/mahout-work-hudson/reuters-kmeans
>> 12/03/17 19:20:03 INFO common.HadoopUtil: Deleting 
>> /tmp/mahout-work-hudson/reuters-kmeans-clusters
>> 12/03/17 19:20:03 WARN util.NativeCodeLoader: Unable to load 
>> native-hadoop library for your platform... using builtin-java classes 
>> where applicable
>> 12/03/17 19:20:03 INFO compress.CodecPool: Got brand-new compressor
>> 12/03/17 19:20:04 INFO kmeans.RandomSeedGenerator: Wrote 20 vectors 
>> to /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed
>> 12/03/17 19:20:04 INFO kmeans.KMeansDriver: Input: 
>> /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters 
>> In: /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed 
>> Out: /tmp/mahout-work-hudson/reuters-kmeans Distance: 
>> org.apache.mahout.common.distance.CosineDistanceMeasure
>> 12/03/17 19:20:04 INFO kmeans.KMeansDriver: convergence: 0.5 max 
>> Iterations: 10 num Reduce Tasks: 
>> org.apache.mahout.math.VectorWritable Input Vectors: {}
>> 12/03/17 19:20:04 INFO kmeans.KMeansDriver: K-Means Iteration 1
>> 12/03/17 19:20:05 INFO input.FileInputFormat: Total input paths to 
>> process : 1
>> 12/03/17 19:20:05 INFO mapred.JobClient: Running job: job_local_0001
>> 12/03/17 19:20:05 INFO mapred.MapTask: io.sort.mb = 100
>> 12/03/17 19:20:05 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 12/03/17 19:20:05 INFO mapred.MapTask: record buffer = 262144/327680
>> 12/03/17 19:20:05 INFO compress.CodecPool: Got brand-new decompressor
>> 12/03/17 19:20:06 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/17 19:20:06 INFO mapred.MapTask: Starting flush of map output
>> 12/03/17 19:20:07 INFO mapred.MapTask: Finished spill 0
>> 12/03/17 19:20:07 INFO mapred.Task: 
>> Task:attempt_local_0001_m_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:08 INFO mapred.Task: Task 
>> 'attempt_local_0001_m_000000_0' done.
>> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:08 INFO mapred.Merger: Merging 1 sorted segments
>> 12/03/17 19:20:08 INFO mapred.Merger: Down to the last merge-pass, 
>> with 1 segments left of total size: 1822221 bytes
>> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:08 INFO mapred.Task: 
>> Task:attempt_local_0001_r_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:08 INFO mapred.Task: Task 
>> attempt_local_0001_r_000000_0 is allowed to commit now
>> 12/03/17 19:20:08 INFO output.FileOutputCommitter: Saved output of 
>> task 'attempt_local_0001_r_000000_0' to 
>> /tmp/mahout-work-hudson/reuters-kmeans/clusters-1
>> 12/03/17 19:20:09 INFO mapred.JobClient:  map 100% reduce 0%
>> 12/03/17 19:20:11 INFO mapred.LocalJobRunner: reduce>  reduce
>> 12/03/17 19:20:11 INFO mapred.Task: Task 
>> 'attempt_local_0001_r_000000_0' done.
>> 12/03/17 19:20:12 INFO mapred.JobClient:  map 100% reduce 100%
>> 12/03/17 19:20:12 INFO mapred.JobClient: Job complete: job_local_0001
>> 12/03/17 19:20:12 INFO mapred.JobClient: Counters: 17
>> 12/03/17 19:20:12 INFO mapred.JobClient:   File Output Format Counters
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Bytes Written=1838132
>> 12/03/17 19:20:12 INFO mapred.JobClient:   Clustering
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Converged Clusters=9
>> 12/03/17 19:20:12 INFO mapred.JobClient:   FileSystemCounters
>> 12/03/17 19:20:12 INFO mapred.JobClient:     FILE_BYTES_READ=69813433
>> 12/03/17 19:20:12 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=54255754
>> 12/03/17 19:20:12 INFO mapred.JobClient:   File Input Format Counters
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Bytes Read=4914503
>> 12/03/17 19:20:12 INFO mapred.JobClient:   Map-Reduce Framework
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce input groups=20
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Map output materialized 
>> bytes=1822225
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Combine output records=20
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Map input records=21578
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce shuffle bytes=0
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce output records=20
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Spilled Records=40
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Map output bytes=8268500
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Combine input records=21578
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Map output records=21578
>> 12/03/17 19:20:12 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
>> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce input records=20
>> 12/03/17 19:20:12 INFO kmeans.KMeansDriver: K-Means Iteration 2
>> 12/03/17 19:20:12 INFO input.FileInputFormat: Total input paths to 
>> process : 1
>> 12/03/17 19:20:12 INFO mapred.JobClient: Running job: job_local_0002
>> 12/03/17 19:20:12 INFO mapred.MapTask: io.sort.mb = 100
>> 12/03/17 19:20:12 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 12/03/17 19:20:12 INFO mapred.MapTask: record buffer = 262144/327680
>> 12/03/17 19:20:13 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/17 19:20:14 INFO mapred.MapTask: Starting flush of map output
>> 12/03/17 19:20:14 INFO mapred.MapTask: Finished spill 0
>> 12/03/17 19:20:14 INFO mapred.Task: 
>> Task:attempt_local_0002_m_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:15 INFO mapred.Task: Task 
>> 'attempt_local_0002_m_000000_0' done.
>> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:15 INFO mapred.Merger: Merging 1 sorted segments
>> 12/03/17 19:20:15 INFO mapred.Merger: Down to the last merge-pass, 
>> with 1 segments left of total size: 2218574 bytes
>> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:15 INFO mapred.Task: 
>> Task:attempt_local_0002_r_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:15 INFO mapred.Task: Task 
>> attempt_local_0002_r_000000_0 is allowed to commit now
>> 12/03/17 19:20:15 INFO output.FileOutputCommitter: Saved output of 
>> task 'attempt_local_0002_r_000000_0' to 
>> /tmp/mahout-work-hudson/reuters-kmeans/clusters-2
>> 12/03/17 19:20:16 INFO mapred.JobClient:  map 100% reduce 0%
>> 12/03/17 19:20:18 INFO mapred.LocalJobRunner: reduce>  reduce
>> 12/03/17 19:20:18 INFO mapred.Task: Task 
>> 'attempt_local_0002_r_000000_0' done.
>> 12/03/17 19:20:19 INFO mapred.JobClient:  map 100% reduce 100%
>> 12/03/17 19:20:19 INFO mapred.JobClient: Job complete: job_local_0002
>> 12/03/17 19:20:19 INFO mapred.JobClient: Counters: 17
>> 12/03/17 19:20:19 INFO mapred.JobClient:   File Output Format Counters
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Bytes Written=2237676
>> 12/03/17 19:20:19 INFO mapred.JobClient:   Clustering
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Converged Clusters=17
>> 12/03/17 19:20:19 INFO mapred.JobClient:   FileSystemCounters
>> 12/03/17 19:20:19 INFO mapred.JobClient:     FILE_BYTES_READ=139118294
>> 12/03/17 19:20:19 INFO mapred.JobClient:     
>> FILE_BYTES_WRITTEN=111531798
>> 12/03/17 19:20:19 INFO mapred.JobClient:   File Input Format Counters
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Bytes Read=4914503
>> 12/03/17 19:20:19 INFO mapred.JobClient:   Map-Reduce Framework
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce input groups=20
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Map output materialized 
>> bytes=2218578
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Combine output records=20
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Map input records=21578
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce shuffle bytes=0
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce output records=20
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Spilled Records=40
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Map output bytes=8268500
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Combine input records=21578
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Map output records=21578
>> 12/03/17 19:20:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
>> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce input records=20
>> 12/03/17 19:20:19 INFO kmeans.KMeansDriver: K-Means Iteration 3
>> 12/03/17 19:20:19 INFO input.FileInputFormat: Total input paths to 
>> process : 1
>> 12/03/17 19:20:19 INFO mapred.JobClient: Running job: job_local_0003
>> 12/03/17 19:20:19 INFO mapred.MapTask: io.sort.mb = 100
>> 12/03/17 19:20:19 INFO mapred.MapTask: data buffer = 79691776/99614720
>> 12/03/17 19:20:19 INFO mapred.MapTask: record buffer = 262144/327680
>> 12/03/17 19:20:20 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/17 19:20:21 INFO mapred.MapTask: Starting flush of map output
>> 12/03/17 19:20:21 INFO mapred.MapTask: Finished spill 0
>> 12/03/17 19:20:21 INFO mapred.Task: 
>> Task:attempt_local_0003_m_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:20:22 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:22 INFO mapred.Task: Task 
>> 'attempt_local_0003_m_000000_0' done.
>> 12/03/17 19:20:22 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:22 INFO mapred.Merger: Merging 1 sorted segments
>> 12/03/17 19:20:22 INFO mapred.Merger: Down to the last merge-pass, 
>> with 1 segments left of total size: 2639503 bytes
>> 12/03/17 19:20:22 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:23 INFO mapred.Task: 
>> Task:attempt_local_0003_r_000000_0 is done. And is in the process of 
>> commiting
>> 12/03/17 19:20:23 INFO mapred.LocalJobRunner:
>> 12/03/17 19:20:23 INFO mapred.Task: Task 
>> attempt_local_0003_r_000000_0 is allowed to commit now
>> 12/03/17 19:20:23 INFO output.FileOutputCommitter: Saved output of 
>> task 'attempt_local_0003_r_000000_0' to 
>> /tmp/mahout-work-hudson/reuters-kmeans/clusters-3
>> 12/03/17 19:20:23 INFO mapred.JobClient:  map 100% reduce 0%
>> 12/03/17 19:20:25 INFO mapred.LocalJobRunner: reduce>  reduce
>> 12/03/17 19:20:25 INFO mapred.Task: Task 
>> 'attempt_local_0003_r_000000_0' done.
>> 12/03/17 19:20:26 INFO mapred.JobClient:  map 100% reduce 100%
>> 12/03/17 19:20:26 INFO mapred.JobClient: Job complete: job_local_0003
>> 12/03/17 19:20:26 INFO mapred.JobClient: Counters: 17
>> 12/03/17 19:20:26 INFO mapred.JobClient:   File Output Format Counters
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Bytes Written=2661888
>> 12/03/17 19:20:26 INFO mapred.JobClient:   Clustering
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Converged Clusters=20
>> 12/03/17 19:20:26 INFO mapred.JobClient:   FileSystemCounters
>> 12/03/17 19:20:26 INFO mapred.JobClient:     FILE_BYTES_READ=214459475
>> 12/03/17 19:20:26 INFO mapred.JobClient:     
>> FILE_BYTES_WRITTEN=170473464
>> 12/03/17 19:20:26 INFO mapred.JobClient:   File Input Format Counters
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Bytes Read=4914503
>> 12/03/17 19:20:26 INFO mapred.JobClient:   Map-Reduce Framework
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce input groups=20
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Map output materialized 
>> bytes=2639507
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Combine output records=20
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Map input records=21578
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce shuffle bytes=0
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce output records=20
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Spilled Records=40
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Map output bytes=8268500
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Combine input records=21578
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Map output records=21578
>> 12/03/17 19:20:26 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
>> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce input records=20
>> 12/03/17 19:20:26 INFO kmeans.KMeansDriver: Clustering data
>> 12/03/17 19:20:26 INFO kmeans.KMeansDriver: Running Clustering
>> 12/03/17 19:20:26 INFO kmeans.KMeansDriver: Input: 
>> /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters 
>> In: /tmp/mahout-work-hudson/reuters-kmeans/clusters-3-final Out: 
>> /tmp/mahout-work-hudson/reuters-kmeans Distance: 
>> org.apache.mahout.common.distance.CosineDistanceMeasure@1b32627
>> 12/03/17 19:20:26 WARN mapred.JobClient: Use GenericOptionsParser for 
>> parsing the arguments. Applications should implement Tool for the same.
>> 12/03/17 19:20:27 INFO input.FileInputFormat: Total input paths to 
>> process : 1
>> 12/03/17 19:20:27 INFO mapred.JobClient: Running job: job_local_0004
>> 12/03/17 19:20:27 WARN mapred.LocalJobRunner: job_local_0004
>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be 
>> cast to org.apache.hadoop.io.IntWritable
>>     at 
>> org.apache.mahout.clustering.classify.ClusterClassificationMapper.map(ClusterClassificationMapper.java:50)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>     at 
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>> 12/03/17 19:20:28 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/17 19:20:28 INFO mapred.JobClient: Job complete: job_local_0004
>> 12/03/17 19:20:28 INFO mapred.JobClient: Counters: 0
>> Exception in thread "main" java.lang.InterruptedException: Cluster 
>> Classification Driver Job failed processing 
>> /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
>>     at 
>> org.apache.mahout.clustering.classify.ClusterClassificationDriver.classifyClusterMR(ClusterClassificationDriver.java:307)
>>     at 
>> org.apache.mahout.clustering.classify.ClusterClassificationDriver.run(ClusterClassificationDriver.java:141)
>>     at 
>> org.apache.mahout.clustering.kmeans.KMeansDriver.clusterData(KMeansDriver.java:457)
>>     at 
>> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:172)
>>     at 
>> org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:119)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at 
>> org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:63)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at 
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at 
>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>     at 
>> org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
>> Build step 'Execute shell' marked build as failure
>

Re: Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Paritosh Ranjan <pr...@xebia.com>.

Is there any way to test this build before committing? Trunk builds 
successfully, and until now that is all I have checked before a commit. 
How do I test this build before committing?

PS: I have to attend a conference for the next two days, so it will be 
very difficult for me to look into this issue during that time. I will 
work on fixing it as soon as I get time. Sorry for the inconvenience.
If someone else wants to look into it, the hint is: the input vectors 
for clustering have Text keys, while the ClusterClassificationDriver 
expects IntWritable keys.
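
To illustrate the hint, here is a minimal, self-contained sketch of how a 
key-type mismatch like this one can surface as a ClassCastException inside 
map(). It does not use the real Hadoop or Mahout classes: Writable, Text, 
IntWritable and Mapper below are simplified, hypothetical stand-ins, and 
TolerantMapper is only one possible way to cope with both key types, not 
necessarily the fix that belongs in ClusterClassificationMapper.

```java
public class KeyTypeMismatchSketch {

    // Simplified stand-ins for the Hadoop types (assumptions, not the real classes).
    interface Writable {}

    static final class Text implements Writable {
        final String value;
        Text(String value) { this.value = value; }
    }

    static final class IntWritable implements Writable {
        final int value;
        IntWritable(int value) { this.value = value; }
    }

    // Stand-in mapper base class. The framework side only sees the erased key
    // type, so a wrong key type is not detected until map() is entered.
    abstract static class Mapper<K extends Writable> {
        abstract String map(K key);

        @SuppressWarnings("unchecked")
        String run(Writable rawKey) {
            // Unchecked cast: nothing is verified here because K is erased.
            return map((K) rawKey);
        }
    }

    // Declares IntWritable keys, as the stack trace suggests the real mapper does.
    static final class StrictMapper extends Mapper<IntWritable> {
        @Override
        String map(IntWritable key) { return "int:" + key.value; }
    }

    // Accepts any Writable key and inspects the runtime type itself.
    static final class TolerantMapper extends Mapper<Writable> {
        @Override
        String map(Writable key) {
            if (key instanceof IntWritable) {
                return "int:" + ((IntWritable) key).value;
            }
            if (key instanceof Text) {
                return "text:" + ((Text) key).value;
            }
            return "other";
        }
    }

    // Runs a mapper on one key and reports a cast failure as a string.
    static String tryRun(Mapper<?> mapper, Writable key) {
        try {
            return mapper.run(key);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        Writable textKey = new Text("doc-1");
        System.out.println(tryRun(new StrictMapper(), textKey));            // ClassCastException
        System.out.println(tryRun(new TolerantMapper(), textKey));          // text:doc-1
        System.out.println(tryRun(new StrictMapper(), new IntWritable(7))); // int:7
    }
}
```

In this sketch the cast fails in the compiler-generated bridge method of 
StrictMapper, which would be consistent with the trace pointing at 
ClusterClassificationMapper.map rather than at the code that reads the input.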

Paritosh

On 18-03-2012 00:50, Apache Jenkins Server wrote:
> See <https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/74/changes>
>
> Changes:
>
> [pranjan] MAHOUT-981, Added outlier removal option in method and CLI for KMeansDriver.
>
> [pranjan] MAHOUT-981, MAHOUT-983. Fixing test cases which fail intermittently.
> Build is passing on my machine ( even for the last commit ).
> Tried to identify all test cases, which can fail intermittently and fixed them.
>
> ------------------------------------------
> [...truncated 6380 lines...]
> 12/03/17 19:19:55 INFO mapred.JobClient:     Map input records=21578
> 12/03/17 19:19:55 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 12/03/17 19:19:55 INFO mapred.JobClient:     Reduce output records=21578
> 12/03/17 19:19:55 INFO mapred.JobClient:     Spilled Records=43156
> 12/03/17 19:19:55 INFO mapred.JobClient:     Map output bytes=17337483
> 12/03/17 19:19:55 INFO mapred.JobClient:     Combine input records=0
> 12/03/17 19:19:55 INFO mapred.JobClient:     Map output records=21578
> 12/03/17 19:19:55 INFO mapred.JobClient:     SPLIT_RAW_BYTES=150
> 12/03/17 19:19:55 INFO mapred.JobClient:     Reduce input records=21578
> 12/03/17 19:19:55 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
> 12/03/17 19:19:55 INFO input.FileInputFormat: Total input paths to process : 1
> 12/03/17 19:19:55 INFO mapred.JobClient: Running job: job_local_0007
> 12/03/17 19:19:55 INFO mapred.MapTask: io.sort.mb = 100
> 12/03/17 19:19:55 INFO mapred.MapTask: data buffer = 79691776/99614720
> 12/03/17 19:19:55 INFO mapred.MapTask: record buffer = 262144/327680
> 12/03/17 19:19:56 INFO mapred.MapTask: Starting flush of map output
> 12/03/17 19:19:56 INFO mapred.MapTask: Finished spill 0
> 12/03/17 19:19:56 INFO mapred.Task: Task:attempt_local_0007_m_000000_0 is done. And is in the process of commiting
> 12/03/17 19:19:56 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/17 19:19:58 INFO mapred.LocalJobRunner:
> 12/03/17 19:19:58 INFO mapred.Task: Task 'attempt_local_0007_m_000000_0' done.
> 12/03/17 19:19:58 INFO mapred.LocalJobRunner:
> 12/03/17 19:19:58 INFO mapred.Merger: Merging 1 sorted segments
> 12/03/17 19:19:58 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 4719200 bytes
> 12/03/17 19:19:58 INFO mapred.LocalJobRunner:
> 12/03/17 19:19:59 INFO mapred.Task: Task:attempt_local_0007_r_000000_0 is done. And is in the process of commiting
> 12/03/17 19:19:59 INFO mapred.LocalJobRunner:
> 12/03/17 19:19:59 INFO mapred.Task: Task attempt_local_0007_r_000000_0 is allowed to commit now
> 12/03/17 19:19:59 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0007_r_000000_0' to /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
> 12/03/17 19:19:59 INFO mapred.JobClient:  map 100% reduce 0%
> 12/03/17 19:20:01 INFO mapred.LocalJobRunner: reduce>  reduce
> 12/03/17 19:20:01 INFO mapred.Task: Task 'attempt_local_0007_r_000000_0' done.
> 12/03/17 19:20:02 INFO mapred.JobClient:  map 100% reduce 100%
> 12/03/17 19:20:02 INFO mapred.JobClient: Job complete: job_local_0007
> 12/03/17 19:20:02 INFO mapred.JobClient: Counters: 16
> 12/03/17 19:20:02 INFO mapred.JobClient:   File Output Format Counters
> 12/03/17 19:20:02 INFO mapred.JobClient:     Bytes Written=4914503
> 12/03/17 19:20:02 INFO mapred.JobClient:   FileSystemCounters
> 12/03/17 19:20:02 INFO mapred.JobClient:     FILE_BYTES_READ=685549490
> 12/03/17 19:20:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=595234075
> 12/03/17 19:20:02 INFO mapred.JobClient:   File Input Format Counters
> 12/03/17 19:20:02 INFO mapred.JobClient:     Bytes Read=4914503
> 12/03/17 19:20:02 INFO mapred.JobClient:   Map-Reduce Framework
> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce input groups=21578
> 12/03/17 19:20:02 INFO mapred.JobClient:     Map output materialized bytes=4719204
> 12/03/17 19:20:02 INFO mapred.JobClient:     Combine output records=0
> 12/03/17 19:20:02 INFO mapred.JobClient:     Map input records=21578
> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce output records=21578
> 12/03/17 19:20:02 INFO mapred.JobClient:     Spilled Records=43156
> 12/03/17 19:20:02 INFO mapred.JobClient:     Map output bytes=4659281
> 12/03/17 19:20:02 INFO mapred.JobClient:     Combine input records=0
> 12/03/17 19:20:02 INFO mapred.JobClient:     Map output records=21578
> 12/03/17 19:20:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=157
> 12/03/17 19:20:02 INFO mapred.JobClient:     Reduce input records=21578
> 12/03/17 19:20:02 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
> 12/03/17 19:20:02 INFO driver.MahoutDriver: Program took 78814 ms (Minutes: 1.3135666666666668)
> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> no HADOOP_HOME set, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
> SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
> SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> 12/03/17 19:20:03 INFO common.AbstractJob: Command line arguments: {--clustering=null, --clusters=[/tmp/mahout-work-hudson/reuters-kmeans-clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.CosineDistanceMeasure], --endPhase=[2147483647], --input=[/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/], --maxIter=[10], --method=[mapreduce], --numClusters=[20], --output=[/tmp/mahout-work-hudson/reuters-kmeans], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
> 12/03/17 19:20:03 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-kmeans
> 12/03/17 19:20:03 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-kmeans-clusters
> 12/03/17 19:20:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 12/03/17 19:20:03 INFO compress.CodecPool: Got brand-new compressor
> 12/03/17 19:20:04 INFO kmeans.RandomSeedGenerator: Wrote 20 vectors to /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed
> 12/03/17 19:20:04 INFO kmeans.KMeansDriver: Input: /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters In: /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed Out: /tmp/mahout-work-hudson/reuters-kmeans Distance: org.apache.mahout.common.distance.CosineDistanceMeasure
> 12/03/17 19:20:04 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
> 12/03/17 19:20:04 INFO kmeans.KMeansDriver: K-Means Iteration 1
> 12/03/17 19:20:05 INFO input.FileInputFormat: Total input paths to process : 1
> 12/03/17 19:20:05 INFO mapred.JobClient: Running job: job_local_0001
> 12/03/17 19:20:05 INFO mapred.MapTask: io.sort.mb = 100
> 12/03/17 19:20:05 INFO mapred.MapTask: data buffer = 79691776/99614720
> 12/03/17 19:20:05 INFO mapred.MapTask: record buffer = 262144/327680
> 12/03/17 19:20:05 INFO compress.CodecPool: Got brand-new decompressor
> 12/03/17 19:20:06 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/17 19:20:06 INFO mapred.MapTask: Starting flush of map output
> 12/03/17 19:20:07 INFO mapred.MapTask: Finished spill 0
> 12/03/17 19:20:07 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:08 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:08 INFO mapred.Merger: Merging 1 sorted segments
> 12/03/17 19:20:08 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1822221 bytes
> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:08 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
> 12/03/17 19:20:08 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:08 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
> 12/03/17 19:20:08 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to /tmp/mahout-work-hudson/reuters-kmeans/clusters-1
> 12/03/17 19:20:09 INFO mapred.JobClient:  map 100% reduce 0%
> 12/03/17 19:20:11 INFO mapred.LocalJobRunner: reduce>  reduce
> 12/03/17 19:20:11 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
> 12/03/17 19:20:12 INFO mapred.JobClient:  map 100% reduce 100%
> 12/03/17 19:20:12 INFO mapred.JobClient: Job complete: job_local_0001
> 12/03/17 19:20:12 INFO mapred.JobClient: Counters: 17
> 12/03/17 19:20:12 INFO mapred.JobClient:   File Output Format Counters
> 12/03/17 19:20:12 INFO mapred.JobClient:     Bytes Written=1838132
> 12/03/17 19:20:12 INFO mapred.JobClient:   Clustering
> 12/03/17 19:20:12 INFO mapred.JobClient:     Converged Clusters=9
> 12/03/17 19:20:12 INFO mapred.JobClient:   FileSystemCounters
> 12/03/17 19:20:12 INFO mapred.JobClient:     FILE_BYTES_READ=69813433
> 12/03/17 19:20:12 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=54255754
> 12/03/17 19:20:12 INFO mapred.JobClient:   File Input Format Counters
> 12/03/17 19:20:12 INFO mapred.JobClient:     Bytes Read=4914503
> 12/03/17 19:20:12 INFO mapred.JobClient:   Map-Reduce Framework
> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce input groups=20
> 12/03/17 19:20:12 INFO mapred.JobClient:     Map output materialized bytes=1822225
> 12/03/17 19:20:12 INFO mapred.JobClient:     Combine output records=20
> 12/03/17 19:20:12 INFO mapred.JobClient:     Map input records=21578
> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce output records=20
> 12/03/17 19:20:12 INFO mapred.JobClient:     Spilled Records=40
> 12/03/17 19:20:12 INFO mapred.JobClient:     Map output bytes=8268500
> 12/03/17 19:20:12 INFO mapred.JobClient:     Combine input records=21578
> 12/03/17 19:20:12 INFO mapred.JobClient:     Map output records=21578
> 12/03/17 19:20:12 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
> 12/03/17 19:20:12 INFO mapred.JobClient:     Reduce input records=20
> 12/03/17 19:20:12 INFO kmeans.KMeansDriver: K-Means Iteration 2
> 12/03/17 19:20:12 INFO input.FileInputFormat: Total input paths to process : 1
> 12/03/17 19:20:12 INFO mapred.JobClient: Running job: job_local_0002
> 12/03/17 19:20:12 INFO mapred.MapTask: io.sort.mb = 100
> 12/03/17 19:20:12 INFO mapred.MapTask: data buffer = 79691776/99614720
> 12/03/17 19:20:12 INFO mapred.MapTask: record buffer = 262144/327680
> 12/03/17 19:20:13 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/17 19:20:14 INFO mapred.MapTask: Starting flush of map output
> 12/03/17 19:20:14 INFO mapred.MapTask: Finished spill 0
> 12/03/17 19:20:14 INFO mapred.Task: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:15 INFO mapred.Task: Task 'attempt_local_0002_m_000000_0' done.
> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:15 INFO mapred.Merger: Merging 1 sorted segments
> 12/03/17 19:20:15 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 2218574 bytes
> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:15 INFO mapred.Task: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
> 12/03/17 19:20:15 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:15 INFO mapred.Task: Task attempt_local_0002_r_000000_0 is allowed to commit now
> 12/03/17 19:20:15 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to /tmp/mahout-work-hudson/reuters-kmeans/clusters-2
> 12/03/17 19:20:16 INFO mapred.JobClient:  map 100% reduce 0%
> 12/03/17 19:20:18 INFO mapred.LocalJobRunner: reduce>  reduce
> 12/03/17 19:20:18 INFO mapred.Task: Task 'attempt_local_0002_r_000000_0' done.
> 12/03/17 19:20:19 INFO mapred.JobClient:  map 100% reduce 100%
> 12/03/17 19:20:19 INFO mapred.JobClient: Job complete: job_local_0002
> 12/03/17 19:20:19 INFO mapred.JobClient: Counters: 17
> 12/03/17 19:20:19 INFO mapred.JobClient:   File Output Format Counters
> 12/03/17 19:20:19 INFO mapred.JobClient:     Bytes Written=2237676
> 12/03/17 19:20:19 INFO mapred.JobClient:   Clustering
> 12/03/17 19:20:19 INFO mapred.JobClient:     Converged Clusters=17
> 12/03/17 19:20:19 INFO mapred.JobClient:   FileSystemCounters
> 12/03/17 19:20:19 INFO mapred.JobClient:     FILE_BYTES_READ=139118294
> 12/03/17 19:20:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=111531798
> 12/03/17 19:20:19 INFO mapred.JobClient:   File Input Format Counters
> 12/03/17 19:20:19 INFO mapred.JobClient:     Bytes Read=4914503
> 12/03/17 19:20:19 INFO mapred.JobClient:   Map-Reduce Framework
> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce input groups=20
> 12/03/17 19:20:19 INFO mapred.JobClient:     Map output materialized bytes=2218578
> 12/03/17 19:20:19 INFO mapred.JobClient:     Combine output records=20
> 12/03/17 19:20:19 INFO mapred.JobClient:     Map input records=21578
> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce output records=20
> 12/03/17 19:20:19 INFO mapred.JobClient:     Spilled Records=40
> 12/03/17 19:20:19 INFO mapred.JobClient:     Map output bytes=8268500
> 12/03/17 19:20:19 INFO mapred.JobClient:     Combine input records=21578
> 12/03/17 19:20:19 INFO mapred.JobClient:     Map output records=21578
> 12/03/17 19:20:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
> 12/03/17 19:20:19 INFO mapred.JobClient:     Reduce input records=20
> 12/03/17 19:20:19 INFO kmeans.KMeansDriver: K-Means Iteration 3
> 12/03/17 19:20:19 INFO input.FileInputFormat: Total input paths to process : 1
> 12/03/17 19:20:19 INFO mapred.JobClient: Running job: job_local_0003
> 12/03/17 19:20:19 INFO mapred.MapTask: io.sort.mb = 100
> 12/03/17 19:20:19 INFO mapred.MapTask: data buffer = 79691776/99614720
> 12/03/17 19:20:19 INFO mapred.MapTask: record buffer = 262144/327680
> 12/03/17 19:20:20 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/17 19:20:21 INFO mapred.MapTask: Starting flush of map output
> 12/03/17 19:20:21 INFO mapred.MapTask: Finished spill 0
> 12/03/17 19:20:21 INFO mapred.Task: Task:attempt_local_0003_m_000000_0 is done. And is in the process of commiting
> 12/03/17 19:20:22 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:22 INFO mapred.Task: Task 'attempt_local_0003_m_000000_0' done.
> 12/03/17 19:20:22 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:22 INFO mapred.Merger: Merging 1 sorted segments
> 12/03/17 19:20:22 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 2639503 bytes
> 12/03/17 19:20:22 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:23 INFO mapred.Task: Task:attempt_local_0003_r_000000_0 is done. And is in the process of commiting
> 12/03/17 19:20:23 INFO mapred.LocalJobRunner:
> 12/03/17 19:20:23 INFO mapred.Task: Task attempt_local_0003_r_000000_0 is allowed to commit now
> 12/03/17 19:20:23 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0003_r_000000_0' to /tmp/mahout-work-hudson/reuters-kmeans/clusters-3
> 12/03/17 19:20:23 INFO mapred.JobClient:  map 100% reduce 0%
> 12/03/17 19:20:25 INFO mapred.LocalJobRunner: reduce > reduce
> 12/03/17 19:20:25 INFO mapred.Task: Task 'attempt_local_0003_r_000000_0' done.
> 12/03/17 19:20:26 INFO mapred.JobClient:  map 100% reduce 100%
> 12/03/17 19:20:26 INFO mapred.JobClient: Job complete: job_local_0003
> 12/03/17 19:20:26 INFO mapred.JobClient: Counters: 17
> 12/03/17 19:20:26 INFO mapred.JobClient:   File Output Format Counters
> 12/03/17 19:20:26 INFO mapred.JobClient:     Bytes Written=2661888
> 12/03/17 19:20:26 INFO mapred.JobClient:   Clustering
> 12/03/17 19:20:26 INFO mapred.JobClient:     Converged Clusters=20
> 12/03/17 19:20:26 INFO mapred.JobClient:   FileSystemCounters
> 12/03/17 19:20:26 INFO mapred.JobClient:     FILE_BYTES_READ=214459475
> 12/03/17 19:20:26 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=170473464
> 12/03/17 19:20:26 INFO mapred.JobClient:   File Input Format Counters
> 12/03/17 19:20:26 INFO mapred.JobClient:     Bytes Read=4914503
> 12/03/17 19:20:26 INFO mapred.JobClient:   Map-Reduce Framework
> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce input groups=20
> 12/03/17 19:20:26 INFO mapred.JobClient:     Map output materialized bytes=2639507
> 12/03/17 19:20:26 INFO mapred.JobClient:     Combine output records=20
> 12/03/17 19:20:26 INFO mapred.JobClient:     Map input records=21578
> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce shuffle bytes=0
> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce output records=20
> 12/03/17 19:20:26 INFO mapred.JobClient:     Spilled Records=40
> 12/03/17 19:20:26 INFO mapred.JobClient:     Map output bytes=8268500
> 12/03/17 19:20:26 INFO mapred.JobClient:     Combine input records=21578
> 12/03/17 19:20:26 INFO mapred.JobClient:     Map output records=21578
> 12/03/17 19:20:26 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
> 12/03/17 19:20:26 INFO mapred.JobClient:     Reduce input records=20
> 12/03/17 19:20:26 INFO kmeans.KMeansDriver: Clustering data
> 12/03/17 19:20:26 INFO kmeans.KMeansDriver: Running Clustering
> 12/03/17 19:20:26 INFO kmeans.KMeansDriver: Input: /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters In: /tmp/mahout-work-hudson/reuters-kmeans/clusters-3-final Out: /tmp/mahout-work-hudson/reuters-kmeans Distance: org.apache.mahout.common.distance.CosineDistanceMeasure@1b32627
> 12/03/17 19:20:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 12/03/17 19:20:27 INFO input.FileInputFormat: Total input paths to process : 1
> 12/03/17 19:20:27 INFO mapred.JobClient: Running job: job_local_0004
> 12/03/17 19:20:27 WARN mapred.LocalJobRunner: job_local_0004
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
> 	at org.apache.mahout.clustering.classify.ClusterClassificationMapper.map(ClusterClassificationMapper.java:50)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> 12/03/17 19:20:28 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/17 19:20:28 INFO mapred.JobClient: Job complete: job_local_0004
> 12/03/17 19:20:28 INFO mapred.JobClient: Counters: 0
> Exception in thread "main" java.lang.InterruptedException: Cluster Classification Driver Job failed processing /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
> 	at org.apache.mahout.clustering.classify.ClusterClassificationDriver.classifyClusterMR(ClusterClassificationDriver.java:307)
> 	at org.apache.mahout.clustering.classify.ClusterClassificationDriver.run(ClusterClassificationDriver.java:141)
> 	at org.apache.mahout.clustering.kmeans.KMeansDriver.clusterData(KMeansDriver.java:457)
> 	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:172)
> 	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:119)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:63)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> 	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> 	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
> Build step 'Execute shell' marked build as failure

Build failed in Jenkins: Mahout-Examples-Cluster-Reuters #74

Posted by Apache Jenkins Server <je...@builds.apache.org>.

See <https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/74/changes>

Changes:

[pranjan] MAHOUT-981, Added outlier removal option in method and CLI for KMeansDriver.

[pranjan] MAHOUT-981, MAHOUT-983. Fixing test cases that fail intermittently.
The build passes on my machine (even for the last commit).
Tried to identify all test cases that can fail intermittently and fixed them.

------------------------------------------
[...truncated 6380 lines...]
12/03/17 19:19:55 INFO mapred.JobClient:     Map input records=21578
12/03/17 19:19:55 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/17 19:19:55 INFO mapred.JobClient:     Reduce output records=21578
12/03/17 19:19:55 INFO mapred.JobClient:     Spilled Records=43156
12/03/17 19:19:55 INFO mapred.JobClient:     Map output bytes=17337483
12/03/17 19:19:55 INFO mapred.JobClient:     Combine input records=0
12/03/17 19:19:55 INFO mapred.JobClient:     Map output records=21578
12/03/17 19:19:55 INFO mapred.JobClient:     SPLIT_RAW_BYTES=150
12/03/17 19:19:55 INFO mapred.JobClient:     Reduce input records=21578
12/03/17 19:19:55 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
12/03/17 19:19:55 INFO input.FileInputFormat: Total input paths to process : 1
12/03/17 19:19:55 INFO mapred.JobClient: Running job: job_local_0007
12/03/17 19:19:55 INFO mapred.MapTask: io.sort.mb = 100
12/03/17 19:19:55 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/17 19:19:55 INFO mapred.MapTask: record buffer = 262144/327680
12/03/17 19:19:56 INFO mapred.MapTask: Starting flush of map output
12/03/17 19:19:56 INFO mapred.MapTask: Finished spill 0
12/03/17 19:19:56 INFO mapred.Task: Task:attempt_local_0007_m_000000_0 is done. And is in the process of commiting
12/03/17 19:19:56 INFO mapred.JobClient:  map 0% reduce 0%
12/03/17 19:19:58 INFO mapred.LocalJobRunner: 
12/03/17 19:19:58 INFO mapred.Task: Task 'attempt_local_0007_m_000000_0' done.
12/03/17 19:19:58 INFO mapred.LocalJobRunner: 
12/03/17 19:19:58 INFO mapred.Merger: Merging 1 sorted segments
12/03/17 19:19:58 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 4719200 bytes
12/03/17 19:19:58 INFO mapred.LocalJobRunner: 
12/03/17 19:19:59 INFO mapred.Task: Task:attempt_local_0007_r_000000_0 is done. And is in the process of commiting
12/03/17 19:19:59 INFO mapred.LocalJobRunner: 
12/03/17 19:19:59 INFO mapred.Task: Task attempt_local_0007_r_000000_0 is allowed to commit now
12/03/17 19:19:59 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0007_r_000000_0' to /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
12/03/17 19:19:59 INFO mapred.JobClient:  map 100% reduce 0%
12/03/17 19:20:01 INFO mapred.LocalJobRunner: reduce > reduce
12/03/17 19:20:01 INFO mapred.Task: Task 'attempt_local_0007_r_000000_0' done.
12/03/17 19:20:02 INFO mapred.JobClient:  map 100% reduce 100%
12/03/17 19:20:02 INFO mapred.JobClient: Job complete: job_local_0007
12/03/17 19:20:02 INFO mapred.JobClient: Counters: 16
12/03/17 19:20:02 INFO mapred.JobClient:   File Output Format Counters 
12/03/17 19:20:02 INFO mapred.JobClient:     Bytes Written=4914503
12/03/17 19:20:02 INFO mapred.JobClient:   FileSystemCounters
12/03/17 19:20:02 INFO mapred.JobClient:     FILE_BYTES_READ=685549490
12/03/17 19:20:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=595234075
12/03/17 19:20:02 INFO mapred.JobClient:   File Input Format Counters 
12/03/17 19:20:02 INFO mapred.JobClient:     Bytes Read=4914503
12/03/17 19:20:02 INFO mapred.JobClient:   Map-Reduce Framework
12/03/17 19:20:02 INFO mapred.JobClient:     Reduce input groups=21578
12/03/17 19:20:02 INFO mapred.JobClient:     Map output materialized bytes=4719204
12/03/17 19:20:02 INFO mapred.JobClient:     Combine output records=0
12/03/17 19:20:02 INFO mapred.JobClient:     Map input records=21578
12/03/17 19:20:02 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/17 19:20:02 INFO mapred.JobClient:     Reduce output records=21578
12/03/17 19:20:02 INFO mapred.JobClient:     Spilled Records=43156
12/03/17 19:20:02 INFO mapred.JobClient:     Map output bytes=4659281
12/03/17 19:20:02 INFO mapred.JobClient:     Combine input records=0
12/03/17 19:20:02 INFO mapred.JobClient:     Map output records=21578
12/03/17 19:20:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=157
12/03/17 19:20:02 INFO mapred.JobClient:     Reduce input records=21578
12/03/17 19:20:02 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
12/03/17 19:20:02 INFO driver.MahoutDriver: Program took 78814 ms (Minutes: 1.3135666666666668)
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
no HADOOP_HOME set, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/mahout-examples-0.7-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/03/17 19:20:03 INFO common.AbstractJob: Command line arguments: {--clustering=null, --clusters=[/tmp/mahout-work-hudson/reuters-kmeans-clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.CosineDistanceMeasure], --endPhase=[2147483647], --input=[/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/], --maxIter=[10], --method=[mapreduce], --numClusters=[20], --output=[/tmp/mahout-work-hudson/reuters-kmeans], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
12/03/17 19:20:03 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-kmeans
12/03/17 19:20:03 INFO common.HadoopUtil: Deleting /tmp/mahout-work-hudson/reuters-kmeans-clusters
12/03/17 19:20:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/03/17 19:20:03 INFO compress.CodecPool: Got brand-new compressor
12/03/17 19:20:04 INFO kmeans.RandomSeedGenerator: Wrote 20 vectors to /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed
12/03/17 19:20:04 INFO kmeans.KMeansDriver: Input: /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters In: /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed Out: /tmp/mahout-work-hudson/reuters-kmeans Distance: org.apache.mahout.common.distance.CosineDistanceMeasure
12/03/17 19:20:04 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
12/03/17 19:20:04 INFO kmeans.KMeansDriver: K-Means Iteration 1
12/03/17 19:20:05 INFO input.FileInputFormat: Total input paths to process : 1
12/03/17 19:20:05 INFO mapred.JobClient: Running job: job_local_0001
12/03/17 19:20:05 INFO mapred.MapTask: io.sort.mb = 100
12/03/17 19:20:05 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/17 19:20:05 INFO mapred.MapTask: record buffer = 262144/327680
12/03/17 19:20:05 INFO compress.CodecPool: Got brand-new decompressor
12/03/17 19:20:06 INFO mapred.JobClient:  map 0% reduce 0%
12/03/17 19:20:06 INFO mapred.MapTask: Starting flush of map output
12/03/17 19:20:07 INFO mapred.MapTask: Finished spill 0
12/03/17 19:20:07 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/03/17 19:20:08 INFO mapred.LocalJobRunner: 
12/03/17 19:20:08 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/03/17 19:20:08 INFO mapred.LocalJobRunner: 
12/03/17 19:20:08 INFO mapred.Merger: Merging 1 sorted segments
12/03/17 19:20:08 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1822221 bytes
12/03/17 19:20:08 INFO mapred.LocalJobRunner: 
12/03/17 19:20:08 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/03/17 19:20:08 INFO mapred.LocalJobRunner: 
12/03/17 19:20:08 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/03/17 19:20:08 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to /tmp/mahout-work-hudson/reuters-kmeans/clusters-1
12/03/17 19:20:09 INFO mapred.JobClient:  map 100% reduce 0%
12/03/17 19:20:11 INFO mapred.LocalJobRunner: reduce > reduce
12/03/17 19:20:11 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
12/03/17 19:20:12 INFO mapred.JobClient:  map 100% reduce 100%
12/03/17 19:20:12 INFO mapred.JobClient: Job complete: job_local_0001
12/03/17 19:20:12 INFO mapred.JobClient: Counters: 17
12/03/17 19:20:12 INFO mapred.JobClient:   File Output Format Counters 
12/03/17 19:20:12 INFO mapred.JobClient:     Bytes Written=1838132
12/03/17 19:20:12 INFO mapred.JobClient:   Clustering
12/03/17 19:20:12 INFO mapred.JobClient:     Converged Clusters=9
12/03/17 19:20:12 INFO mapred.JobClient:   FileSystemCounters
12/03/17 19:20:12 INFO mapred.JobClient:     FILE_BYTES_READ=69813433
12/03/17 19:20:12 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=54255754
12/03/17 19:20:12 INFO mapred.JobClient:   File Input Format Counters 
12/03/17 19:20:12 INFO mapred.JobClient:     Bytes Read=4914503
12/03/17 19:20:12 INFO mapred.JobClient:   Map-Reduce Framework
12/03/17 19:20:12 INFO mapred.JobClient:     Reduce input groups=20
12/03/17 19:20:12 INFO mapred.JobClient:     Map output materialized bytes=1822225
12/03/17 19:20:12 INFO mapred.JobClient:     Combine output records=20
12/03/17 19:20:12 INFO mapred.JobClient:     Map input records=21578
12/03/17 19:20:12 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/17 19:20:12 INFO mapred.JobClient:     Reduce output records=20
12/03/17 19:20:12 INFO mapred.JobClient:     Spilled Records=40
12/03/17 19:20:12 INFO mapred.JobClient:     Map output bytes=8268500
12/03/17 19:20:12 INFO mapred.JobClient:     Combine input records=21578
12/03/17 19:20:12 INFO mapred.JobClient:     Map output records=21578
12/03/17 19:20:12 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
12/03/17 19:20:12 INFO mapred.JobClient:     Reduce input records=20
12/03/17 19:20:12 INFO kmeans.KMeansDriver: K-Means Iteration 2
12/03/17 19:20:12 INFO input.FileInputFormat: Total input paths to process : 1
12/03/17 19:20:12 INFO mapred.JobClient: Running job: job_local_0002
12/03/17 19:20:12 INFO mapred.MapTask: io.sort.mb = 100
12/03/17 19:20:12 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/17 19:20:12 INFO mapred.MapTask: record buffer = 262144/327680
12/03/17 19:20:13 INFO mapred.JobClient:  map 0% reduce 0%
12/03/17 19:20:14 INFO mapred.MapTask: Starting flush of map output
12/03/17 19:20:14 INFO mapred.MapTask: Finished spill 0
12/03/17 19:20:14 INFO mapred.Task: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
12/03/17 19:20:15 INFO mapred.LocalJobRunner: 
12/03/17 19:20:15 INFO mapred.Task: Task 'attempt_local_0002_m_000000_0' done.
12/03/17 19:20:15 INFO mapred.LocalJobRunner: 
12/03/17 19:20:15 INFO mapred.Merger: Merging 1 sorted segments
12/03/17 19:20:15 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 2218574 bytes
12/03/17 19:20:15 INFO mapred.LocalJobRunner: 
12/03/17 19:20:15 INFO mapred.Task: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
12/03/17 19:20:15 INFO mapred.LocalJobRunner: 
12/03/17 19:20:15 INFO mapred.Task: Task attempt_local_0002_r_000000_0 is allowed to commit now
12/03/17 19:20:15 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to /tmp/mahout-work-hudson/reuters-kmeans/clusters-2
12/03/17 19:20:16 INFO mapred.JobClient:  map 100% reduce 0%
12/03/17 19:20:18 INFO mapred.LocalJobRunner: reduce > reduce
12/03/17 19:20:18 INFO mapred.Task: Task 'attempt_local_0002_r_000000_0' done.
12/03/17 19:20:19 INFO mapred.JobClient:  map 100% reduce 100%
12/03/17 19:20:19 INFO mapred.JobClient: Job complete: job_local_0002
12/03/17 19:20:19 INFO mapred.JobClient: Counters: 17
12/03/17 19:20:19 INFO mapred.JobClient:   File Output Format Counters 
12/03/17 19:20:19 INFO mapred.JobClient:     Bytes Written=2237676
12/03/17 19:20:19 INFO mapred.JobClient:   Clustering
12/03/17 19:20:19 INFO mapred.JobClient:     Converged Clusters=17
12/03/17 19:20:19 INFO mapred.JobClient:   FileSystemCounters
12/03/17 19:20:19 INFO mapred.JobClient:     FILE_BYTES_READ=139118294
12/03/17 19:20:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=111531798
12/03/17 19:20:19 INFO mapred.JobClient:   File Input Format Counters 
12/03/17 19:20:19 INFO mapred.JobClient:     Bytes Read=4914503
12/03/17 19:20:19 INFO mapred.JobClient:   Map-Reduce Framework
12/03/17 19:20:19 INFO mapred.JobClient:     Reduce input groups=20
12/03/17 19:20:19 INFO mapred.JobClient:     Map output materialized bytes=2218578
12/03/17 19:20:19 INFO mapred.JobClient:     Combine output records=20
12/03/17 19:20:19 INFO mapred.JobClient:     Map input records=21578
12/03/17 19:20:19 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/17 19:20:19 INFO mapred.JobClient:     Reduce output records=20
12/03/17 19:20:19 INFO mapred.JobClient:     Spilled Records=40
12/03/17 19:20:19 INFO mapred.JobClient:     Map output bytes=8268500
12/03/17 19:20:19 INFO mapred.JobClient:     Combine input records=21578
12/03/17 19:20:19 INFO mapred.JobClient:     Map output records=21578
12/03/17 19:20:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
12/03/17 19:20:19 INFO mapred.JobClient:     Reduce input records=20
12/03/17 19:20:19 INFO kmeans.KMeansDriver: K-Means Iteration 3
12/03/17 19:20:19 INFO input.FileInputFormat: Total input paths to process : 1
12/03/17 19:20:19 INFO mapred.JobClient: Running job: job_local_0003
12/03/17 19:20:19 INFO mapred.MapTask: io.sort.mb = 100
12/03/17 19:20:19 INFO mapred.MapTask: data buffer = 79691776/99614720
12/03/17 19:20:19 INFO mapred.MapTask: record buffer = 262144/327680
12/03/17 19:20:20 INFO mapred.JobClient:  map 0% reduce 0%
12/03/17 19:20:21 INFO mapred.MapTask: Starting flush of map output
12/03/17 19:20:21 INFO mapred.MapTask: Finished spill 0
12/03/17 19:20:21 INFO mapred.Task: Task:attempt_local_0003_m_000000_0 is done. And is in the process of commiting
12/03/17 19:20:22 INFO mapred.LocalJobRunner: 
12/03/17 19:20:22 INFO mapred.Task: Task 'attempt_local_0003_m_000000_0' done.
12/03/17 19:20:22 INFO mapred.LocalJobRunner: 
12/03/17 19:20:22 INFO mapred.Merger: Merging 1 sorted segments
12/03/17 19:20:22 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 2639503 bytes
12/03/17 19:20:22 INFO mapred.LocalJobRunner: 
12/03/17 19:20:23 INFO mapred.Task: Task:attempt_local_0003_r_000000_0 is done. And is in the process of commiting
12/03/17 19:20:23 INFO mapred.LocalJobRunner: 
12/03/17 19:20:23 INFO mapred.Task: Task attempt_local_0003_r_000000_0 is allowed to commit now
12/03/17 19:20:23 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0003_r_000000_0' to /tmp/mahout-work-hudson/reuters-kmeans/clusters-3
12/03/17 19:20:23 INFO mapred.JobClient:  map 100% reduce 0%
12/03/17 19:20:25 INFO mapred.LocalJobRunner: reduce > reduce
12/03/17 19:20:25 INFO mapred.Task: Task 'attempt_local_0003_r_000000_0' done.
12/03/17 19:20:26 INFO mapred.JobClient:  map 100% reduce 100%
12/03/17 19:20:26 INFO mapred.JobClient: Job complete: job_local_0003
12/03/17 19:20:26 INFO mapred.JobClient: Counters: 17
12/03/17 19:20:26 INFO mapred.JobClient:   File Output Format Counters 
12/03/17 19:20:26 INFO mapred.JobClient:     Bytes Written=2661888
12/03/17 19:20:26 INFO mapred.JobClient:   Clustering
12/03/17 19:20:26 INFO mapred.JobClient:     Converged Clusters=20
12/03/17 19:20:26 INFO mapred.JobClient:   FileSystemCounters
12/03/17 19:20:26 INFO mapred.JobClient:     FILE_BYTES_READ=214459475
12/03/17 19:20:26 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=170473464
12/03/17 19:20:26 INFO mapred.JobClient:   File Input Format Counters 
12/03/17 19:20:26 INFO mapred.JobClient:     Bytes Read=4914503
12/03/17 19:20:26 INFO mapred.JobClient:   Map-Reduce Framework
12/03/17 19:20:26 INFO mapred.JobClient:     Reduce input groups=20
12/03/17 19:20:26 INFO mapred.JobClient:     Map output materialized bytes=2639507
12/03/17 19:20:26 INFO mapred.JobClient:     Combine output records=20
12/03/17 19:20:26 INFO mapred.JobClient:     Map input records=21578
12/03/17 19:20:26 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/03/17 19:20:26 INFO mapred.JobClient:     Reduce output records=20
12/03/17 19:20:26 INFO mapred.JobClient:     Spilled Records=40
12/03/17 19:20:26 INFO mapred.JobClient:     Map output bytes=8268500
12/03/17 19:20:26 INFO mapred.JobClient:     Combine input records=21578
12/03/17 19:20:26 INFO mapred.JobClient:     Map output records=21578
12/03/17 19:20:26 INFO mapred.JobClient:     SPLIT_RAW_BYTES=153
12/03/17 19:20:26 INFO mapred.JobClient:     Reduce input records=20
12/03/17 19:20:26 INFO kmeans.KMeansDriver: Clustering data
12/03/17 19:20:26 INFO kmeans.KMeansDriver: Running Clustering
12/03/17 19:20:26 INFO kmeans.KMeansDriver: Input: /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters In: /tmp/mahout-work-hudson/reuters-kmeans/clusters-3-final Out: /tmp/mahout-work-hudson/reuters-kmeans Distance: org.apache.mahout.common.distance.CosineDistanceMeasure@1b32627
12/03/17 19:20:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/17 19:20:27 INFO input.FileInputFormat: Total input paths to process : 1
12/03/17 19:20:27 INFO mapred.JobClient: Running job: job_local_0004
12/03/17 19:20:27 WARN mapred.LocalJobRunner: job_local_0004
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
	at org.apache.mahout.clustering.classify.ClusterClassificationMapper.map(ClusterClassificationMapper.java:50)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
12/03/17 19:20:28 INFO mapred.JobClient:  map 0% reduce 0%
12/03/17 19:20:28 INFO mapred.JobClient: Job complete: job_local_0004
12/03/17 19:20:28 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.lang.InterruptedException: Cluster Classification Driver Job failed processing /tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
	at org.apache.mahout.clustering.classify.ClusterClassificationDriver.classifyClusterMR(ClusterClassificationDriver.java:307)
	at org.apache.mahout.clustering.classify.ClusterClassificationDriver.run(ClusterClassificationDriver.java:141)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.clusterData(KMeansDriver.java:457)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:172)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:119)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:63)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
Build step 'Execute shell' marked build as failure
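
The trace above shows ClusterClassificationMapper.map casting its input key to IntWritable while the records read from tfidf-vectors apparently carry Text keys. The following is a minimal sketch of that failure mode, not the Mahout code itself: TextKey, IntKey, Key, strictMap and tolerantMap are hypothetical stand-ins so the example runs without Hadoop on the classpath. It assumes the mapper's key type is fixed by a generic parameter, so the mismatch only surfaces as a runtime cast failure when the first record is read.

```java
public class KeyCastSketch {
    /** Hypothetical stand-in for a Hadoop Writable key; not a real Hadoop type. */
    interface Key {}

    /** Stand-in for org.apache.hadoop.io.Text. */
    static final class TextKey implements Key {
        final String value;
        TextKey(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    /** Stand-in for org.apache.hadoop.io.IntWritable. */
    static final class IntKey implements Key {
        final int value;
        IntKey(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    /** Mimics a mapper that insists on integer keys: the cast is checked at runtime. */
    static int strictMap(Object rawKey) {
        IntKey key = (IntKey) rawKey; // throws ClassCastException for a TextKey
        return key.value;
    }

    /** Mimics a mapper declared against the common key supertype instead. */
    static String tolerantMap(Object rawKey) {
        Key key = (Key) rawKey; // succeeds for both TextKey and IntKey
        return key.toString();
    }

    /** Returns true when strictMap rejects the key with a ClassCastException. */
    static boolean strictMapFails(Object rawKey) {
        try {
            strictMap(rawKey);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object textKey = new TextKey("doc-1"); // hypothetical document key
        Object intKey = new IntKey(7);         // hypothetical integer key
        System.out.println("strict mapper, int key: " + strictMap(intKey));
        System.out.println("strict mapper fails on text key: " + strictMapFails(textKey));
        System.out.println("tolerant mapper, text key: " + tolerantMap(textKey));
    }
}
```

Under these assumptions, two directions would avoid the failure: write the input vectors with integer keys, or declare the mapper against a broader key type. Which of the two (if either) suits ClusterClassificationDriver is not something this log settles.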