Posted to user@mahout.apache.org by Chris Harrington <ch...@heystaks.com> on 2013/01/30 13:10:07 UTC

MiA NewsKMeansClustering Example Help

Hi all, 

I'm new to Mahout and have been working through the MiA book. Lately I've been trying the NewsKMeansClustering example from Chapter 10, as it looks like a good starting point for my own work, but I've run into a problem just trying to run it and view the output.

I'm trying to view the output of running the Java file via the cluster dump utility, but all I get is an empty text file.

I'm using MiA-mahout-0.6 and mahout-distribution-0.6. This is the process I went through to get to this point.
Get the Reuters data and put it into seqfiles. (I run these commands in the mahout-distribution-0.6 project.)
mvn -e -q exec:java -Dexec.mainClass="org.apache.lucene.benchmark.utils.ExtractReuters" -Dexec.args="reuters/ reuters-extracted/"
bin/mahout seqdirectory -c UTF-8 -i examples/reuters-extracted/ -o reuters-seqfiles
I manually (drag and drop) move the seq files into the reuters-seqfiles folder of the MiA (0.6) project.
I then run the MiA NewsKMeansClustering example from Chapter 10, which creates a newsClusters folder populated with various files (clusters folder, dictionary.file-0, centroids folder, etc.).
There don't appear to be any unusual errors in the console:
2013-01-30 11:15:42.593 java[11011:1903] Unable to load realm info from SCDynamicStore
SLF4J: The requested version 1.5.11 by your slf4j binding is not compatible with [1.6]
SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
2013-01-30 11:15:45 JobClient [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
. (same as above line)
.
.
2013-01-30 11:16:55 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2013-01-30 11:16:56 JobClient [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
.(same as above line)
.
.
I then run the cluster dump command to create an output.txt file.
../mahout-distribution-0.6/bin/mahout clusterdump -s newsClusters/clusters/clusters-19/ -o output.txt -d newsClusters/dictionary.file-0 -dt sequencefile -n 10
but all this does is create an empty text file.
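
One quick sanity check before running clusterdump is to confirm that the directory passed to -s really is the last iteration the run wrote. The sketch below assumes the iteration directories are named clusters-N, as in the command above; latest_clusters_dir is a hypothetical helper written for illustration and is not part of Mahout.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mahout): print the highest-numbered
# clusters-N directory found directly under the directory given as $1.
# Assumes iteration output directories are named clusters-N.
latest_clusters_dir() {
    root=$1
    best=""
    best_n=-1
    for d in "$root"/clusters-*; do
        # Skip plain files and the unexpanded glob when nothing matches.
        [ -d "$d" ] || continue
        # Keep only the part after the last "clusters-".
        n=${d##*clusters-}
        # Ignore suffixes that are not purely numeric.
        case $n in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$n" -gt "$best_n" ]; then
            best_n=$n
            best=$d
        fi
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
        return 0
    fi
    # No clusters-N directory was found.
    return 1
}
```

Calling latest_clusters_dir newsClusters/clusters and then running ls -l on the directory it prints shows whether clusters-19 is the final iteration and whether its files are non-empty.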

Any help would be much appreciated.

Thanks,
Chris

Re: MiA NewsKMeansClustering Example Help

Posted by Chris Harrington <ch...@heystaks.com>.
Yeah, I just did. It seems there was something very wrong with my setup, or I did something foolish during setup. Anyway, I removed it all (Mahout and Hadoop), started from scratch, and it's working now. Sorry for the trouble.

On 30 Jan 2013, at 19:51, Robin Anil wrote:

> Could you try it with the 0.7 version?


Re: MiA NewsKMeansClustering Example Help

Posted by Robin Anil <ro...@gmail.com>.
Could you try it with the 0.7 version?