Posted to user@mahout.apache.org by Mark <st...@gmail.com> on 2011/06/18 22:30:12 UTC

Generated clusters... now what?

I am playing around with clustering and I just generated my clusters
using KMeans. I'm able to view my clusters using clusterdump, and they
appear to be pretty good. My question now is: what can I do with this
data, other than just inspect it using clusterdump?

For example, how can I ask:
     "What cluster does document #1 belong to?"
     "What are all the documents belonging to cluster X?"

Thanks

RE: Generated clusters... now what?

Posted by Jeff Eastman <je...@Narus.com>.
Did you run kmeans with the -cl argument? It runs a post-clustering step that assigns each of your documents to a cluster and writes the assignments to the "clusteredPoints" directory. That directory will give you the answers to the questions you asked below. See https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering (especially the section "Running k-Means Clustering").
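
To make the two lookups concrete, here is a minimal sketch in plain Java. It assumes you have already read a (cluster id, document name) pair out of each entry in clusteredPoints; that reading step depends on your Mahout version and is not shown. The ClusterIndex class, its method names, and the sample data are hypothetical and are not part of Mahout.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: indexes (clusterId, documentName)
// pairs so that both questions from the original post become map lookups.
class ClusterIndex {
    // document name -> id of the cluster it was assigned to
    private final Map<String, Integer> clusterOfDocument = new HashMap<>();
    // cluster id -> names of the documents assigned to it, in insertion order
    private final Map<Integer, List<String>> documentsInCluster = new TreeMap<>();

    // Record one assignment. Each document is assumed to appear only once.
    void add(int clusterId, String documentName) {
        clusterOfDocument.put(documentName, clusterId);
        documentsInCluster
                .computeIfAbsent(clusterId, k -> new ArrayList<>())
                .add(documentName);
    }

    // "What cluster does document #1 belong to?" Returns null if unknown.
    Integer clusterOf(String documentName) {
        return clusterOfDocument.get(documentName);
    }

    // "What are all the documents belonging to cluster X?"
    // Returns an empty list for a cluster id that was never seen.
    List<String> documentsIn(int clusterId) {
        return documentsInCluster.getOrDefault(clusterId, new ArrayList<>());
    }

    public static void main(String[] args) {
        ClusterIndex index = new ClusterIndex();
        // Made-up sample assignments, standing in for clusteredPoints entries.
        index.add(3, "doc1");
        index.add(3, "doc2");
        index.add(7, "doc3");

        System.out.println("doc1 is in cluster " + index.clusterOf("doc1"));
        System.out.println("cluster 3 contains " + index.documentsIn(3));
    }
}
```

Holding both maps in memory is only reasonable for a modest number of documents; for a large corpus you would more likely answer the same questions with a job or a query over the clusteredPoints output itself.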

-----Original Message-----
From: Mark [mailto:static.void.dev@gmail.com] 
Sent: Saturday, June 18, 2011 1:30 PM
To: Mahout User List
Subject: Generated clusters... now what?

I am playing around with clustering and I just generated my clusters 
using KMeans. I'm able to view my clusters using clusterdump and they 
appear to be pretty good. My question is now what can I do with this 
data other than just inspect it using clusterdump?

For example how can I ask:
     "What cluster does document #1 belong to?"
     "What are all the documents belonging to cluster X?"

Thanks