Posted to user@mahout.apache.org by chte <ch...@kth.se> on 2013/03/26 02:38:41 UTC

Difference between Kmeans and Fuzzy Kmeans output

Hello, 

I am having a hard time finding documentation on how the output from
Fuzzy-Kmeans is written as a SequenceFile when a clustered document belongs
to several clusters.

I'm quite new to Mahout in general and am trying to use it in a
proof-of-concept program for a school project. I'm currently testing out Frank
Scholten's clustering code on Stack Overflow posts. See
https://github.com/frankscholten/mahout-clustering-stackoverflow for the full
implementation.

The implementation is based on k-means, and I am trying to figure out how to
cluster the posts into soft clusters by using Fuzzy-Kmeans instead. In the end I
want to index the data with Solr so that each "posts" entry contains a
multivalued field with the ids of the clusters that post belongs to.
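
To make the goal concrete, here is a rough sketch of the result I am after,
written in plain Java and not tied to any Mahout API. Everything in it is an
assumption for illustration: the class and method names, the fuzziness value
m = 2.0, the 0.3 threshold, and the use of the textbook fuzzy c-means
membership formula rather than whatever Mahout computes internally.

```java
import java.util.ArrayList;
import java.util.List;

public class SoftClusterSketch {

    // Textbook fuzzy c-means membership for one point:
    // u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)), where d_j is the distance
    // from the point to centroid j and m > 1 is the fuzziness parameter.
    static double[] memberships(double[] distances, double m) {
        int k = distances.length;
        double[] u = new double[k];
        // A point lying exactly on a centroid belongs fully to that cluster.
        for (int j = 0; j < k; j++) {
            if (distances[j] == 0.0) {
                u[j] = 1.0;
                return u;
            }
        }
        double exponent = 2.0 / (m - 1.0);
        for (int j = 0; j < k; j++) {
            double sum = 0.0;
            for (int other = 0; other < k; other++) {
                sum += Math.pow(distances[j] / distances[other], exponent);
            }
            u[j] = 1.0 / sum;
        }
        return u;
    }

    // Cluster ids whose membership reaches the threshold; these are the
    // values I would like to end up in the multivalued field for a post.
    static List<Integer> clusterIdsAbove(double[] u, double threshold) {
        List<Integer> ids = new ArrayList<>();
        for (int j = 0; j < u.length; j++) {
            if (u[j] >= threshold) {
                ids.add(j);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical distances from one post to three centroids.
        double[] u = memberships(new double[] {1.0, 1.0, 2.0}, 2.0);
        System.out.println(clusterIdsAbove(u, 0.3));
    }
}
```

With the made-up distances above, the post is equally close to the first two
centroids, so both of their ids pass the threshold and the third does not.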

The issue at hand is that I cannot figure out which part of the code I
should modify in order to achieve this. If someone could shed some light
on this I would be very grateful!




