Posted to dev@mahout.apache.org by Karl Wettin <ka...@gmail.com> on 2008/03/27 14:34:16 UTC

Index clustering (was: kMeans)

Khalil Honsali wrote:
> Hello,

Hi Khalil,

> Is there any relevant papers/work about index-clustering (not search results
> clustering) ? I wonder if it will impact queries if index is clustered and
> distributed somehow?

LUCENE-1025 is a hierarchical clusterer that I later refactored to
persist the tree in a BDB, so I could build a cluster of a complete index
that could come up with "more like this" suggestions in an instant.
Building it was sort of slow, but the results were not too bad. I never
compared it with anything else, though. It never became more than a proof
of concept.
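
To illustrate the general idea only (this is not the LUCENE-1025 code), here
is a minimal sketch assuming naive bottom-up clustering on cosine similarity
over tiny term-weight vectors. The tree is kept in memory rather than in a
BDB, and every name in it (build_tree, more_like_this, the sample docs) is
hypothetical:

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def merge(a, b):
    """Unweighted mean of two vectors (a simplification, not a true centroid)."""
    return {t: (a.get(t, 0.0) + b.get(t, 0.0)) / 2 for t in set(a) | set(b)}


def build_tree(docs):
    """Greedy agglomerative clustering; returns nested 2-tuples of doc ids.

    Doc ids must not be tuples themselves. This naive version rescans all
    pairs on every merge, so it is slow on anything but toy input.
    """
    nodes = list(docs.items())
    while len(nodes) > 1:
        # Pick the most similar pair of current nodes and merge them.
        i, j = max(combinations(range(len(nodes)), 2),
                   key=lambda p: cosine(nodes[p[0]][1], nodes[p[1]][1]))
        (ta, va), (tb, vb) = nodes[i], nodes[j]
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)]
        nodes.append(((ta, tb), merge(va, vb)))
    return nodes[0][0]


def leaves(tree):
    """All doc ids under a subtree, left to right."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def more_like_this(tree, doc_id):
    """Doc ids in the sibling subtree nearest to doc_id.

    Only walks the prebuilt tree; no similarity is computed at query time.
    """
    if not isinstance(tree, tuple):
        return []
    left, right = tree
    for own, sibling in ((left, right), (right, left)):
        if doc_id in leaves(own):
            return more_like_this(own, doc_id) or leaves(sibling)
    return []


# Made-up sample data, for illustration only.
docs = {
    "a": {"lucene": 1.0, "index": 1.0},
    "b": {"lucene": 1.0, "index": 0.8},
    "c": {"hadoop": 1.0, "grid": 1.0},
    "d": {"hadoop": 1.0, "grid": 0.9},
}
tree = build_tree(docs)
suggestions = more_like_this(tree, "a")
```

The expensive part is building the tree; once it exists, a suggestion is
just a walk to the nearest sibling subtree, which is why persisting the
tree makes lookups cheap.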

I'm looking at reimplementing this for Mahout, but I have a hard time
figuring out whether building the tree is something one wants to do (or
even can do) using map reduce. The more I think about it, the more I want
to solve it with a grid.



     karl