Posted to solr-user@lucene.apache.org by Selvam <s....@gmail.com> on 2012/02/08 13:28:50 UTC

Custom Document Clustering and Mahout Integration

Hi all,

I am trying to write a custom document clustering component that takes
all the docs in a commit and clusters them. Solr version: 3.5.0

Main Class:
public class KMeansClusteringEngine extends DocumentClusteringEngine
implements SolrEventListener

I added a newSearcher event listener, and that works as expected. But when
is the document clustering actually called? I have implemented the two
DocumentClusteringEngine methods below in my custom code, but when do they
get called? The wiki page says to add clustering.collection=true, but I am
not sure about that, as my guess is that document clustering is in no way
related to search.

  public NamedList cluster(SolrParams params)
  public NamedList cluster(DocSet docSet, SolrParams solrParams)


Note:
Actually, I am trying to integrate Solr 3.5 with Mahout 0.5 for incremental
clustering (i.e., mapping new docs to existing clusters to avoid complete
re-clustering), basing my work on this GitHub code:
https://github.com/gsingers/ApacheCon2010/blob/master/src/main/java/com/grantingersoll/intell/clustering/KMeansClusteringEngine.java
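
To make clear what I mean by incremental clustering, here is a minimal
sketch in plain Java. It uses no Solr or Mahout APIs, and it is not taken
from the GitHub code above; the class name IncrementalAssigner and all of
its methods are hypothetical, and it assumes each doc is already a dense
double[] vector. A new doc goes to the nearest existing centroid, and that
centroid is nudged toward it with a running mean.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or Mahout.
public class IncrementalAssigner {
    private final List<double[]> centroids = new ArrayList<double[]>();
    private final List<Integer> counts = new ArrayList<Integer>();

    // Register an existing cluster: its centroid and how many docs it holds.
    public void addCluster(double[] centroid, int size) {
        centroids.add(centroid.clone());
        counts.add(size);
    }

    public double[] centroid(int index) {
        return centroids.get(index).clone();
    }

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the centroid closest to the doc (squared Euclidean distance).
    public int nearest(double[] doc) {
        if (centroids.isEmpty()) {
            throw new IllegalStateException("no clusters registered");
        }
        int best = 0;
        double bestDist = squaredDistance(doc, centroids.get(0));
        for (int i = 1; i < centroids.size(); i++) {
            double dist = squaredDistance(doc, centroids.get(i));
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    // Assign the doc to its nearest cluster and update that centroid as a
    // running mean, so no full re-clustering is needed.
    public int assign(double[] doc) {
        int best = nearest(doc);
        double[] c = centroids.get(best);
        int n = counts.get(best) + 1;
        for (int i = 0; i < c.length; i++) {
            c[i] += (doc[i] - c[i]) / n;
        }
        counts.set(best, n);
        return best;
    }
}
```

The idea would be to call something like assign() for each newly committed
doc, with the existing centroids loaded from a previous Mahout k-means run.
How to hook that into the commit is exactly the part I am unsure about.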

I would love to get some support from you.

-- 
Regards,
S.Selvam
http://knackforge.com