Posted to solr-user@lucene.apache.org by yauza <ag...@gmail.com> on 2017/05/10 09:50:52 UTC

Need help in understanding solr clustering component

I was looking into Solr's default clustering component for Carrot2 (I am in
the process of writing my own). In the clustering component class, the
clustering algorithm is called in two methods.

First, in the overridden process method:

SolrDocumentList solrDocList = SolrPluginUtils.docListToSolrDocumentList(
    results.docList, rb.req.getSearcher(),
    engine.getFieldsToLoad(rb.req), docIds);
Object clusters = engine.cluster(rb.getQuery(), solrDocList, docIds,
    rb.req);
rb.rsp.add("clusters", clusters);

And once again in the finishStage method:

Map<SolrDocument,Integer> docIds = null;
Object clusters = engine.cluster(rb.getQuery(), solrDocList, docIds,
    rb.req);
rb.rsp.add("clusters", clusters);

Now my question: as I understand it, the process method works not on the
complete query result but on each shard's results, while finishStage runs
once all the results have been aggregated. Why, then, do we call the
clustering algorithm twice and add its output to the response under
"clusters" each time? Am I missing something?
Won't this create too many labels if, in the worst case, none of the cluster
labels match?
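
To make the worry concrete, here is a toy Java sketch. It is not Solr or
Carrot2 code: the class, the method names and the first-word "clusterer" are
all made up for illustration. It also assumes that both the per-shard output
and the merged output end up in the response, which is exactly the part I am
unsure about.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are Solr or Carrot2 APIs.
public class TwoPassClusteringSketch {

    // Hypothetical stand-in for engine.cluster(...): groups documents
    // by their first word and uses that word as the cluster label.
    static Map<String, List<String>> cluster(List<String> docs) {
        Map<String, List<String>> clusters = new LinkedHashMap<>();
        for (String doc : docs) {
            String label = doc.split(" ")[0];
            clusters.computeIfAbsent(label, k -> new ArrayList<>()).add(doc);
        }
        return clusters;
    }

    // Labels collected if every shard clusters its own results AND the
    // merged results are clustered again, with all outputs kept.
    static List<String> labelsFromBothPasses(List<List<String>> shards) {
        List<String> labels = new ArrayList<>();
        List<String> merged = new ArrayList<>();
        for (List<String> shardDocs : shards) {
            labels.addAll(cluster(shardDocs).keySet()); // per-shard pass
            merged.addAll(shardDocs);
        }
        labels.addAll(cluster(merged).keySet()); // pass over merged results
        return labels;
    }

    // Labels collected if only the merged results are clustered.
    static List<String> labelsFromMergedPassOnly(List<List<String>> shards) {
        List<String> merged = new ArrayList<>();
        for (List<String> shardDocs : shards) {
            merged.addAll(shardDocs);
        }
        return new ArrayList<>(cluster(merged).keySet());
    }

    public static void main(String[] args) {
        List<List<String>> shards = Arrays.asList(
            Arrays.asList("solr clustering", "solr sharding"),
            Arrays.asList("lucene index"));
        System.out.println(labelsFromBothPasses(shards));
        System.out.println(labelsFromMergedPassOnly(shards));
    }
}
```

With two shards, the first method collects four labels and the second only
two. A real algorithm could also produce different labels per shard, so the
list from both passes could grow further.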


P.S. Please correct me if I am wrong.


