Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2009/10/21 20:11:47 UTC

[Solr Wiki] Trivial Update of "ClusteringComponent" by YonikSeeley

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "ClusteringComponent" page has been changed by YonikSeeley.
http://wiki.apache.org/solr/ClusteringComponent?action=diff&rev1=30&rev2=31

--------------------------------------------------

  
  = Clustering Component =
  
- The clustering implements a pluggable approach that allows for the implementation of any clustering engine.  
+ The clustering implements a pluggable approach that allows for the implementation of any clustering engine.
- 
- See https://issues.apache.org/jira/browse/SOLR-769
  
  The !ClusteringComponent is responsible for taking in the request, identifying the clustering engine to be used (a !SolrClusteringEngine implementation), and then delegating the work to that engine.  Once the engine is done, the results are added to the response.
  
@@ -124, +122 @@

  The thing to note here is the mapping of Solr fields (name, id, etc.) to the title, snippet and url fields that Carrot2 expects. Clustering will take into account the text of title and snippet.
  
  Next, issuing a query that turns on clustering (clustering=true):
+ {{{
- {{{http://localhost:8983/solr/select?indent=on&q=*:*&rows=10&clustering=true}}}
+ http://localhost:8983/solr/select?indent=on&q=*:*&rows=10&clustering=true
+ }}}
  
  yields results like:
  {{{
@@ -163, +163 @@

  
  Clusters produced by Carrot2 group the results into different product categories: DDR (memory), Car Power Adapter, Display, Hard Drive. Notice that, depending on the quality of input documents, some clusters may not make much sense.
  
- See also ClusteringFullResultsExample.
- 
  == Tuning Carrot2 clustering ==
  
  The easiest way to tune Carrot2 clustering for your specific data is to use a dedicated Carrot2 tool called Document Clustering Workbench.