Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2009/09/08 21:25:38 UTC
[Solr Wiki] Update of "ClusteringComponent" by StanislawOsinski
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The following page has been changed by StanislawOsinski:
http://wiki.apache.org/solr/ClusteringComponent
The comment on the change is:
Updates to the Carrot2 clustering tuning procedure
------------------------------------------------------------------------------
1. [http://project.carrot2.org/download.html Download Carrot2 Document Clustering Workbench] for your platform.
2. [http://download.carrot2.org/head/manual/#section.getting-started.solr Attach] your Solr instance as a document source in the Workbench.
- 3. [http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning Fine tune] stop words, stop labels and possibly [http://download.carrot2.org/head/manual/#section.component.lingo other attributes] of the clustering algorithms to suit your needs.
+ 3. [http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning.stop-words Fine tune stop words], [http://download.carrot2.org/head/manual/#section.advanced-topics.fine-tuning.stop-regexps stop labels] and possibly [http://download.carrot2.org/head/manual/#section.component.lingo other attributes] of the clustering algorithms to suit your needs.
+ 4. To transfer the modified `stopwords.*` and `stoplabels.*` files to your Solr instance, make them accessible on the classpath. If you're using the Solr example scripts, try putting the files in the `example/resources` folder (the Jetty starter in `start.jar` adds all files from that folder to the classpath). Alternatively, you can overwrite the corresponding `stopwords.*` and `stoplabels.*` files directly in `carrot2-mini-*.jar`.
- 4. To transfer the modified stopwords.* and stoplabels.* files to your Solr instance, simply make the modified files accessible in the classpath. If you're using the Solr example scripts, try:
-
- {{{
- java -cp <dir-with-your-modified-stopwords> -Dsolr.solr.home=./clustering/solr -jar start.jar
- }}}
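
For the `example/resources` route described in the new step 4, the copy could be sketched roughly as below. This is only an illustration: the helper name `copy_carrot2_resources` and both paths in the usage comment are hypothetical, and it assumes the tuned files sit together in one directory.

```shell
# Hypothetical helper (not part of Solr or Carrot2): copy tuned Carrot2
# resource files into a directory that ends up on Solr's classpath,
# e.g. example/resources when using the Solr example scripts.
copy_carrot2_resources() {
  src_dir=$1    # directory holding the modified stopwords.* / stoplabels.* files
  dest_dir=$2   # classpath directory, e.g. example/resources
  mkdir -p "$dest_dir" || return 1
  for f in "$src_dir"/stopwords.* "$src_dir"/stoplabels.*; do
    # An unmatched pattern stays literal in POSIX sh, so skip non-existent names.
    [ -e "$f" ] || continue
    cp "$f" "$dest_dir"/ || return 1
  done
}

# Example usage (both paths are placeholders):
# copy_carrot2_resources ~/carrot2-tuning ./example/resources
```

Only files matching `stopwords.*` and `stoplabels.*` are copied, and Solr would need a restart to pick them up from the classpath.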
= Document Clustering =