Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2012/06/04 20:19:24 UTC

[Solr Wiki] Update of "SpellCheckComponent" by JamesDyer

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SpellCheckComponent" page has been changed by JamesDyer:
http://wiki.apache.org/solr/SpellCheckComponent?action=diff&rev1=56&rev2=57

Comment:
SOLR-2993:  WordBreakSolrSpellChecker

      <!-- Require terms to occur in 1/100th of 1% of documents in order to be included in the dictionary -->
      <float name="thresholdTokenFrequency">.0001</float>
    </lst>
+   <!-- a spellchecker that can break or combine words. (Solr 4.0 see SOLR-2993) -->
+   <lst name="spellchecker">
+     <str name="name">wordbreak</str>
+     <str name="classname">solr.WordBreakSolrSpellChecker</str>      
+     <str name="field">spell</str>
+     <str name="combineWords">true</str>
+     <str name="breakWords">true</str>
+     <int name="maxChanges">3</int>
+   </lst>
    <!-- Example of using different distance measure -->
    <lst name="spellchecker">
      <str name="name">jarowinkler</str>
@@ -77, +86 @@

      <!-- Use a different Distance Measure -->
      <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
      <str name="spellcheckIndexDir">./spellchecker</str>
- 
    </lst>
  
    <!-- This field type's analyzer is used by the QueryConverter to tokenize the value for "q" parameter -->
@@ -101, +109 @@

    <lst name="defaults">
      <!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
      <str name="spellcheck.dictionary">default</str>
+     <!-- Also generate Word Break Suggestions (Solr 4.0 see SOLR-2993) -->
+     <str name="spellcheck.dictionary">wordbreak</str>
      <!-- omp = Only More Popular -->
      <str name="spellcheck.onlyMorePopular">false</str>
      <!-- exr = Extended Results -->
      <str name="spellcheck.extendedResults">false</str>
      <!--  The number of suggestions to return -->
-     <str name="spellcheck.count">1</str>
+     <str name="spellcheck.count">10</str>
    </lst>
    <!--  Add to a RequestHandler
         !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@@ -126, +136 @@

   * org.apache.solr.spelling.IndexBasedSpellChecker -- Create and use a spelling dictionary that is based on the Solr index or an existing Lucene index
   * org.apache.solr.spelling.FileBasedSpellChecker -- Create and use a spelling dictionary based off a flat file.  This can be useful for using Solr as a spelling server or in other instances when spelling suggestions do not need to be based on the content of an actual index.
   * org.apache.solr.spelling.DirectSolrSpellChecker <!> [[Solr4.0]] -- Experimental spellchecker that only uses your main Solr index directly (build/rebuild is a no-op). See [[https://issues.apache.org/jira/browse/LUCENE-2507|LUCENE-2507]] for more information.
- 
+  * org.apache.solr.spelling.WordBreakSolrSpellChecker <!> [[Solr4.0]] -- Generates suggestions by combining adjacent words and/or breaking words into multiple words.  This spellchecker can be configured alongside a traditional checker (e.g. DirectSolrSpellChecker).  The results are combined, and collations can contain a mix of corrections from both spellcheckers. See [[https://issues.apache.org/jira/browse/SOLR-2993|SOLR-2993]] for more information.
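As a rough sketch of how the wordbreak checker might be paired with a traditional one, a request handler could list both dictionaries in its defaults. The handler name "/spell" and the component name "spellcheck" are assumptions for illustration and are not taken from this page; spellcheck.collate is turned on so that collations can mix corrections from both checkers.

```xml
<!-- Hypothetical handler; "/spell" and "spellcheck" are illustrative names -->
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- Traditional checker, matching the "default" spellchecker name -->
    <str name="spellcheck.dictionary">default</str>
    <!-- Word-break checker, matching the "wordbreak" spellchecker name -->
    <str name="spellcheck.dictionary">wordbreak</str>
    <!-- Ask for collations built from the combined suggestions -->
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

Each spellcheck.dictionary value has to match the name of a spellchecker defined in the search component, as with the "wordbreak" entry in the configuration above.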
  == Custom Comparators and the Lucene Spell Checkers (IndexBasedSpellChecker, FileBasedSpellChecker, DirectSolrSpellChecker) ==
  <!> [[Solr3.1]] [[Solr4.0]]