
Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/18 12:14:24 UTC

[Solr Wiki] Update of "AnalyzersTokenizersTokenFilters" by JanHoydahl

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "AnalyzersTokenizersTokenFilters" page has been changed by JanHoydahl.
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters?action=diff&rev1=83&rev2=84

--------------------------------------------------

  
  == Stemming ==
  
- There are three types of stemming strategies:
+ There are four types of stemming strategies:
     * [[http://tartarus.org/~martin/PorterStemmer/|Porter]] or Reduction stemming: a transforming algorithm that reduces any of the forms of a word, such as "runs, running, ran", to its elemental root, e.g. "run". Porter stemming must be performed ''both'' at insertion time and at query time.
+    * [[http://code.google.com/p/lucene-hunspell/|Lucene-Hunspell]] aims to provide features such as stemming, decompounding, spellchecking, normalization and term expansion by taking advantage of existing lexical resources that are already widely used in projects like OpenOffice. It is still an alpha version, but it already has an impressive list of supported languages (see [[http://lucene-eurocon.org/sessions-track2-day2.html#5|this presentation]] for more).
     * Expansion stemming: takes a root word and 'expands' it to all of its various forms. It can be used ''either'' at insertion time ''or'' at query time. One way to approach this is by using the [[#SynonymFilter|SynonymFilterFactory]].
     * [[/Kstem|KStem]], an alternative to Porter for developers looking for a less aggressive stemmer.
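
The difference between reduction and expansion stemming in the list above can be illustrated with a small sketch. This is a toy model under stated assumptions, not Solr or Lucene code: `reduce_stem` is a crude suffix stripper standing in for a real stemmer (it is not the Porter algorithm), and `SUFFIXES`, `EXPANSIONS`, and the sample documents are hypothetical.

```python
# Toy illustration only: a crude suffix stripper, NOT the Porter algorithm.
SUFFIXES = ("ning", "ing", "s")  # hypothetical rule set

# Hypothetical expansion table: root word -> all of its forms.
EXPANSIONS = {"run": ["run", "runs", "running", "ran"]}


def reduce_stem(token):
    """Reduction: strip the first matching suffix to approximate a root."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def expand(token):
    """Expansion: map a root word to its known forms (or itself)."""
    return EXPANSIONS.get(token, [token])


def build_index(docs, analyze):
    """Build an inverted index: term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for term in analyze(token):
                index.setdefault(term, set()).add(doc_id)
    return index


docs = {1: "she runs daily", 2: "he ran home", 3: "running shoes"}

# Reduction: the same reducer is applied at insertion time AND query time.
reduced_index = build_index(docs, lambda t: [reduce_stem(t)])


def search_reduced(query):
    return reduced_index.get(reduce_stem(query), set())


# Expansion at query time only: index raw tokens, expand the query term.
raw_index = build_index(docs, lambda t: [t])


def search_expanded(query):
    hits = set()
    for form in expand(query):
        hits |= raw_index.get(form, set())
    return hits
```

In this sketch the reducing search only works because the query passes through the same `reduce_stem` as the indexed text, while the expanding search leaves the index untouched and does all of its work on the query side. The irregular form "ran" is matched only by the expansion table, since the toy suffix rules cannot reach it.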