Posted to java-commits@lucene.apache.org by rm...@apache.org on 2010/10/29 16:45:29 UTC

svn commit: r1028779 - in /lucene/java/branches/lucene_3_0/contrib: CHANGES.txt analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java analyzers/common/src/java/org/apache/lucene/analysis/nl/DutchStemFilter.java

Author: rmuir
Date: Fri Oct 29 14:45:29 2010
New Revision: 1028779

URL: http://svn.apache.org/viewvc?rev=1028779&view=rev
Log:
LUCENE-2055: add documentation notes about buggy stemmers

Modified:
    lucene/java/branches/lucene_3_0/contrib/CHANGES.txt
    lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java
    lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/nl/DutchStemFilter.java

Modified: lucene/java/branches/lucene_3_0/contrib/CHANGES.txt
URL: http://svn.apache.org/viewvc/lucene/java/branches/lucene_3_0/contrib/CHANGES.txt?rev=1028779&r1=1028778&r2=1028779&view=diff
==============================================================================
--- lucene/java/branches/lucene_3_0/contrib/CHANGES.txt (original)
+++ lucene/java/branches/lucene_3_0/contrib/CHANGES.txt Fri Oct 29 14:45:29 2010
@@ -10,6 +10,13 @@ Bug Fixes
  * LUCENE-2284: MatchAllDocsQueryNode toString() created an invalid XML tag.
    (Frank Wesemann via Robert Muir)
 
+Documentation
+
+ * LUCENE-2055: Add documentation noting that the Dutch and French stemmers
+   in contrib/analyzers do not implement the Snowball algorithm correctly,
+   and recommending the equivalents in contrib/snowball where possible.
+   (Robert Muir, Uwe Schindler, Simon Willnauer)
+
 ======================= Release 3.0.2 2010-06-18 =======================
 
 No changes.

Modified: lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java
URL: http://svn.apache.org/viewvc/lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java?rev=1028779&r1=1028778&r2=1028779&view=diff
==============================================================================
--- lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java (original)
+++ lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/fr/FrenchStemFilter.java Fri Oct 29 14:45:29 2010
@@ -33,6 +33,10 @@ import java.util.Set;
  * not be stemmed at all. The used stemmer can be changed at runtime after the
  * filter object is created (as long as it is a {@link FrenchStemmer}).
  * </p>
+ * NOTE: This stemmer does not implement the Snowball algorithm correctly;
+ * in particular, it has problems involving case. Consider using the
+ * "French" stemmer in the snowball package instead. This stemmer will likely
+ * be deprecated in a future release.
  */
 public final class FrenchStemFilter extends TokenFilter {
 

Modified: lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/nl/DutchStemFilter.java
URL: http://svn.apache.org/viewvc/lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/nl/DutchStemFilter.java?rev=1028779&r1=1028778&r2=1028779&view=diff
==============================================================================
--- lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/nl/DutchStemFilter.java (original)
+++ lucene/java/branches/lucene_3_0/contrib/analyzers/common/src/java/org/apache/lucene/analysis/nl/DutchStemFilter.java Fri Oct 29 14:45:29 2010
@@ -34,6 +34,10 @@ import org.apache.lucene.analysis.tokena
  * not be stemmed at all. The stemmer used can be changed at runtime after the
  * filter object is created (as long as it is a {@link DutchStemmer}).
  * </p>
+ * NOTE: This stemmer does not implement the Snowball algorithm correctly,
+ * specifically for doubled consonants. Consider using the "Dutch" stemmer
+ * in the snowball package instead. This stemmer will likely be deprecated
+ * in a future release.
  */
 public final class DutchStemFilter extends TokenFilter {
   /**