Posted to java-user@lucene.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2003/10/09 16:17:57 UTC
GermanAnalyzer.java GermanStemmer.java
Moving to lucene-user list.
If not the author, maybe some users of this code can tell us how this
uppercase/lowercase business should work.
And the issue even includes patches. I don't use the German* stuff, so
I'm afraid of applying it and breaking things for people who do use
German* classes as they are currently.
Otis
--- Erik Hatcher <er...@ehatchersolutions.com> wrote:
> It seems to be the issue mentioned here as well:
>
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18410
>
>
> On Wednesday, October 8, 2003, at 09:41 PM, Otis Gospodnetic wrote:
> > Answer to question comment: possibly because nouns start with a
> > capital letter in German, so lowercasing may not be the right thing
> > to do.
> > This is a bit of a guess. Maybe the author will enlighten us. :)
> >
> > Otis
> >
> > --- ehatcher@apache.org wrote:
> >> ehatcher 2003/10/08 17:08:52
> >> Revision Changes Path
> >> 1.7 +3 -2
> >> jakarta-lucene/src/java/org/apache/lucene/analysis/de/GermanAnalyzer.java
> >>
> >> Index: GermanAnalyzer.java
> >>
> >> ===================================================================
> >> RCS file: /home/cvs/jakarta-lucene/src/java/org/apache/lucene/analysis/de/GermanAnalyzer.java,v
> >> retrieving revision 1.6
> >> retrieving revision 1.7
> >> diff -u -r1.6 -r1.7
> >> --- GermanAnalyzer.java 29 Jan 2003 17:18:53 -0000 1.6
> >> +++ GermanAnalyzer.java 9 Oct 2003 00:08:52 -0000 1.7
> >> @@ -169,7 +169,8 @@
> >> {
> >> TokenStream result = new StandardTokenizer( reader );
> >> result = new StandardFilter( result );
> >> - result = new StopFilter( result, stoptable );
> >> + // shouldn't there be a lowercaser before stop word filtering?
> >> + result = new StopFilter( result, stoptable );
> >> result = new GermanStemFilter( result, excltable );
> >> return result;
> >> }
> >>
> >>
> >>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: GermanAnalyzer.java GermanStemmer.java
Posted by Gerhard Schwarz <ge...@fpg.de>.
Hello,
Otis Gospodnetic wrote:
> If not the author, maybe some users of this code can tell us how this
> uppercase/lowercase business should work.
A LowerCaseFilter before the StopFilter would be fine. The only reason not to
do so was a semantic issue, but that issue is not important for use with
Lucene. The stemmer was written for an application I use; for use with
Lucene (for ordinary full-text search in general) it can be simpler.
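
To illustrate why the order matters, here is a minimal plain-Java sketch. It
does not use the Lucene classes; the class name StopFilterOrderDemo, the
method names, and the tiny stop list are all hypothetical, and the stop list
is assumed to be stored in lowercase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopFilterOrderDemo {
    // Hypothetical stop list, assumed to contain lowercase entries only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("der", "die", "das", "und"));

    // Stop filtering without lowercasing: capitalized stop words slip through.
    static List<String> stopOnly(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // Lowercase each token first, then apply the stop list.
    static List<String> lowercaseThenStop(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.GERMAN);
            if (!STOP_WORDS.contains(lower)) {
                out.add(lower);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Der", "Hund", "und", "die", "Katze");
        System.out.println(stopOnly(tokens));          // [Der, Hund, Katze]
        System.out.println(lowercaseThenStop(tokens)); // [hund, katze]
    }
}
```

Without lowercasing, a sentence-initial "Der" does not match the stop word
"der" and reaches the stemmer. Lowercasing first removes it, at the cost of
losing the capitalization that marks nouns in German.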
> And the issue even includes patches. I don't use the German* stuff, so
> I'm afraid of applying it and breaking things for people who do use
> German* classes as they are currently.
After months of work I now have spare time.
The patch I promised months ago should arrive by next week. (No promise this
time, or tomorrow something bad will happen in my office...)
I hope that the updated stemmer will fully work with existing indices.
Gerhard (back on Java development)