Posted to java-user@lucene.apache.org by Boris Goldowsky <bo...@alum.mit.edu> on 2004/04/11 19:55:32 UTC

Stemming options

Has anyone on the list implemented a dictionary-based English stemmer
with Lucene, perhaps based on the freely available ispell dictionaries
or something similar?  The Porter and Snowball stemmers have not
worked well for our application, but it is a bit daunting to develop
an alternative stemmer from scratch.

Alternatively, is there an algorithmic stemmer that anyone has used
which is a little less aggressive than the Porter algorithm?  We've been
having problems with searches for "conversion" returning "converse" and
"conversational", and "animal" returning "animate".  Yes, these are
morphologically related, but in our particular application it would be
better to stick with removing simple inflections.
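
As a rough illustration of what "removing simple inflections" could
look like, the sketch below strips only plural endings, loosely
following the rule set usually described as the "S-stemmer".  The class
name SimpleInflectionStemmer is hypothetical and is not part of Lucene,
the exact exception lists are an assumption, and the choice to stop at
the first matching suffix is one reading of those rules.  It does not
touch "-ed" or "-ing" and will mis-stem some words (for example "boxes").

```java
import java.util.Locale;

// Hypothetical helper for illustration only; it is not part of Lucene.
final class SimpleInflectionStemmer {

    private SimpleInflectionStemmer() {}

    // Strips common English plural endings and leaves derivational
    // suffixes such as "-ion", "-al", or "-ate" untouched.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        int n = w.length();
        if (n <= 3) {
            return w; // assumption: very short words are left alone
        }
        if (w.endsWith("ies")) {
            // "ponies" -> "pony", but skip "-aies" and "-eies"
            if (w.endsWith("aies") || w.endsWith("eies")) {
                return w;
            }
            return w.substring(0, n - 3) + "y";
        }
        if (w.endsWith("es")) {
            // "horses" -> "horse", but skip "-aes", "-ees", "-oes"
            if (w.endsWith("aes") || w.endsWith("ees") || w.endsWith("oes")) {
                return w;
            }
            return w.substring(0, n - 1);
        }
        if (w.endsWith("s")) {
            // "animals" -> "animal", but skip "-us" and "-ss"
            if (w.endsWith("us") || w.endsWith("ss")) {
                return w;
            }
            return w.substring(0, n - 1);
        }
        return w;
    }

    public static void main(String[] args) {
        String[] words = {"conversions", "converse", "conversational", "animals", "animate"};
        for (String word : words) {
            System.out.println(word + " -> " + stem(word));
        }
    }
}
```

With rules like these, "conversions" and "conversion" share a stem while
"converse" and "conversational" stay distinct, and likewise "animals"
and "animal" versus "animate".  In Lucene this logic would presumably be
wrapped in a token filter applied at both index and query time.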

Thanks for any pointers --

Boris


