Posted to java-user@lucene.apache.org by Boris Goldowsky <bo...@alum.mit.edu> on 2004/04/11 19:55:32 UTC
Stemming options
Has anyone on the list implemented a dictionary-based English stemmer
with Lucene, perhaps based on the freely available ispell dictionaries
or something similar? The Porter and Snowball stemmers have not
worked well for our application, but it is a bit daunting to develop
an alternative stemmer from scratch.
Alternatively, is there an algorithmic stemmer that anyone has used
which is a little less aggressive than the Porter algorithm? We've been
having problems with searches for "conversion" returning "converse" and
"conversational", and searches for "animal" returning "animate". Yes,
these are morphologically related, but in our particular application it
would be better to remove only simple inflections.
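To make "removing simple inflections" concrete, here is a rough sketch of
the kind of light stemmer I have in mind. It is not taken from Lucene or
from any existing library: the class name SimpleInflectionStemmer and its
stem method are hypothetical, and the rules loosely follow the
plural-stripping approach usually called the "S-stemmer". It handles only
plural endings (not -ed or -ing) and has no Lucene dependency, so wiring it
into a TokenFilter would be a separate step.

```java
// Hypothetical light stemmer: strips plural endings only.
// Rules are a loose adaptation of the "S-stemmer" idea, not a Lucene API.
final class SimpleInflectionStemmer {

    private SimpleInflectionStemmer() {}

    static String stem(String word) {
        String w = word.toLowerCase();

        // Leave very short words alone ("bus", "is", "as", ...).
        if (w.length() <= 3) {
            return w;
        }

        // Only the first rule whose suffix matches is considered.
        if (w.endsWith("ies")) {
            // "ponies" -> "pony", but leave "-eies" / "-aies" untouched.
            if (!w.endsWith("eies") && !w.endsWith("aies")) {
                return w.substring(0, w.length() - 3) + "y";
            }
            return w;
        } else if (w.endsWith("es")) {
            // "horses" -> "horse", but leave "-aes", "-ees", "-oes" untouched.
            if (!w.endsWith("aes") && !w.endsWith("ees") && !w.endsWith("oes")) {
                return w.substring(0, w.length() - 1);
            }
            return w;
        } else if (w.endsWith("s")) {
            // "animals" -> "animal", but leave "-us" and "-ss" untouched.
            if (!w.endsWith("us") && !w.endsWith("ss")) {
                return w.substring(0, w.length() - 1);
            }
            return w;
        }
        return w;
    }
}
```

With rules like these, "conversions" and "conversion" would share a stem,
while "converse", "conversational" and "animate" would be left as they are,
which is the behavior we are after.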
Thanks for any pointers --
Boris