Posted to java-user@lucene.apache.org by Shey Rab Pawo <pa...@gmail.com> on 2005/06/01 16:10:18 UTC
Re: Stemming at Query time
If your stemmer ran at indexing time, then won't the "breath" entry
automatically pick up all of these? If so, isn't query-time expansion
unnecessary?
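
To illustrate what I mean, here is a minimal sketch. It assumes a toy
suffix stripper (not the Porter algorithm and not Lucene's API; the class
and method names are made up for this example) that is applied both when
documents are indexed and when the query term is looked up:

```java
import java.util.*;

class IndexTimeStemmingSketch {
    // Toy suffix stripper for illustration only; a real stemmer is far more careful.
    static String stem(String word) {
        String w = word.toLowerCase();
        for (String suffix : new String[] {"ing", "es", "e", "s"}) {
            // Strip the first matching suffix, but keep at least three letters.
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Build a map from stemmed term to the ids of the documents containing it.
    static Map<String, Set<Integer>> index(List<String> docs) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : docs.get(id).split("\\W+")) {
                if (token.isEmpty()) continue;
                postings.computeIfAbsent(stem(token), k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    // The query term goes through the same stemmer before the lookup.
    static Set<Integer> search(Map<String, Set<Integer>> postings, String term) {
        return postings.getOrDefault(stem(term), Collections.emptySet());
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "She breathes slowly", "Breathing exercises",
            "A deep breath", "The car stopped");
        System.out.println(search(index(docs), "breath"));
    }
}
```

In this sketch a search for "breath" finds the documents containing
"breathes", "Breathing" and "breath" with a single term lookup, with no
expansion of the query.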
On 5/31/05, Daniel Naber <lu...@danielnaber.de> wrote:
> On Monday 30 May 2005 18:54, Andrew Boyd wrote:
>
> > Now that the QueryParser knows about position increments, has anyone
> > used this to do stemming at query time and not at indexing time? I
> > suppose one would need a reverse stemmer. Given the query "breath", it
> > would need to inject "breathe", "breathes", "breathing", etc.
>
> There are two things to consider: queries will get more complicated and
> thus slower, and the implementation isn't that easy. While stemming can be
> done with a simple algorithm (for English), you'll need a dictionary with
> at least part-of-speech information for adding suffixes. That's because
> you cannot just add "ing" to any word; otherwise you'd end up with car +
> ing = caring. (But once you have this dictionary, the quality of your
> solution can be better than that of a stemmer, as stemmers also suffer
> from over-stemming, i.e. mapping two unrelated words to the same form.)
>
> Regards
> Daniel
>
> --
> http://www.danielnaber.de
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
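
For comparison, the query-time expansion Daniel describes could be
sketched roughly as follows. The three-word lexicon, the suffix rules and
all names here are assumptions made up for illustration; a real solution
would need a full dictionary with part-of-speech information and proper
handling of irregular forms. Nothing below is Lucene API; the expanded
query is built as a plain OR string.

```java
import java.util.*;

class QueryExpansionSketch {
    enum Pos { NOUN, VERB }

    // Hypothetical hand-made lexicon: base form -> part of speech.
    static final Map<String, Pos> LEXICON = new HashMap<>();
    static {
        LEXICON.put("car", Pos.NOUN);
        LEXICON.put("breathe", Pos.VERB);
        LEXICON.put("walk", Pos.VERB);
    }

    // Return the term plus the inflected forms its part of speech allows.
    static List<String> expand(String term) {
        List<String> forms = new ArrayList<>();
        forms.add(term);
        Pos pos = LEXICON.get(term);
        if (pos == Pos.VERB) {
            // Drop a final "e" before "ing"/"ed" (breathe -> breathing).
            String base = term.endsWith("e")
                ? term.substring(0, term.length() - 1) : term;
            forms.add(term + "s");
            forms.add(base + "ing");
            forms.add(base + "ed");
        } else if (pos == Pos.NOUN) {
            // Nouns only get a plural, so "car" never becomes "caring".
            forms.add(term + "s");
        }
        // Words missing from the lexicon are left unexpanded.
        return forms;
    }

    // Join the expanded forms into one boolean OR query string.
    static String toQuery(String term) {
        return String.join(" OR ", expand(term));
    }

    public static void main(String[] args) {
        System.out.println(toQuery("breathe"));
        System.out.println(toQuery("car"));
    }
}
```

Even in this toy version, one query term turns into several OR clauses,
which is the extra query cost mentioned above.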
--
No one ever went blind looking at the bright side of life.