You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Thomas Klein <tk...@laposte.net> on 2006/11/18 10:19:48 UTC
stemmer
Hi there,
I'm fairly new to lucene, I just developped a multi threaded indexing
tcp server using lucene to hmmm, let me remember, index stuffs :)
I have to index not only english, but french and german, and, I don't
know, perhaps other languages in the future.
Did lucene use a default stemmer or do I have to stem the texts before
indexing ?
Does a multi-language stemmer exists ?
(sorry if the answers are in the documentation, I didn't manage to
fully read it)
Thanks in advance !
Thomas Klein.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: stemmer
Posted by Erick Erickson <er...@gmail.com>.
Thomas:
There are some rather extensive threads on this list about the "interesting"
issues that exist when indexing/searching other languages. I think you'd
find it worthwhile to search the list archive for foreign language or some
such...
The short answer as I remember is that there *is* a built-in stemmer, but
whether it does what you want when indexing multiple languages depends upon
what results you expect to get...and there's no clear answer that I
remember....
Erick
On 11/18/06, Thomas Klein <tk...@laposte.net> wrote:
>
> Hi there,
>
> I'm fairly new to lucene, I just developped a multi threaded indexing
> tcp server using lucene to hmmm, let me remember, index stuffs :)
>
> I have to index not only english, but french and german, and, I don't
> know, perhaps other languages in the future.
>
> Did lucene use a default stemmer or do I have to stem the texts before
> indexing ?
>
> Does a multi-language stemmer exists ?
>
> (sorry if the answers are in the documentation, I didn't manage to
> fully read it)
>
> Thanks in advance !
>
> Thomas Klein.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>