Posted to java-user@lucene.apache.org by Bizu de Anúncio <at...@bizudeanuncio.com> on 2001/12/03 13:21:35 UTC
Filter and stop-words
I'm new to Lucene. First of all, I would like to know if there is a
searchable archive, like the one for the "sun servlets list".
My first problem is that I want to index a Portuguese database and I need
to remove the plural "s" and the accents (à é ...) from the words. Is there a
way of passing a filter class to the Lucene indexer? And about the
stop-words, where should I configure Lucene to ignore them?
Any help would be appreciated,
thanks a lot,
jk
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>
RE: Filter and stop-words
Posted by Karl Øie <ka...@gan.no>.
To remove plural forms you have to create a stemmer for your language. I have
been working on a Norwegian stemmer for Lucene; to get a head start, I
ported the Norwegian Snowball stemmer. There is one for Portuguese as well,
check it out!
http://snowball.sourceforge.net/portuguese/stemmer.html
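To make the three steps from the question concrete (accent removal,
stop-word removal, plural stripping), here is a rough plain-Java sketch.
It does not use any Lucene classes; in Lucene this kind of logic would
normally sit in a TokenFilter chained inside a custom Analyzer, and the
exact API depends on the Lucene version. The class name, the tiny stop-word
list and the naive "drop a trailing s" rule are all assumptions made up for
illustration; a real stemmer such as the Snowball one linked above handles
Portuguese plurals far better.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class, not part of Lucene.
public class PortugueseTermSketch {

    // Illustrative stop-word list only (accents already stripped);
    // a real list would be much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "o", "as", "os", "de", "da", "do", "e", "em", "um", "uma", "para"));

    // Decompose accented letters (NFD) and drop the combining marks,
    // so that "à" becomes "a" and "é" becomes "e".
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Naive plural handling: drop one trailing "s" from terms longer than
    // three characters. This is an assumption, not a real stemming rule.
    static String stripPluralS(String term) {
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    // Lowercase, strip accents, split on non-alphanumerics,
    // skip stop-words, then strip the plural "s".
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        String normalized = stripAccents(text.toLowerCase(Locale.ROOT));
        for (String token : normalized.split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            terms.add(stripPluralS(token));
        }
        return terms;
    }

    public static void main(String[] args) {
        // The text is "Os anúncios de carros à venda", written with
        // Unicode escapes so the source file stays pure ASCII.
        System.out.println(analyze("Os an\u00fancios de carros \u00e0 venda"));
        // prints [anuncio, carro, venda]
    }
}
```

The same normalization has to be applied to the query text at search time,
otherwise a search for "anúncios" will not match the indexed term "anuncio".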
Regards, Karl Øie