Posted to java-user@lucene.apache.org by Bizu de Anúncio <at...@bizudeanuncio.com> on 2001/12/03 13:21:35 UTC

Filter and stop-words

	I'm new to Lucene. First of all, I would like to know if there is a
searchable archive of this list, like the one for the Sun servlets list.

	My first problem is that I want to index a Portuguese database, and I need
to remove the plural "s" and the accents (à é ...) from the words. Is there a
way of passing a filter class to the Lucene indexer? And about the
stop-words, where should I configure Lucene to ignore them?

	Any help would be appreciated,

	thanks a lot,

		jk


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


RE: Filter and stop-words

Posted by Karl Øie <ka...@gan.no>.
To remove plural forms you have to create a stemmer for your language. I have
been working on a Norwegian stemmer for Lucene, and to get a head start I
ported the Norwegian Snowball stemmer. There is one for Portuguese as well,
check it out!

http://snowball.sourceforge.net/portuguese/stemmer.html
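
For the accents and stop-words part of the question, here is a rough sketch in
plain Java of what such a filter chain does to the text before it is indexed.
It does not use the Lucene API at all: the class name PortugueseTokenSketch,
the tiny stop-word list, and the "drop a trailing s" rule are all made up for
illustration. The plural rule in particular is deliberately naive (it would
mangle a word like "lápis"), which is exactly why a real stemmer such as the
Snowball one above is the better choice.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene: shows the three steps asked about.
final class PortugueseTokenSketch {

    // Illustrative stop-word list only; a real list would be much longer.
    // Entries are stored without accents because matching happens after folding.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "o", "as", "os", "de", "e", "um", "uma"));

    // Decompose accented characters (NFD), then drop the combining marks.
    static String stripAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // Naive plural removal (an assumption, not a real stemmer):
    // drop one trailing "s" from words longer than three letters.
    static String stripPlural(String word) {
        return (word.length() > 3 && word.endsWith("s"))
                ? word.substring(0, word.length() - 1)
                : word;
    }

    // Lowercase, fold accents, split into tokens, skip stop words, strip plurals.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        String folded = stripAccents(text.toLowerCase(Locale.ROOT));
        for (String token : folded.split("[^a-z0-9]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            tokens.add(stripPlural(token));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [anuncio, carro, venda]
        System.out.println(analyze("Os anúncios de carros à venda"));
    }
}
```

The same normalization has to be applied to the query text at search time,
otherwise a search for "anúncios" will not match the indexed term "anuncio".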

Best regards,
Karl Øie


