Posted to user@nutch.apache.org by Aïcha <ai...@yahoo.com> on 2007/01/29 09:15:12 UTC

Re: exact matches and stemming

Thank you for your reply Alvaro,

I am trying to index both the stemmed word and the original. I hope it works.

Thanks,
Aïcha


----- Original Message ----
From: Alvaro Cabrerizo <to...@gmail.com>
To: nutch-user@lucene.apache.org
Sent: Friday, 26 January 2007, 09:10:06
Subject: Re: exact matches and stemming


Maybe you could store both the stemmed word and the original one in your
index, although this will increase the size of your index.
Another possibility could be to develop a WildcardQuery plugin or a
FuzzyQuery plugin, because Lucene comes with these capabilities, and avoid
the stemming task. But wildcard and fuzzy queries are known to perform poorly.
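
The "store both forms" idea can be sketched with a toy in-memory index. This is only an illustration under stated assumptions: the class DualFieldIndex and its toyStem method are hypothetical names invented here, the suffix stripping is a crude stand-in for a real French stemmer, and none of it is Lucene or Nutch API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: every token is posted twice, once as written
// and once stemmed, so both exact and stemmed lookups are possible.
class DualFieldIndex {
    // term -> ids of the documents containing it, one map per "field"
    private final Map<String, Set<Integer>> exact = new HashMap<>();
    private final Map<String, Set<Integer>> stemmed = new HashMap<>();

    // Assumption: a crude stand-in for a real stemmer. It strips one of
    // a few common endings and leaves very short words untouched.
    static String toyStem(String term) {
        for (String suffix : new String[] {"es", "s", "e"}) {
            if (term.length() > suffix.length() + 2 && term.endsWith(suffix)) {
                return term.substring(0, term.length() - suffix.length());
            }
        }
        return term;
    }

    // Index one document under both the exact and the stemmed field.
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            exact.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            stemmed.computeIfAbsent(toyStem(token), k -> new TreeSet<>()).add(docId);
        }
    }

    // Exact search: the query term is lowercased but not stemmed.
    Set<Integer> searchExact(String term) {
        return exact.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    // Stemmed search: the query term goes through the same stemmer as the index.
    Set<Integer> searchStemmed(String term) {
        return stemmed.getOrDefault(toyStem(term.toLowerCase(Locale.ROOT)), Collections.emptySet());
    }
}
```

With documents 1 ("les petites maisons") and 2 ("une petite maison") indexed, an exact search for "maison" matches only document 2, while a stemmed search matches both. In Lucene the same layout would presumably use two fields analyzed differently (for example through PerFieldAnalyzerWrapper), at the cost of the larger index mentioned above.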

Hope it helps.

2007/1/24, Aïcha <ai...@yahoo.com>:
>
> Hello,
>
> I want to use the FrenchAnalyzer for stop word removal and stemming,
> but I still want to be able to do exact searches. The problem is that the
> FrenchAnalyzer removes characters from the terms at indexing time, so
> it isn't possible to get only exact matches from an index built with the
> FrenchAnalyzer.
>
> Could someone help me?
> Thanks in advance,
> Aïcha
>


	

	
		

Wildcards

Posted by Michael Levy <Lu...@gmail.com>.
I would very much like to be able to use wildcards in Nutch for some 
intranet applications I'm working on.  I've read some postings that 
don't seem very promising for me.  Any news or tips?