You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Karthik N S <ka...@controlnet.co.in> on 2004/12/23 07:17:48 UTC

SNOWBALL STEMMER + BOOSTING


Hi Guys

Apologies..........

Using Analysis Paralysis on SnowBall Stemmer [ using StandardAnalyzer.
ENGLISH_STOP_WORDS
and StopAnalyzer.ENGLISH_STOP_WORDS ] from

http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html?page=last#thre
ad

for the word   'jakarta^4 apache'

both the cases return me something like this
=========================================
org.apache.lucene.analysis.snowball.SnowballAnalyzer:

[JAKARTHA] [4] [APACHE]

=========================================


I wonder what happened to the BOOSTING SYMBOL '^' and if the same word
is used on QueryParser.parse(), What would be the Hit's returned???

Thx in advance

WITH WARM REGARDS
HAVE A NICE DAY
[ N.S.KARTHIK]



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: SNOWBALL STEMMER + BOOSTING

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Dec 23, 2004, at 1:17 AM, Karthik N S wrote:
> Using Analysis Paralysis on SnowBall Stemmer [ using StandardAnalyzer.
> ENGLISH_STOP_WORDS
> and StopAnalyzer.ENGLISH_STOP_WORDS ] from
>
> http://today.java.net/pub/a/today/2003/07/30/LuceneIntro.html? 
> page=last#thre
> ad
>
> for the word   'jakarta^4 apache'
>
> both the cases return me something like this
> =========================================
> org.apache.lucene.analysis.snowball.SnowballAnalyzer:
>
> [JAKARTHA] [4] [APACHE]
>
> =========================================
>
>
> I wonder what happened to the BOOSTING SYMBOL '^' and if the same word
> is used on QueryParser.parse()

Analyzing a query expression outside of QueryParser is _not_ doing the  
same thing that QueryParser does.  QueryParser picks out the pieces it  
knows about (parenthesis, boost symbol, AND, OR, etc, etc) and only  
analyzes term text.  In your example it would analyze "jakarta" and  
"apache" separately.

> , What would be the Hit's returned???

That all depends on what you indexed and what analyzer you used at  
index time :)

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org