Posted to java-user@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2004/07/01 18:13:43 UTC

Re: question on setting boost factor

On Jun 22, 2004, at 7:30 AM, Anson Lau wrote:

> Hi guys,
>
> Let's say I want to search for the terms "hello world" over 3 fields with
> different boosts:
>
> ((field1:hello field1:world)^0.001 (field2:hello field2:world)^100
> (field3:hello field3:world)^20000)
>
> Note I've given field1 a really low boost, a heavy boost to field2, and
> a REALLY heavy boost to field3.
>
> What is happening is that a document that matches in both field1 and
> field2 gets a higher score than a document that matches in field3 only,
> even though field3's boost is WAY higher.
>
> Can I change this behaviour so that a match in field3 only will actually
> get the higher score because of its boost?

First step is to get familiar with the actual factors coming out in the  
IndexSearcher.explain() output (just System.out.println the Explanation  
object).  The coord() factor -  
<http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/ 
Similarity.html#coord(int,%20int)> - is what you'll want to tweak to
change how scores are affected when multiple terms match.  Create your
own DefaultSimilarity subclass that overrides coord() (probably just
returning 1.0).  Read the javadocs for Similarity to see how to hook in
your own implementation (see the "See Also" section there).
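
To make the effect of that override concrete, here is a minimal sketch that
deliberately does not depend on Lucene.  The class name CoordSketch and both
method names are made up for illustration.  defaultCoord() mirrors the
overlap / maxOverlap ratio that DefaultSimilarity.coord(int, int) returns by
default, and flatCoord() is what the body of an overriding coord() would
amount to if it just returned 1.0.  The other factors that explain() reports
(tf, idf, norms, boosts) are ignored here.

```java
// Stand-alone illustration only; no Lucene classes are used.
final class CoordSketch {

    // Default behaviour: the fraction of query clauses that the document matched.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // The suggested override: ignore how many clauses matched.
    static float flatCoord(int overlap, int maxOverlap) {
        return 1.0f;
    }

    public static void main(String[] args) {
        int maxOverlap = 3; // three top-level clauses, one per field
        for (int overlap = 1; overlap <= maxOverlap; overlap++) {
            System.out.println(overlap + " of " + maxOverlap + " clauses matched:"
                    + " default coord = " + defaultCoord(overlap, maxOverlap)
                    + ", flat coord = " + flatCoord(overlap, maxOverlap));
        }
    }
}
```

With the default, a document matching two of the three clauses gets twice the
multiplier of a document matching only one.  With the flat version that
advantage disappears, so the boosts and the other explain() factors decide the
ordering on their own.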

	Erik

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

Re: question on setting boost factor

Posted by Steven Rowe <sa...@syr.edu>.

Repaired URL (was extra space before "Similarity.html"):
<http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#coord(int,%20int)>

Corresponding Tiny URL:
<URL:http://tinyurl.com/3bo8y>

