You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2004/07/01 18:13:43 UTC
Re: question on setting boost factor
On Jun 22, 2004, at 7:30 AM, Anson Lau wrote:
> Hi guys,
>
> Lets say I want to search the term "hello world" over 3 fields with
> different boost:
>
> ((hello:field1 world:field1)^0.001 (hello:field2 world:field2)^100
> (hello:field3 world:field3)^20000))
>
> Note I've given field1 a really low boost, a heavy boost to field2 and
> a
> REALLY heavy boost to field3.
>
> What is happening to me is that a term that matches both field1 and
> field2,
> will have a higher score than a term that matches field3 only, even
> though
> field3's boost is WAY higher.
>
> Can I change this behaviour such that the match in field3 only will
> actually
> have a higher score because of the boost?
First step is to get familiar with the actual factors coming out in the
IndexSearcher.explain() output (just System.out.println the Explanation
object). The coord() factor -
<http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/
Similarity.html#coord(int,%20int)> - is what you'll want to tweak to
change how scores are affected when multiple terms match by creating
your own DefaultSimilarity sublass (and probably just returning 1.0).
Read the javadocs for Similarity to see how to hook in your own
implementation (see also section).
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: question on setting boost factor
Posted by Steven Rowe <sa...@syr.edu>.
Repaired URL (was extra space before "Similarity.html"):
<http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#coord(int,%20int)>
Corresponding Tiny URL:
<URL:http://tinyurl.com/3bo8y>
Erik Hatcher wrote:
> On Jun 22, 2004, at 7:30 AM, Anson Lau wrote:
>> Hi guys,
>>
>> Lets say I want to search the term "hello world" over 3 fields with
>> different boost:
>>
>> ((hello:field1 world:field1)^0.001 (hello:field2 world:field2)^100
>> (hello:field3 world:field3)^20000))
>>
>> Note I've given field1 a really low boost, a heavy boost to field2 and a
>> REALLY heavy boost to field3.
>>
>> What is happening to me is that a term that matches both field1 and
>> field2,
>> will have a higher score than a term that matches field3 only, even
>> though
>> field3's boost is WAY higher.
>>
>> Can I change this behaviour such that the match in field3 only will
>> actually
>> have a higher score because of the boost?
>
>
> First step is to get familiar with the actual factors coming out in the
> IndexSearcher.explain() output (just System.out.println the Explanation
> object). The coord() factor -
> <http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/
> Similarity.html#coord(int,%20int)> - is what you'll want to tweak to
> change how scores are affected when multiple terms match by creating
> your own DefaultSimilarity sublass (and probably just returning 1.0).
> Read the javadocs for Similarity to see how to hook in your own
> implementation (see also section).
>
> Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org