Posted to solr-user@lucene.apache.org by Pratik Patel <pr...@semandex.net> on 2018/08/23 15:53:02 UTC

Question on query time boosting

Hello All,

I am trying to understand exactly how query-time boosting works in Solr.
Primarily, I want to understand whether absolute boost values matter, or
whether only the relative difference between boost values determines
scoring. Take the following two queries, for example.

// case1: q parameter

> concept_name:(*semantic*)^200 OR
> concept_name:(*machine*)^400 OR
> Abstract_note:(*semantic*)^20 OR
> Abstract_note:(*machine*)^40


// case2: q parameter

> concept_name:(*semantic*)^20 OR
> concept_name:(*machine*)^40 OR
> Abstract_note:(*semantic*)^2 OR
> Abstract_note:(*machine*)^4


Are these two queries any different?

Relative boosting is the same in both of them.
I can see that they produce the same results in the same order. The only
difference is that the score in case1 is 10 times the score in case2.

Thanks,
Pratik

Re: Question on query time boosting

Posted by Kydryavtsev Andrey <we...@yandex.ru>.
Hi, Pratik

I believe that your observations are correct. 

The score for each individual query (in your example, a wildcard query such as 'concept_name:(*semantic*)^200') is calculated by a complex formula (one possible implementation, with a good explanation, is described here: https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html), but it can be simplified as follows:

score(doc, query) = query_boost * <some other values depending on document and query> 

The score for the full disjunction is (by default) calculated as the sum of the scores of the individual queries that matched.

So the score for case1 would be:

score_for_case1(doc, query)
  = 200 * <some other values for query 1>
  + 400 * <some other values for query 2>
  +  20 * <some other values for query 3>
  +  40 * <some other values for query 4>
  = 10 * (20 * <some other values for query 1>
        + 40 * <some other values for query 2>
        +  2 * <some other values for query 3>
        +  4 * <some other values for query 4>)
  = 10 * score_for_case2(doc, query)
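
To make the arithmetic concrete, here is a minimal Python sketch of the simplified model above. It does not call Solr or Lucene. The document names and the per-clause base values (standing in for "<some other values>") are made up for illustration; only the boost values come from the two queries in the question.

```python
# Simplified model: score(doc) = sum(boost_i * base_i) over the four clauses.
# Clause order: concept_name:semantic, concept_name:machine,
#               Abstract_note:semantic, Abstract_note:machine
# The base values below are hypothetical, not real Solr output.
DOCS = {
    "doc_a": [1.0, 0.0, 1.0, 1.0],
    "doc_b": [0.0, 1.0, 0.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0, 1.0],
}

CASE1_BOOSTS = [200, 400, 20, 40]
CASE2_BOOSTS = [20, 40, 2, 4]


def score(base_values, boosts):
    """Sum of boost * base value over all clauses (a non-matching clause has base 0)."""
    return sum(boost * base for boost, base in zip(boosts, base_values))


def rank(boosts):
    """Return (doc names ordered by descending score, dict of doc -> score)."""
    scores = {doc: score(base, boosts) for doc, base in DOCS.items()}
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores


order1, scores1 = rank(CASE1_BOOSTS)
order2, scores2 = rank(CASE2_BOOSTS)

print(order1, scores1)
print(order2, scores2)
```

Under this model, multiplying every boost by the same positive constant multiplies every document's score by that constant, so the ordering cannot change. Only the ratios between boosts affect ranking.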



Thank you,

Andrey Kudryavtsev
