Posted to solr-user@lucene.apache.org by Pratik Patel <pr...@semandex.net> on 2018/08/23 15:53:02 UTC
Question on query time boosting
Hello All,
I am trying to understand exactly how query-time boosting works in Solr.
Primarily, I want to know whether the absolute boost values matter, or whether
only the relative differences between the boost values determine
scoring. Take the following two queries, for example.
// case1: q parameter
> concept_name:(*semantic*)^200 OR
> concept_name:(*machine*)^400 OR
> Abstract_note:(*semantic*)^20 OR
> Abstract_note:(*machine*)^40
// case2: q parameter
> concept_name:(*semantic*)^20 OR
> concept_name:(*machine*)^40 OR
> Abstract_note:(*semantic*)^2 OR
> Abstract_note:(*machine*)^4
Are these two queries any different?
The relative boosting is the same in both of them.
I can see that they produce the same results and ordering. The only difference is
that the score in case1 is 10 times the score in case2.
Thanks,
Pratik
Re: Question on query time boosting
Posted by Kydryavtsev Andrey <we...@yandex.ru>.
Hi Pratik,
I believe that your observations are correct.
The score for each individual query (in your example, wildcard queries such as 'concept_name:(*semantic*)^200') is calculated by a complex formula (one possible implementation, with a good explanation, is described at https://lucene.apache.org/core/7_4_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html), but it can be simplified as follows:
score(doc, query) = query_boost * <some other values depending on document and query>
By default, the score for the full disjunction is the sum of the scores of the individual queries that matched the document.
So the score for case1 would be:
score_for_case1(doc, query)
  = 200 * <some other values for query 1> + 400 * <some other values for query 2> + 20 * <some other values for query 3> + 40 * <some other values for query 4>
  = 10 * (20 * <some other values for query 1> + 40 * <some other values for query 2> + 2 * <some other values for query 3> + 4 * <some other values for query 4>)
  = 10 * score_for_case2(doc, query)
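The arithmetic above can be checked with a small sketch. It only models the simplified formula (score = boost * <other values>, summed over matching clauses) and does not call Solr or Lucene. The document names and per-clause base values are made up for illustration, and `disjunction_score` and `rank` are hypothetical helpers.

```python
# Simplified model: each clause contributes boost * base value,
# and the disjunction score is the sum over the clauses that matched.
# The base values below are invented; None means the clause did not match.

def disjunction_score(boosts, base_values):
    """Sum boost * base value over the clauses that matched the document."""
    return sum(b * v for b, v in zip(boosts, base_values) if v is not None)

# Boosts in clause order: concept_name:semantic, concept_name:machine,
# Abstract_note:semantic, Abstract_note:machine
case1_boosts = [200, 400, 20, 40]
case2_boosts = [20, 40, 2, 4]

# Hypothetical per-clause base values for three documents
docs = {
    "doc_a": [0.5, None, 0.25, 1.0],
    "doc_b": [None, 0.75, None, 0.5],
    "doc_c": [1.0, 0.5, None, None],
}

def rank(boosts):
    """Return (scores by document, document names ordered by descending score)."""
    scores = {name: disjunction_score(boosts, base) for name, base in docs.items()}
    ordering = sorted(scores, key=scores.get, reverse=True)
    return scores, ordering

scores1, order1 = rank(case1_boosts)
scores2, order2 = rank(case2_boosts)

print("case1:", scores1, order1)
print("case2:", scores2, order2)
print("same ordering:", order1 == order2)
```

In this model, multiplying every boost by the same positive constant multiplies every document's score by that constant, so the ordering cannot change. Only the ratios between the boosts affect the ranking.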
Thank you,
Andrey Kudryavtsev