Posted to solr-user@lucene.apache.org by roySolr <ro...@gmail.com> on 2011/05/03 13:48:20 UTC
Dismax scoring multiple fields TIE
Hello,
I have a question about scoring when I use the dismax handler. Here are
some examples:
   name                      category    related category
1. Chelsea best club ever    Chelsea     Sport
2. Chelsea                   Chelsea     Sport
When I search for "Chelsea" I want a higher score for number 2, because I
think it is a better match on field length.
I use dismax and both records have the same score. I see some difference
in fieldNorm, but the score is still the same. How can I fix this?
My config:
<requestHandler name="dismax" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">
      name category related_category
    </str>
  </lst>
  <str name="tie">1.0</str>
</requestHandler>
SCORE 1:
0.75269306 = (MATCH) sum of:
  0.75269306 = (MATCH) max of:
    0.75269306 = (MATCH) weight(category:chelsea in 680), product of:
      0.3085193 = queryWeight(category:chelsea), product of:
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        0.12645814 = queryNorm
      2.4396951 = (MATCH) fieldWeight(category:chelsea in 680), product of:
        1.0 = tf(termFreq(category:chelsea)=1)
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        1.0 = fieldNorm(field=category, doc=680)
    0.37634653 = (MATCH) weight(name:chelsea in 680), product of:
      0.3085193 = queryWeight(name:chelsea), product of:
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        0.12645814 = queryNorm
      1.2198476 = (MATCH) fieldWeight(name:chelsea in 680), product of:
        1.0 = tf(termFreq(name:chelsea)=1)
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        0.5 = fieldNorm(field=name, doc=680)
SCORE 2:
0.75269306 = (MATCH) sum of:
  0.75269306 = (MATCH) max of:
    0.75269306 = (MATCH) weight(category:chelsea in 678), product of:
      0.3085193 = queryWeight(category:chelsea), product of:
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        0.12645814 = queryNorm
      2.4396951 = (MATCH) fieldWeight(category:chelsea in 678), product of:
        1.0 = tf(termFreq(category:chelsea)=1)
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        1.0 = fieldNorm(field=category, doc=678)
    0.75269306 = (MATCH) weight(name:chelsea in 678), product of:
      0.3085193 = queryWeight(name:chelsea), product of:
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        0.12645814 = queryNorm
      2.4396951 = (MATCH) fieldWeight(name:chelsea in 678), product of:
        1.0 = tf(termFreq(name:chelsea)=1)
        2.4396951 = idf(docFreq=236, maxDocs=1000)
        1.0 = fieldNorm(field=name, doc=678)
--
View this message in context: http://lucene.472066.n3.nabble.com/Dismax-scoring-multiple-fields-TIE-tp2893923p2893923.html
Sent from the Solr - User mailing list archive at Nabble.com.
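The arithmetic in the two explain outputs above can be checked by hand. The sketch below is a minimal Python reconstruction, assuming the formulas visible in the output (queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm) and the usual dismax combination of the best field score plus tie times the sum of the other field scores. The function names are hypothetical and the numbers are copied from the explain output. Both explanations say "max of" and document 1 scores exactly its best field, which suggests the effective tie was 0 rather than the configured 1.0. Why the configured value did not take effect is not established in this thread; one unconfirmed possibility is that the parameter is declared outside the "defaults" list.

```python
# Values copied from the explain output in the original post.
IDF = 2.4396951
QUERY_NORM = 0.12645814


def term_weight(idf, query_norm, tf, field_norm):
    """Score of one term in one field: queryWeight * fieldWeight."""
    query_weight = idf * query_norm
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


def dismax_score(field_scores, tie=0.0):
    """Best field score plus tie times the sum of the remaining field scores."""
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


# Per-field scores as [category, name]; only fieldNorm differs between documents.
doc1 = [term_weight(IDF, QUERY_NORM, 1.0, 1.0),   # category, fieldNorm 1.0
        term_weight(IDF, QUERY_NORM, 1.0, 0.5)]   # name, fieldNorm 0.5
doc2 = [term_weight(IDF, QUERY_NORM, 1.0, 1.0),   # category, fieldNorm 1.0
        term_weight(IDF, QUERY_NORM, 1.0, 1.0)]   # name, fieldNorm 1.0

if __name__ == "__main__":
    for tie in (0.0, 1.0):
        print(tie, dismax_score(doc1, tie), dismax_score(doc2, tie))
```

With tie=0.0 both documents come out at roughly 0.7527, matching the reported scores. With tie=1.0 document 2 would score roughly 1.505 against roughly 1.129 for document 1, which is the ordering the original poster wants.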
Re: Dismax scoring multiple fields TIE
Posted by roySolr <ro...@gmail.com>.
No, but I think the difference in field length is large and the score is
still the same.
Same score for these results (q=chelsea):
1. Chelsea is a very very big club in london, england    Chelsea    Sport
2. Chelsea                                                Chelsea    Sport
Re: Dismax scoring multiple fields TIE
Posted by Erick Erickson <er...@gmail.com>.
I'm not sure you can. Very short fields aren't differentiated on the basis
of field length because of rounding errors. Here's a cut-and-paste from Jay Hill:
********************************
So the values are not pre-set for the lengthNorm, but for some counts the
fieldLength value winds up being the same because of the precision loss. Here
is a list of lengthNorm values for fields of 1 to 10 terms:
# of terms lengthNorm
1 1.0
2 .625
3 .5
4 .5
5 .4375
6 .375
7 .375
8 .3125
9 .3125
10 .3125
****************
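The repeated values in the table come from storing the norm in a single byte. The sketch below is an approximation in Python, assuming the default lengthNorm of 1/sqrt(number of terms) with no index-time boost, and a one-byte encoding modelled on Lucene's SmallFloat "3.15" format. The function names are hypothetical and this is not the library's own code. Under those assumptions it reproduces the ten values listed above.

```python
import math
import struct

# Assumption: zero point of the one-byte float format, modelled on
# Lucene's SmallFloat with 3 mantissa bits and zero exponent 15.
FZERO = (63 - 15) << 3


def float_to_byte315(value):
    """Encode a float as one byte by keeping only its top bits."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]  # float32 bit pattern
    small = bits >> 21
    if small <= FZERO:
        return 0 if bits <= 0 else 1  # underflow: zero or smallest positive
    if small >= FZERO + 0x100:
        return 255                    # overflow: largest representable value
    return small - FZERO


def byte315_to_float(encoded):
    """Decode the byte back to a float; the dropped bits are lost."""
    if encoded == 0:
        return 0.0
    bits = ((encoded & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def quantized_length_norm(num_terms):
    """lengthNorm = 1/sqrt(num_terms), after the round trip through one byte."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, quantized_length_norm(n))
```

Because so few bits of the fraction survive the encoding, neighbouring field lengths such as 3 and 4 terms, or 8, 9 and 10 terms, collapse onto the same stored norm.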
I'd ask, though: if this behavior is "good enough", are your users well
served by your spending time on this case?
Best
Erick
Re: Dismax scoring multiple fields TIE
Posted by elisabeth benoit <el...@gmail.com>.
For category:chelsea you have fieldNorm=1.0, so your category field must
have a type with omitNorms=true. If you don't have omitNorms=true, then
the shorter field will score higher.
I'm new to Solr, but from what I've experienced, this is the cause.
Regards,
Elisabeth