Posted to solr-user@lucene.apache.org by Thomas Michael Engelke <th...@posteo.de> on 2017/06/30 14:48:51 UTC

Same score for different length matches

 Hey,

we have multiple documents that match the query in question
("name:hubwagen"). The thing is, some of the documents only contain the
query term as part of a longer name, while others match the "name" field 100%:

<result name="response" numFound="109" start="0" maxScore="5.9861565">
 <doc>
 <str name="name">Hochhubwagen</str>
 <float name="score">5.9861565</float></doc>
 <doc>
 <str name="name">Hubwagen</str>
 <float name="score">5.9861565</float></doc>
</result>

The debug looks like this (for the first and 5th match):

<lst name="debug">
 <lst name="queryBoosting">
 <str name="q">namhubwagnamehubwag</str>
 <null name="match"/>
 </lst>
 <str name="rawquerystring">name:Hubwagen</str>
 <str name="querystring">name:Hubwagen</str>
 <str name="parsedquery">name:hubwag</str>
 <str name="parsedquery_toString">name:hubwag</str>
 <lst name="explain">
 <str name="167">
5.9861565 = (MATCH) weight(name:hubwag in 8093) [DefaultSimilarity],
result of:
 5.9861565 = fieldWeight in 8093, product of:
 1.0 = tf(freq=1.0), with freq of:
 1.0 = termFreq=1.0
 5.9861565 = idf(docFreq=109, maxDocs=16101)
 1.0 = fieldNorm(doc=8093)
</str>
 <str name="1740">
5.9861565 = (MATCH) weight(name:hubwag in 9537) [DefaultSimilarity],
result of:
 5.9861565 = fieldWeight in 9537, product of:
 1.0 = tf(freq=1.0), with freq of:
 1.0 = termFreq=1.0
 5.9861565 = idf(docFreq=109, maxDocs=16101)
 1.0 = fieldNorm(doc=9537)
</str>

Now, I am decently certain that at one point in time it worked in a way
that a higher match length would rank higher. As far as I can read in
the SolrRelevancyFAQ, the correct term is "lengthNorm". However, I am
missing a preference for the full match.

Usually, the debug helps me identify mistakes, but in this case, the
debug only tells me that the scores are perfectly equal, down to the
lowest level. 
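
For reference, the numbers in the explain output can be reproduced by hand.
The sketch below assumes the classic TF-IDF formulas used by
DefaultSimilarity (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)));
the function names are made up for illustration and are not Solr API.

```python
import math

def tf(freq):
    # Assumed DefaultSimilarity term-frequency factor: sqrt(freq)
    return math.sqrt(freq)

def idf(doc_freq, max_docs):
    # Assumed DefaultSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def field_weight(freq, doc_freq, max_docs, field_norm):
    # fieldWeight is the product of the three factors listed in the explain
    return tf(freq) * idf(doc_freq, max_docs) * field_norm

# Values copied from the explain entries for docs 8093 and 9537
score_8093 = field_weight(1.0, 109, 16101, 1.0)
score_9537 = field_weight(1.0, 109, 16101, 1.0)
print(score_8093, score_9537)
```

Both documents feed identical inputs into the product, so the only factor
that could tell them apart is fieldNorm, and that is 1.0 in both cases.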

Re: Same score for different length matches

Posted by "alessandro.benedetti" <a....@sease.io>.
In addition to what Chris has correctly suggested, I would like to focus on
this sentence:
"  I am decently certain that at one point in time it worked in a way 
that a higher match length would rank higher"

Do you mean that a match in a longer field would rank higher than a match
in a shorter field?
Is that what you want (because it is counterintuitive)?

Furthermore, I see that some stemming is applied at query time. Is that
what you want?




-----
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io

Re: Same score for different length matches

Posted by Chris Hostetter <ho...@fucit.org>.
: we have multiple documents that are matches for the query in question
: ("name:hubwagen"). Thing is, some of the documents only contain the
: query, while others match 100% in the "name" field:
	...
: 5.9861565 = (MATCH) weight(name:hubwag in 8093) [DefaultSimilarity],
: result of:
:  5.9861565 = fieldWeight in 8093, product of:
:  1.0 = tf(freq=1.0), with freq of:
:  1.0 = termFreq=1.0
:  5.9861565 = idf(docFreq=109, maxDocs=16101)
:  1.0 = fieldNorm(doc=8093)
	...
: 5.9861565 = (MATCH) weight(name:hubwag in 9537) [DefaultSimilarity],
: result of:
:  5.9861565 = fieldWeight in 9537, product of:
:  1.0 = tf(freq=1.0), with freq of:
:  1.0 = termFreq=1.0
:  5.9861565 = idf(docFreq=109, maxDocs=16101)
:  1.0 = fieldNorm(doc=9537)
	...
: that a higher match length would rank higher. As far as I can read in
: the SolrRelevancyFAQ, the correct term is "lengthNorm". However, I am
: missing a preference for the full match.

lengthNorm is a Similarity concept that rolls into the "fieldNorm" at 
index time.

According to your score explanations, the fieldNorm is 1.0 for both docs, 
suggesting that you have norms disabled -- see the omitNorms option on the 
fieldType for your "name" field.  
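
To illustrate what length normalization would do with norms enabled, here
is a rough sketch. It assumes DefaultSimilarity's lengthNorm of
1 / sqrt(number of terms in the field) with no index-time boost, and it
ignores the lossy one-byte encoding Lucene applies to stored norms. The
token counts are hypothetical, since the real count depends on the analysis
chain of the "name" field.

```python
import math

def length_norm(num_terms):
    # Assumed DefaultSimilarity lengthNorm, without index-time boost
    return 1.0 / math.sqrt(num_terms)

def idf(doc_freq, max_docs):
    # Assumed DefaultSimilarity inverse document frequency
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def score(freq, doc_freq, max_docs, num_terms):
    # tf * idf * fieldNorm, where fieldNorm here is just the lengthNorm
    return math.sqrt(freq) * idf(doc_freq, max_docs) * length_norm(num_terms)

# Hypothetical token counts: a one-term name vs. a four-term name
short_field = score(1.0, 109, 16101, num_terms=1)
long_field = score(1.0, 109, 16101, num_terms=4)
print(short_field, long_field)
```

Under these assumptions the match in the shorter field scores higher. Note
that a field with a single counted term gets a norm of 1.0 either way, and
that norms are written at index time, so changing omitNorms only takes
effect after reindexing.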


-Hoss
http://www.lucidworks.com/