Posted to solr-user@lucene.apache.org by Aaron Daubman <da...@gmail.com> on 2014/07/02 05:14:29 UTC
Understanding fieldNorm differences between 3.6.1 and 4.9 solrs
While trying to track down some subtle scoring differences (which
occasionally cause significant ordering differences) among search results, I
wrote a parser to normalize debug.explain.structured JSON output.
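For readers who want to reproduce this kind of comparison, here is a minimal sketch of a tree diff over two normalized explain outputs. It is not the parser mentioned above: the function name leaf_diffs is hypothetical, the "_value" / "description" / "details" keys are taken from the snippet in [1], and it assumes both trees have the same shape so that children can be paired by position.

```python
def leaf_diffs(a, b, path=()):
    """Yield (path, a_value, b_value) for leaf nodes whose values differ.

    Assumes both explain trees have the same shape, so children are
    paired positionally. Keys follow the normalized snippet in [1].
    """
    here = path + (a.get("description", ""),)
    kids_a = a.get("details", [])
    kids_b = b.get("details", [])
    if not kids_a and not kids_b:
        # A leaf: report it only if the two versions disagree.
        if a.get("_value") != b.get("_value"):
            yield here, a.get("_value"), b.get("_value")
        return
    for kid_a, kid_b in zip(kids_a, kids_b):
        yield from leaf_diffs(kid_a, kid_b, here)


# The fieldWeight subtree from [1], once per Solr version.
old = {"_value": 1.0235271, "description": "fieldWeight", "details": [
    {"_value": 2.236068, "description": "tf(freq=5)"},
    {"_value": 4.185008, "description": "idf(137871)"},
    {"_value": 0.109375, "description": "fieldNorm"},
]}
new = {"_value": 1.1697453, "description": "fieldWeight", "details": [
    {"_value": 2.236068, "description": "tf(freq=5)"},
    {"_value": 4.185008, "description": "idf(137871)"},
    {"_value": 0.125, "description": "fieldNorm"},
]}

diffs = list(leaf_diffs(old, new))
for path, old_value, new_value in diffs:
    print(" > ".join(path), old_value, new_value)
```

Reporting only differing leaves keeps the output focused on the root cause (here fieldNorm) rather than on every parent value that changes as a consequence.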
Every score that differs appears to come down to a difference in
fieldNorm: the 3.6.1 Solr is using 0.109375 as the fieldNorm, and
the 4.9 Solr is using 0.125. [1]
What would cause the two versions to use different field norms, and
only rarely (the majority of scores are identical, as desired)?
Thanks,
Aaron
[1] Here's a snippet of the diff (of the output from my
debug.explain.structured normalizer) for one such difference (apologies for
the width):

  3.6.1 Solr                                                4.9 Solr
  "06808040cd523a296abaf26025148c85": {                     "06808040cd523a296abaf26025148c85": {
*   "_value": 0.83961660000000005,                        |   "_value": 0.85474813000000005, *
    "description": "product of:",                             "description": "product of:",
    "details": [                                              "details": [
      {                                                         {
*       "_value": 2.623802,                               |       "_value": 2.6710880000000001, *
        "description": "sum of:",                                 "description": "sum of:",
        "details": [                                              "details": [
          {                                                         {
*           "_value": 0.064461969999999993,               |           "_value": 0.073670830000000007, *
            "description": "weight(t_style:alternative                "description": "weight(t_style:alternative
            "details": [                                              "details": [
              {                                                         {
                "_value": 0.062980229999999998,                           "_value": 0.062980229999999998,
                "description": "queryWeight",                             "description": "queryWeight",
                "details": [                                              "details": [
                  {                                                         {
                    "_value": 4.1850079999999998,                             "_value": 4.1850079999999998,
                    "description": "idf(137871)"                              "description": "idf(137871)"
                  }                                                         }
                ]                                                         ]
              },                                                        },
              {                                                         {
*               "_value": 1.0235270999999999,             |               "_value": 1.1697453, *
                "description": "fieldWeight",                             "description": "fieldWeight",
                "details": [                                              "details": [
                  {                                                         {
                    "_value": 2.2360679999999999,                             "_value": 2.2360679999999999,
                    "description": "tf(freq=5)"                               "description": "tf(freq=5)"
                  },                                                        },
                  {                                                         {
                    "_value": 4.1850079999999998,                             "_value": 4.1850079999999998,
                    "description": "idf(137871)"                              "description": "idf(137871)"
                  },                                                        },
                  {                                                         {
*                   "_value": 0.109375,                   |                   "_value": 0.125, *
*                   "description": "fieldNorm"            |                   "description": "fieldNorm" *
                  }                                                         }
                ]                                                         ]
              }                                                         }
            ]                                                         ]
          },                                                        },
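
As a sanity check on the snippet above, the following sketch recomputes the two differing values from the leaves of the explain tree. It uses only numbers printed in the diff and the relations the tree itself shows (fieldWeight as the product of tf, idf and fieldNorm; the term weight as the product of queryWeight and fieldWeight). The function names are illustrative only.

```python
import math

# Leaf values shared by both versions, as printed in the diff.
TF = 2.236068               # tf(freq=5)
IDF = 4.185008              # idf(137871)
QUERY_WEIGHT = 0.06298023   # queryWeight


def field_weight(field_norm):
    """fieldWeight = tf * idf * fieldNorm, per the explain tree."""
    return TF * IDF * field_norm


def term_weight(field_norm):
    """weight = queryWeight * fieldWeight, per the explain tree."""
    return QUERY_WEIGHT * field_weight(field_norm)


# (version, fieldNorm, reported fieldWeight, reported weight)
reported = [
    ("3.6.1", 0.109375, 1.0235271, 0.06446197),
    ("4.9", 0.125, 1.1697453, 0.07367083),
]

for version, norm, fw, w in reported:
    fw_ok = math.isclose(field_weight(norm), fw, rel_tol=1e-5)
    w_ok = math.isclose(term_weight(norm), w, rel_tol=1e-5)
    print(version, field_weight(norm), term_weight(norm), fw_ok, w_ok)
```

Both reported pairs are reproduced to within rounding, so for this term the fieldNorm change (a factor of 0.125 / 0.109375 = 8/7) accounts for the whole difference in fieldWeight and weight.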
Re: Understanding fieldNorm differences between 3.6.1 and 4.9 solrs
Posted by Aaron Daubman <da...@gmail.com>.
Wow - so apparently I have terrible recall and should re-read the thread I
started on the same topic almost two years ago, when I upgraded from 1.4 to
3.6 and hit a very similar fieldNorm issue! =)
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201207.mbox/%3CCALyTvnpwZMj4zxPbK0abVpnyRJny=QAuiJdqmj7E3ZgNv7Utpg@mail.gmail.com%3E
In the meantime, I'm still happy to hear any new thoughts / suggestions on
keeping similarity consistent across upgrades.
Thanks again,
Aaron
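
On the two fieldNorm values reported in this thread: the sketch below illustrates how a length norm can land on 0.109375 or 0.125. It rests on assumptions the thread does not confirm: that the field uses a DefaultSimilarity-style length norm of boost / sqrt(number of terms), and that the norm is stored in a single byte. The encode and decode functions are a Python re-implementation, written from memory, of what Lucene's SmallFloat.floatToByte315 and byte315ToFloat do; the Python names are mine, not Lucene's.

```python
import math
import struct


def float_to_byte315(f):
    """Lossy one-byte encoding (assumed to mirror SmallFloat.floatToByte315)."""
    # Reinterpret the float32 bit pattern as a signed 32-bit int.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21
    fzero = (63 - 15) << 3
    if small <= fzero:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> 1
    if small >= fzero + 0x100:
        return 255                     # overflow -> largest value
    return small - fzero


def byte315_to_float(b):
    """Decode a byte produced by float_to_byte315."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def stored_norm(num_terms, boost=1.0):
    """Norm as it would be read back after the one-byte round trip.

    Assumes lengthNorm = boost / sqrt(num_terms); index-time boosts and
    custom similarities are ignored.
    """
    return byte315_to_float(float_to_byte315(boost / math.sqrt(num_terms)))


for n in (40, 41, 64, 65, 83, 84):
    print(n, stored_norm(n))
```

Under these assumptions, field lengths of 41 to 64 terms decode to 0.125 and lengths of 65 to 83 terms decode to 0.109375, and the two values are adjacent steps of the encoding. One thing worth checking, then, is whether the two versions count a slightly different number of terms for the affected documents (for example through analysis-chain differences, or through overlapping tokens being counted differently). That would only change the stored norm for documents near a step boundary, which would fit the observation that most scores are identical. This is a hypothesis to test, not a diagnosis.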