Posted to solr-user@lucene.apache.org by Kerwin <ke...@gmail.com> on 2012/04/11 16:19:58 UTC

Boost differences in two environments for same query and config

Hi All,

I am running the following Solr query against two installations: one on
my local Windows machine and the other on a remote Unix machine.

RECORD_TYPE:info AND (NAME:ee123* OR CD:ee123^1000 OR CD:ee123*^100)

There are no differences in the DataImportHandler configuration,
schema, or solrconfig between the two installations.
The local installation gives the expected result, with scores that
reflect the boosts.

CORRECT/Expected:
Debug query output for local installation:

10.822258 = (MATCH) sum of:
	0.002170282 = (MATCH) weight(RECORD_TYPE:info in 35916), product of:
		3.65739E-4 = queryWeight(RECORD_TYPE:info), product of:
			5.933964 = idf(docFreq=58891, maxDocs=8181811)
			6.1634855E-5 = queryNorm
		5.933964 = (MATCH) fieldWeight(RECORD_TYPE:info in 35916), product of:
			1.0 = tf(termFreq(RECORD_TYPE:info)=1)
			5.933964 = idf(docFreq=58891, maxDocs=8181811)
			1.0 = fieldNorm(field=RECORD_TYPE, doc=35916)
	10.820087 = (MATCH) product of:
		16.230131 = (MATCH) sum of:
			16.223969 = (MATCH) weight(CD:ee123^1000.0 in 35916), product of:
				0.999981 = queryWeight(CD:ee123^1000.0), product of:
					1000.0 = boost
					16.224277 = idf(docFreq=1, maxDocs=8181811)
					6.1634855E-5 = queryNorm
				16.224277 = (MATCH) fieldWeight(CD:ee123 in 35916), product of:
					1.0 = tf(termFreq(CD:ee123)=1)
					16.224277 = idf(docFreq=1, maxDocs=8181811)
					1.0 = fieldNorm(field=CD, doc=35916)
				0.0061634853 = (MATCH)
ConstantScoreQuery(QueryWrapperFilter(CD:ee123 CD:ee123c CD:ee123c.
CD:ee123dc CD:ee123e CD:ee123e. CD:ee123en CD:ee123fx CD:ee123g
CD:ee123g.1 CD:ee123g1 CD:ee123ee123 CD:ee123l.1 CD:ee123l1 CD:ee123ll
CD:ee123lr CD:ee123m.z CD:ee123mg CD:ee123mz CD:ee123na CD:ee123nx
CD:ee123ol CD:ee123op CD:ee123p CD:ee123p.1 CD:ee123p1 CD:ee123pn
CD:ee123r.1 CD:ee123r1 CD:ee123s CD:ee123s.z CD:ee123sm CD:ee123sn
CD:ee123sp CD:ee123ss CD:ee123sz)), product of:
					100.0 = boost
					6.1634855E-5 = queryNorm
		0.6666667 = coord(2/3)

INCORRECT/Unexpected:
Debug query output for Unix installation (Remote):

9.950362E-4 = (MATCH) sum of:
	9.950362E-4 = (MATCH) weight(RECORD_TYPE:info in 35948), product of:
		9.950362E-4 = queryWeight(RECORD_TYPE:info), product of:
			1.0 = idf(docFreq=58891, maxDocs=8181811)
			9.950362E-4 = queryNorm
		1.0 = (MATCH) fieldWeight(RECORD_TYPE:info in 35948), product of:
			1.0 = tf(termFreq(RECORD_TYPE:info)=1)
			1.0 = idf(docFreq=58891, maxDocs=8181811)
			1.0 = fieldNorm(field=RECORD_TYPE, doc=35948)
	0.0 = (MATCH) product of:
		1.0945399 = (MATCH) sum of:
			0.99503624 = (MATCH) weight(CD:ee123^1000.0 in 35948), product of:
				0.99503624 = queryWeight(CD:ee123^1000.0), product of:
					1000.0 = boost
					1.0 = idf(docFreq=1, maxDocs=8181811)
					9.950362E-4 = queryNorm
				1.0 = (MATCH) fieldWeight(CD:ee123 in 35948), product of:
					1.0 = tf(termFreq(CD:ee123)=1)
					1.0 = idf(docFreq=1, maxDocs=8181811)
					1.0 = fieldNorm(field=CD, doc=35948)
				0.09950362 = (MATCH)
ConstantScoreQuery(QueryWrapperFilter(CD:ee123 CD:ee123c CD:ee123c.
CD:ee123dc CD:ee123e CD:ee123e. CD:ee123en CD:ee123fx CD:ee123g
CD:ee123g.1 CD:ee123g1 CD:ee123ee123 CD:ee123l.1 CD:ee123l1 CD:ee123ll
CD:ee123lr CD:ee123m.z CD:ee123mg CD:ee123mz CD:ee123na CD:ee123nx
CD:ee123ol CD:ee123op CD:ee123p CD:ee123p.1 CD:ee123p1 CD:ee123pn
CD:ee123r.1 CD:ee123r1 CD:ee123s CD:ee123s.z CD:ee123sm CD:ee123sn
CD:ee123sp CD:ee123ss CD:ee123sz)), product of:
					100.0 = boost
					9.950362E-4 = queryNorm
		0.0 = coord(2/3)


As the output above shows, the scoring differs between the two
installations. In the second output some of the evaluations also look
wrong: for example, 0.0 = coord(2/3), and the sums do not add up.
I am using Solr Implementation Version: 4.0-dev 985987M

Could you please let me know what the issue is, if there is one? What
should I check?
Appreciate your help.
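
The figures in the two explain outputs above can be checked by hand. What follows is only a sketch, assuming the classic Lucene DefaultSimilarity formulas (idf = 1 + ln(maxDocs / (docFreq + 1)) and queryNorm = 1 / sqrt(sum of squared clause weights)); the thread does not confirm which Similarity either installation actually uses, and the function names are illustrative only.

```python
import math

MAX_DOCS = 8181811  # maxDocs as reported in both explain outputs


def idf(doc_freq, max_docs=MAX_DOCS):
    # Classic DefaultSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def query_norm(idf_record_type, idf_cd):
    # 1 / sqrt(sum of squared clause weights).
    # Assumption: the two wildcard clauses (NAME:ee123* with boost 1 and
    # CD:ee123*^100) are constant-score and each contribute boost^2.
    sum_sq = (idf_record_type ** 2
              + (1000.0 * idf_cd) ** 2
              + 100.0 ** 2
              + 1.0 ** 2)
    return 1.0 / math.sqrt(sum_sq)


# Local installation: the explain output reports 5.933964 and 16.224277
local_idf_rt = idf(58891)
local_idf_cd = idf(1)
local_norm = query_norm(local_idf_rt, local_idf_cd)

# Remote installation: the explain output reports idf = 1.0 for both terms
remote_norm = query_norm(1.0, 1.0)

# Local total: RECORD_TYPE clause + coord(2/3) * inner sum
local_total = 0.002170282 + 16.230131 * (2.0 / 3.0)

print("local idf(RECORD_TYPE:info):", local_idf_rt)
print("local idf(CD:ee123):", local_idf_cd)
print("local queryNorm:", local_norm)
print("remote queryNorm with idf forced to 1.0:", remote_norm)
print("local total score:", local_total)
```

Under these assumptions the local figures reproduce, and the remote queryNorm is consistent with idf = 1.0 for every term. The remote values idf = 1.0 and coord(2/3) = 0.0 are not what these formulas give, which is the discrepancy described above; the sketch does not say why.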

Re: Boost differences in two environments for same query and config

Posted by Erick Erickson <er...@gmail.com>.
Well, the next thing I'd do is copy your entire <solr home>
directory to the remote machine and try that. If that gives
identical results on both, then try moving just your
<solr home>/data directory to the remote machine.

I suspect that you've done something different between the two
machines that's leading to this, but I haven't a clue what.

If you copy your entire Solr installation over and _still_ get
this kind of thing, we're into whether the JVM or operating system
is somehow changing things, which would surprise me a lot.

Best
Erick
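
Before copying whole directories around, it may help to see exactly which files differ between the two installations. A minimal sketch, assuming both directory trees (for example the two conf folders) can be reached from one machine; the function names and paths are hypothetical, and the comparison is byte-for-byte, so a file that differs only in line endings will be reported as different.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file path (relative to root) to its content digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(left, right):
    """Return (only_left, only_right, differing) lists of relative paths."""
    a, b = tree_digests(left), tree_digests(right)
    only_left = sorted(set(a) - set(b))
    only_right = sorted(set(b) - set(a))
    differing = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_left, only_right, differing
```

A call such as compare_trees("local/conf", "remote/conf") (hypothetical paths) returns three lists: files present only on the left, files present only on the right, and files present on both sides with different contents.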

On Fri, Apr 13, 2012 at 4:24 AM, Kerwin <ke...@gmail.com> wrote:
> [...]

Re: Boost differences in two environments for same query and config

Posted by Kerwin <ke...@gmail.com>.
Hi Erick,

Thanks for your suggestions.
I ran an optimize on the remote installation, and this time the number
of documents is the same, but I still face the same issue, as the
debug output below shows:

9.950362E-4 = (MATCH) sum of:
	9.950362E-4 = (MATCH) weight(RECORD_TYPE:info in 35916), product of:
		9.950362E-4 = queryWeight(RECORD_TYPE:info), product of:
			1.0 = idf(docFreq=58891, maxDocs=8181811)
			9.950362E-4 = queryNorm
		1.0 = (MATCH) fieldWeight(RECORD_TYPE:info in 35916), product of:
			1.0 = tf(termFreq(RECORD_TYPE:info)=1)
			1.0 = idf(docFreq=58891, maxDocs=8181811)
			1.0 = fieldNorm(field=RECORD_TYPE, doc=35916)
	0.0 = (MATCH) product of:
		1.0945399 = (MATCH) sum of:
			0.99503624 = (MATCH) weight(CD:ee123^1000.0 in 35916), product of:
				0.99503624 = queryWeight(CD:ee123^1000.0), product of:
					1000.0 = boost
					1.0 = idf(docFreq=1, maxDocs=8181811)
					9.950362E-4 = queryNorm
				1.0 = (MATCH) fieldWeight(CD:ee123 in 35916), product of:
					1.0 = tf(termFreq(CD:ee123)=1)
					1.0 = idf(docFreq=1, maxDocs=8181811)
					1.0 = fieldNorm(field=CD, doc=35916)
				0.09950362 = (MATCH)
ConstantScoreQuery(QueryWrapperFilter(CD:ee123 CD:ee123c CD:ee123c.
CD:ee123dc CD:ee123e CD:ee123e. CD:ee123en CD:ee123fx CD:ee123g
CD:ee123g.1 CD:ee123g1 CD:ee123ee123 CD:ee123l.1 CD:ee123l1 CD:ee123ll
CD:ee123lr CD:ee123m.z CD:ee123mg CD:ee123mz CD:ee123na CD:ee123nx
CD:ee123ol CD:ee123op CD:ee123p CD:ee123p.1 CD:ee123p1 CD:ee123pn
CD:ee123r.1 CD:ee123r1 CD:ee123s CD:ee123s.z CD:ee123sm CD:ee123sn
CD:ee123sp CD:ee123ss CD:ee123sz)), product of:
					100.0 = boost
					9.950362E-4 = queryNorm
		0.0 = coord(2/3)


So I took the conf folder from the remote server and replaced my local
conf folder with it, to see whether the indexes were being built
differently, but my local installation continues to work. I would have
expected to see the same behaviour as on the remote installation, but
that did not happen. (The only difference is that the remote
installation uses cores, while my local installation has none.)
Is there anything else I could try?
Thanks for your help.
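
To see which factors differ between two explain outputs without reading them line by line, the text can be parsed and compared mechanically. A rough sketch, assuming each factor sits on a line of the form "value = description" as in the output above; the function names are illustrative, document ids are masked because they may legitimately differ, and wrapped continuation lines such as the long term list are skipped.

```python
import re

LINE_RE = re.compile(r"^\s*([0-9][0-9.eE+-]*) = (.*)$")


def parse_explain(text):
    """Return a list of (description, value) pairs from explain output."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # wrapped continuation lines are skipped
        # Mask document ids ("in 35916", "doc=35916") so they do not count
        desc = re.sub(r"\b(in |doc=)\d+", r"\1N", m.group(2)).strip()
        rows.append((desc, float(m.group(1))))
    return rows


def diff_explain(a_text, b_text, rel_tol=1e-6):
    """Pair lines by position; return those with equal descriptions
    but different values as (description, value_a, value_b)."""
    out = []
    for (da, va), (db, vb) in zip(parse_explain(a_text), parse_explain(b_text)):
        if da == db and abs(va - vb) > rel_tol * max(abs(va), abs(vb)):
            out.append((da, va, vb))
    return out


# A few lines taken from the local and remote outputs in this thread
local = """\
5.933964 = idf(docFreq=58891, maxDocs=8181811)
6.1634855E-5 = queryNorm
1.0 = tf(termFreq(RECORD_TYPE:info)=1)
1.0 = fieldNorm(field=RECORD_TYPE, doc=35916)
0.6666667 = coord(2/3)
"""
remote = """\
1.0 = idf(docFreq=58891, maxDocs=8181811)
9.950362E-4 = queryNorm
1.0 = tf(termFreq(RECORD_TYPE:info)=1)
1.0 = fieldNorm(field=RECORD_TYPE, doc=35948)
0.0 = coord(2/3)
"""

for desc, local_value, remote_value in diff_explain(local, remote):
    print(desc, local_value, remote_value)
```

On this excerpt the factors that differ are idf, queryNorm and coord, while tf and fieldNorm agree.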

On 4/11/12, Erick Erickson <er...@gmail.com> wrote:
> [...]

Re: Boost differences in two environments for same query and config

Posted by Erick Erickson <er...@gmail.com>.
Well, you're matching a different number of records, so I have to assume
your indexes are different on the two machines.

This is one case where doing an optimize might make sense: it'll purge
the data associated with any deleted records from the index, which should
make comparisons better.

Additionally, you have to ensure that your request handler is identical
on both. Have you made any changes to solrconfig.xml?

About the coord(2/3), I'm pretty clueless. But also ensure that your
parsed query is identical on both, which is an additional check on
whether you've changed something on one server and not the
other.

Best
Erick
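
The parsed-query check can be scripted. A minimal sketch, assuming both servers expose the standard /select handler and return JSON with wt=json; the base URLs are hypothetical placeholders, and the helper names are not part of Solr.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

QUERY = "RECORD_TYPE:info AND (NAME:ee123* OR CD:ee123^1000 OR CD:ee123*^100)"


def debug_url(base_url, query=QUERY):
    """Build a /select URL that asks Solr for JSON with debug output."""
    params = urlencode({"q": query, "debugQuery": "on", "wt": "json", "rows": "1"})
    return f"{base_url.rstrip('/')}/select?{params}"


def parsed_query(response):
    """Extract the parsed query from a decoded Solr JSON response."""
    return response.get("debug", {}).get("parsedquery")


def fetch_parsed_query(base_url, query=QUERY):
    """Run the query against one server and return its parsed query."""
    with urlopen(debug_url(base_url, query)) as resp:
        return parsed_query(json.load(resp))


def same_parsed_query(base_url_a, base_url_b, query=QUERY):
    """True if both servers parse the query to the same string."""
    return fetch_parsed_query(base_url_a, query) == fetch_parsed_query(base_url_b, query)
```

For example, same_parsed_query("http://localhost:8983/solr", "http://remote-host:8983/solr/core0") would compare the two installations; both URLs here are made up, and the remote one would need the actual core name.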
