Posted to issues@solr.apache.org by "Cameron VandenBerg (Jira)" <ji...@apache.org> on 2021/04/06 13:09:00 UTC

[jira] [Created] (SOLR-15319) ExactStatsCache not always producing Distributed IDF

Cameron VandenBerg created SOLR-15319:
-----------------------------------------

             Summary: ExactStatsCache not always producing Distributed IDF
                 Key: SOLR-15319
                 URL: https://issues.apache.org/jira/browse/SOLR-15319
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Cameron VandenBerg


I want a Distributed IDF across all parts of the collection so I have added this line to my solrconfig.xml:
<statsCache class="org.apache.solr.search.stats.ExactStatsCache" />
 
This seems to work about 90% of the time, but if I run the same request over and over again, sometimes I get scores with a local IDF for just one part of the collection.  Here is a request example:
/solr/collection1,collection2/query?q=fulltext:shark&rows=500&fl=id,url,title,score&sort=score+desc
 
I still get documents from both collection1 and collection2, but sometimes the scores are the same as when I query only collection1.  I believe that in those cases only the document frequency from collection1 is being used for the term.
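
To illustrate why the scores would shift, here is a minimal sketch of the IDF arithmetic, assuming the default BM25 similarity is in use. The formula matches Lucene's BM25Similarity; the per-collection numbers are hypothetical and are not taken from my index.

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """IDF as computed by Lucene's BM25Similarity."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical per-collection statistics for the term fulltext:shark.
local_stats = {
    "collection1": {"doc_freq": 40, "doc_count": 10_000},
    "collection2": {"doc_freq": 900, "doc_count": 50_000},
}

# What a distributed IDF should use: statistics summed over both collections.
global_doc_freq = sum(s["doc_freq"] for s in local_stats.values())
global_doc_count = sum(s["doc_count"] for s in local_stats.values())

idf_global = bm25_idf(global_doc_freq, global_doc_count)
# What the scores look like when only collection1's statistics are used.
idf_local_c1 = bm25_idf(**local_stats["collection1"])

print(f"global IDF:            {idf_global:.4f}")
print(f"collection1-only IDF:  {idf_local_c1:.4f}")
```

With these made-up numbers the term is rarer in collection1 than overall, so the collection1-only IDF is higher than the global one, and every score that depends on it changes accordingly.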
 
It looks like this issue is specifically related to multi-collection
requests (i.e., I don't observe this issue for a request against a single
collection). Checking `docCount` in the score "explain" (with
`debug=true`), it looks like multi-collection requests pick one collection
or the other (apparently non-deterministically?) when retrieving
distributed `docCount` for idf calculation. 
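
A rough way to quantify how often this happens is to repeat the request and count the distinct score sets that come back. The sketch below assumes Solr is reachable at http://localhost:8983 (an assumption; adjust as needed) and that the response is in Solr's JSON format; the helper names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Assumed base URL; the request path and parameters are the ones from this report.
SOLR_BASE = "http://localhost:8983"
REQUEST = (
    "/solr/collection1,collection2/query"
    "?q=fulltext:shark&rows=500&fl=id,url,title,score&sort=score+desc"
)


def fetch_scores(url):
    """Send one request and return {document id: score} from the JSON response."""
    with urlopen(url) as resp:
        body = json.load(resp)
    return {doc["id"]: doc["score"] for doc in body["response"]["docs"]}


def count_score_variants(fetch, url, runs=50):
    """Repeat the request and count how often each distinct score set occurs."""
    variants = Counter()
    for _ in range(runs):
        scores = fetch(url)
        # Sort by id so that ordering differences do not count as a new variant.
        variants[tuple(sorted(scores.items()))] += 1
    return variants
```

Calling `count_score_variants(fetch_scores, SOLR_BASE + REQUEST)` against a live cluster should return a single entry if the distributed IDF is applied consistently; more than one entry means the scores changed between identical requests.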



