You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/03/12 18:15:01 UTC

[GitHub] [solr] cpoerschke opened a new pull request #13: SOLR-14690: QueryResultKey.[nc_]minExactCount

cpoerschke opened a new pull request #13:
URL: https://github.com/apache/solr/pull/13


   Allow with-minExactCount searches to use query cache entries from without-minExactCount searches.
   
   https://issues.apache.org/jira/browse/SOLR-14690
   
   (Transferred from draft https://github.com/apache/lucene-solr/pull/1530 pull request to here.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [solr] cpoerschke commented on pull request #13: SOLR-14690: QueryResultKey.[nc_]minExactCount

Posted by GitBox <gi...@apache.org>.
cpoerschke commented on pull request #13:
URL: https://github.com/apache/solr/pull/13#issuecomment-881654422


   w.r.t. Solr Ref Guide content updating -- https://solr.apache.org/guide/8_9/common-query-parameters.html#minexactcount-parameter -- I've considered this but couldn't see an obvious way of reflecting this change and in a way it's implementation detail anyhow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] madrob commented on pull request #13: SOLR-14690: QueryResultKey.[nc_]minExactCount

Posted by GitBox <gi...@apache.org>.
madrob commented on pull request #13:
URL: https://github.com/apache/solr/pull/13#issuecomment-880037020


   Would it be more straightforward to use a solrconfig.xml that has auto warm count set to zero on the caches of interest?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] cpoerschke commented on a change in pull request #13: SOLR-14690: QueryResultKey.[nc_]minExactCount

Posted by GitBox <gi...@apache.org>.
cpoerschke commented on a change in pull request #13:
URL: https://github.com/apache/solr/pull/13#discussion_r671387779



##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -1304,7 +1314,14 @@ private void getDocListC(QueryResult qr, QueryCommand cmd) throws IOException {
       // so set all of them on the cache key.
       key = new QueryResultKey(q, cmd.getFilterList(), cmd.getSort(), flags, cmd.getMinExactCount());
       if ((flags & NO_CHECK_QCACHE) == 0) {
-        superset = queryResultCache.get(key);
+        final QueryResultCacheEntry qrce = queryResultCache.get(key);
+        if (qrce != null &&
+            qrce.docList.hitCountRelation() == TotalHits.Relation.EQUAL_TO &&
+            qrce.minExactCount >= key.nc_minExactCount) {

Review comment:
       So the
   
   ```
   qrce.docList.hitCountRelation() == TotalHits.Relation.EQUAL_TO &&
   qrce.minExactCount >= key.nc_minExactCount) {
   ```
   
   logic here matches the _"a query with a minExactCount restriction being able to use a cache entry from a query without the restriction."_ goal on the JIRA ticket (because queries without a restriction will result in a `EQUAL_TO` hit count relation and because queries without a restriction have a `Integer.MAX_VALUE` min exact count value).
   
   For the _"a minExactCount=100 query being able to use a minExactCount=1000 cache entry"_ goal on the JIRA ticket
   
   ```
   qrce.docList.hitCountRelation() == TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO &&
   qrce.minExactCount >= key.nc_minExactCount) {
   ```
   
   seems intuitively right.
   
   But upon closer consideration I think this would mean that a query without a minExactCount restriction could use a cache entry with a `GREATER_THAN_OR_EQUAL_TO` hit count relation (if `qrce.minExactCount` and `key.nc_minExactCount` are both `Integer.MAX_VALUE`) -- could such a cache entry exist though?
   * I note that `nDocs` requested is an `int` but that `TotalHits.value` is a `long` as per https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.9.0/lucene/core/src/java/org/apache/lucene/search/TotalHits.java#L47 since 7.0 via https://issues.apache.org/jira/browse/LUCENE-7872 ticket.
   * Logically even if Solr requests an `int` quantity of results then Lucene could count up a `long` quantity of matches, so yes non-EQUAL_TO hit count relation cache entry use for a query without a minExactCount restriction looks to be a possibility theoretically.
   * Practically the numHits and totalHitsThreshold in Lucene's TopFieldCollector are currently `int` and so there's nothing to worry about here I think and it should be possible to simplify the logic to
   
   ```
   (qrce.docList.hitCountRelation() == TotalHits.Relation.EQUAL_TO ||
    qrce.docList.hitCountRelation() == TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO) &&
   qrce.minExactCount >= key.nc_minExactCount) {
   ```
   
   or
   
   ```
   qrce.minExactCount >= key.nc_minExactCount) {
   ```
   
   with suitable test coverage of course.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] cpoerschke commented on pull request #13: SOLR-14690: QueryResultKey.[nc_]minExactCount

Posted by GitBox <gi...@apache.org>.
cpoerschke commented on pull request #13:
URL: https://github.com/apache/solr/pull/13#issuecomment-879375401


   Via `./gradlew :solr:core:test --tests "org.apache.solr.search.SolrIndexSearcherTest" -Ptests.jvms=4 -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=B773ACC2C581EBBA -Ptests.file.encoding=UTF-8` two tests fail, but running them individually does not reproduce. My initial suspicion would be interaction between the two tests.
   
   ```
   ERROR: The following test(s) have failed:
     - org.apache.solr.search.SolrIndexSearcherTest.testLowMinExactCountWithQueryResultCache (:solr:core)
       Test output: /Users/cpoerschke/solr/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.search.SolrIndexSearcherTest.txt
       Reproduce with: gradlew :solr:core:test --tests "org.apache.solr.search.SolrIndexSearcherTest.testLowMinExactCountWithQueryResultCache" -Ptests.jvms=4 -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=B773ACC2C581EBBA -Ptests.file.encoding=UTF-8
   
     - org.apache.solr.search.SolrIndexSearcherTest.testMinExactCount (:solr:core)
       Test output: /Users/cpoerschke/solr/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.search.SolrIndexSearcherTest.txt
       Reproduce with: gradlew :solr:core:test --tests "org.apache.solr.search.SolrIndexSearcherTest.testMinExactCount" -Ptests.jvms=4 -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=B773ACC2C581EBBA -Ptests.file.encoding=UTF-8
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] madrob commented on pull request #13: SOLR-14690: QueryResultKey.[nc_]minExactCount

Posted by GitBox <gi...@apache.org>.
madrob commented on pull request #13:
URL: https://github.com/apache/solr/pull/13#issuecomment-879454399


   I tried to isolate this and it looks like something is happening in `testHighMinExactCountWithQueryResultCache` that is interfering with the other tests, like maybe it is inserting too many new cache entries? For some reason the `commit` that happens in the `setup()` method might not be clearing the caches? Or auto warming settings are repopulating it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org