Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/12/22 13:36:46 UTC

[GitHub] [lucene] ayakolesnikov opened a new issue, #12032: Exception rising while using QueryTimeout

ayakolesnikov opened a new issue, #12032:
URL: https://github.com/apache/lucene/issues/12032

   ### Description
   
   ```
   import org.apache.lucene.facet.DrillDownQuery;
   import org.apache.lucene.facet.DrillSideways;
   import org.apache.lucene.facet.FacetsConfig;
   import org.apache.lucene.facet.taxonomy.TaxonomyReader;
   import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
   import org.apache.lucene.index.DirectoryReader;
   import org.apache.lucene.index.IndexReader;
   import org.apache.lucene.index.QueryTimeoutImpl;
   import org.apache.lucene.search.CollectorManager;
   import org.apache.lucene.search.IndexSearcher;
   import org.apache.lucene.search.Sort;
   import org.apache.lucene.search.TopFieldCollector;
   import org.apache.lucene.search.TopFieldDocs;
   import org.apache.lucene.store.NIOFSDirectory;
   import org.apache.lucene.store.SimpleFSLockFactory;
   import org.junit.Test;
   
   import java.io.IOException;
   import java.nio.file.Path;
   
   public class BugTest {
   
      @Test
      public void test() throws IOException {
         TaxonomyReader taxonomyReader = new DirectoryTaxonomyReader(new NIOFSDirectory(Path.of(
               "src/test/resources/fs-base/root/search/test/221031_130517/categories"), SimpleFSLockFactory.INSTANCE));
         FacetsConfig facetsConfig = new FacetsConfig();
   
         final DrillDownQuery facetedQuery = new DrillDownQuery(facetsConfig);
         facetedQuery.add("dataclass", "STD"); // if we add dim
   
         IndexReader indexReader = DirectoryReader.open(new NIOFSDirectory(Path.of(
               "src/test/resources/fs-base/root/search/test/221031_130517/main")));
         IndexSearcher indexSearcher = new IndexSearcher(indexReader);
         indexSearcher.setTimeout(new QueryTimeoutImpl(1000000)); // and timeout
   
         DrillSideways drillSideways = new DrillSideways(indexSearcher, facetsConfig, taxonomyReader);
   
         final CollectorManager<TopFieldCollector, TopFieldDocs> collectorManager =
               TopFieldCollector.createSharedManager(Sort.RELEVANCE, 100, null, Integer.MAX_VALUE);
   
         final DrillSideways.ConcurrentDrillSidewaysResult<TopFieldDocs> r = drillSideways.search(facetedQuery, collectorManager);
         // throws IllegalArgumentException: maxDoc must be Integer.MAX_VALUE
      }
   }
   ```
   
   ```
   java.lang.IllegalArgumentException: maxDoc must be Integer.MAX_VALUE
   
   	at org.apache.lucene.facet.DrillSidewaysScorer.score(DrillSidewaysScorer.java:84)
   	at org.apache.lucene.search.TimeLimitingBulkScorer.score(TimeLimitingBulkScorer.java:68)
   	at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:38)
   	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:744)
   	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:662)
   	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:656)
   	at org.apache.lucene.facet.DrillSideways.searchSequentially(DrillSideways.java:510)
   	at org.apache.lucene.facet.DrillSideways.search(DrillSideways.java:446)
   ```
   I don't know why that restriction exists. Maybe it is possible to just delete it.
   
   ### Version and environment details
   
   Lucene 9.4.2


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


Re: [I] Exception rising while using QueryTimeout [lucene]

Posted by "msfroh (via GitHub)" <gi...@apache.org>.
msfroh commented on issue #12032:
URL: https://github.com/apache/lucene/issues/12032#issuecomment-1763058013

   I was looking into this, and the fundamental problem seems to be that the underlying DrillSideways scoring implementations (`doQueryFirstScoring`, `doDrillDownAdvanceScoring`, and `doUnionScoring`) each assume that they will score the whole segment in a single call, so they don't play nicely with the query timeout.
   
   `doDrillDownAdvanceScoring` and `doUnionScoring` both explicitly set:
   ```
   final int maxDoc = context.reader().maxDoc();
   ```
   
   While `doQueryFirstScoring` and the special-case `doQueryFirstScoringSingleDim` both have `while` loops that finish once there are no more docs:
   
   ```
   while (docID != DocIdSetIterator.NO_MORE_DOCS) {
   ```
   
   So, the exception there is probably not such a bad idea because DrillSidewaysScorer *doesn't* seem to play nicely with `TimeLimitingBulkScorer` -- at least it doesn't honor the `maxDoc` override that `TimeLimitingBulkScorer` imposes.
   
   On the other hand, maybe we could just pass `maxDoc` through to the underlying implementations. @gsmiller, do you know if there's any danger from terminating `DrillSidewaysScorer` before hitting the end of a segment?
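
To make the failure mode above concrete, here is a minimal, self-contained sketch. It does not use Lucene: `SimpleBulkScorer`, `WholeSegmentScorer`, `WindowAwareScorer`, and `WindowedScorer` are hypothetical stand-ins that only model the interaction described in the comment, on the assumption that the timeout wrapper calls the wrapped scorer with a bounded upper doc ID for each window.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Lucene's classes.
interface SimpleBulkScorer {
  int NO_MORE_DOCS = Integer.MAX_VALUE;

  // Scores docs in [min, max) and returns the next doc to resume from,
  // or NO_MORE_DOCS once the segment is exhausted.
  int score(List<Integer> collected, int min, int max);
}

// Models a scorer that can only process a whole segment in one call.
final class WholeSegmentScorer implements SimpleBulkScorer {
  private final int segmentMaxDoc;

  WholeSegmentScorer(int segmentMaxDoc) {
    this.segmentMaxDoc = segmentMaxDoc;
  }

  @Override
  public int score(List<Integer> collected, int min, int max) {
    if (max != Integer.MAX_VALUE) {
      // Same message as in the reported stack trace.
      throw new IllegalArgumentException("maxDoc must be Integer.MAX_VALUE");
    }
    for (int doc = min; doc < segmentMaxDoc; doc++) {
      collected.add(doc);
    }
    return NO_MORE_DOCS;
  }
}

// Models a scorer that honors the upper bound it is given.
final class WindowAwareScorer implements SimpleBulkScorer {
  private final int segmentMaxDoc;

  WindowAwareScorer(int segmentMaxDoc) {
    this.segmentMaxDoc = segmentMaxDoc;
  }

  @Override
  public int score(List<Integer> collected, int min, int max) {
    int end = Math.min(max, segmentMaxDoc);
    for (int doc = min; doc < end; doc++) {
      collected.add(doc);
    }
    return max >= segmentMaxDoc ? NO_MORE_DOCS : max;
  }
}

// Models a timeout wrapper that scores one window of doc IDs at a time.
// A real wrapper would check the timeout between windows; that is omitted here.
final class WindowedScorer implements SimpleBulkScorer {
  private final SimpleBulkScorer in;
  private final int windowSize;

  WindowedScorer(SimpleBulkScorer in, int windowSize) {
    this.in = in;
    this.windowSize = windowSize;
  }

  @Override
  public int score(List<Integer> collected, int min, int max) {
    int start = min;
    while (start < max) {
      // Use long arithmetic so start + windowSize cannot overflow.
      int end = (int) Math.min((long) start + windowSize, max);
      start = in.score(collected, start, end); // bounded upper limit per window
    }
    return start;
  }
}
```

In this model, wrapping `WholeSegmentScorer` in `WindowedScorer` fails on the first window, because the wrapped scorer receives a bounded `max` instead of `Integer.MAX_VALUE`. Wrapping `WindowAwareScorer` works, which is roughly what "passing `maxDoc` through" would need to achieve.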



Re: [I] Exception rising while using QueryTimeout [lucene]

Posted by "msfroh (via GitHub)" <gi...@apache.org>.
msfroh commented on issue #12032:
URL: https://github.com/apache/lucene/issues/12032#issuecomment-1765587096

   I started to work on making DrillSidewaysScorer work on windows of doc IDs, when I noticed the following comment added in TestDrillSideways as part of https://github.com/apache/lucene/pull/996/files: 
   
   ```
       // DrillSideways requires the entire range of docs to be scored at once, so it doesn't support
       // timeouts whose implementation scores one window of doc IDs at a time.
   ```
   
   Another challenge that I noticed is that the recent change to call `finish` on collectors only once they are done collecting would require some more changes, as `drillDownLeafCollector` and any `drillSidewaysLeafCollectors` would have their `finish` method called after a single window instead of at the end of the segment.
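
The `finish` concern can be illustrated with a small model. This is a sketch under stated assumptions, not Lucene code: `CountingLeafCollector` and `FinishingScorer` are hypothetical names, and the model assumes the scorer finishes its own internal collector at the end of every `score` call, which is what the comment above implies.

```java
// Hypothetical stand-in for a leaf collector with a finish hook; not Lucene's LeafCollector.
final class CountingLeafCollector {
  int collected = 0;
  int finishCalls = 0;

  void collect(int doc) {
    collected++;
  }

  void finish() {
    finishCalls++;
  }
}

// Hypothetical scorer that owns an internal collector and finishes it
// at the end of each score call.
final class FinishingScorer {
  private final int segmentMaxDoc;
  private final CountingLeafCollector internal = new CountingLeafCollector();

  FinishingScorer(int segmentMaxDoc) {
    this.segmentMaxDoc = segmentMaxDoc;
  }

  // Scores docs in [min, max) and returns the next doc to resume from,
  // or Integer.MAX_VALUE once the segment is exhausted.
  int score(int min, int max) {
    int end = Math.min(max, segmentMaxDoc);
    for (int doc = min; doc < end; doc++) {
      internal.collect(doc);
    }
    // Correct when score runs once per segment; premature when it runs once per window.
    internal.finish();
    return end >= segmentMaxDoc ? Integer.MAX_VALUE : end;
  }

  CountingLeafCollector internalCollector() {
    return internal;
  }
}
```

Calling `score(0, Integer.MAX_VALUE)` once finishes the internal collector once. Driving the same scorer in windows finishes it once per window, so a windowed `DrillSidewaysScorer` would presumably need to defer `finish` until the last window of the segment.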

