You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/05/26 02:02:45 UTC

[GitHub] [lucene] msokolov commented on a change in pull request #142: LUCENE-9944: Allow DrillSideways users to pass a CollectorManager without requiring an ExecutorService (and concurrent DrillSideways approach).

msokolov commented on a change in pull request #142:
URL: https://github.com/apache/lucene/pull/142#discussion_r639346222



##########
File path: lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java
##########
@@ -40,21 +41,65 @@
 // TODO change the way DrillSidewaysScorer is used, this query does not work
 // with filter caching
 class DrillSidewaysQuery extends Query {
+
   final Query baseQuery;
-  final Collector drillDownCollector;
-  final Collector[] drillSidewaysCollectors;
+
+  // The caller must either directly provide FacetsCollectors for the drill down and sideways
+  // collecting, or provide FacetsCollectorManagers used to create the FacetsCollectors. If
+  // this query will be executed concurrently, FacetsCollectorManagers should be used to ensure

Review comment:
       can we raise an exception if this happens?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java
##########
@@ -131,8 +185,24 @@ public boolean isCacheable(LeafReaderContext ctx) {
       public BulkScorer bulkScorer(LeafReaderContext context) throws IOException {
         Scorer baseScorer = baseWeight.scorer(context);
 
+        int drillDownCount = drillDowns.length;
+
+        // TODO: If the caller provided a FacetsCollectorManager instead of directly providing
+        // FacetsCollectors, we assume this will be invoked during a concurrent search. Ideally
+        // we'd only create new FacetsCollectors for each "leaf slice" that will be concurrently
+        // searched, as opposed to each actual leaf, but we don't have that information at this
+        // level so we always provide a new FacetsCollector. There might be a better way to
+        // refactor this logic.

Review comment:
       I think in the case of collecting results, we create the collectors in `IndexSearcher.search(Query query, CollectorManager<C, T> collectorManager)`, which knows about the slices. If we want to improve this, we could add a new `IndexSearcher.search` accepting a DrilldownQuery, but maybe that's something for another day?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java
##########
@@ -40,21 +41,65 @@
 // TODO change the way DrillSidewaysScorer is used, this query does not work
 // with filter caching
 class DrillSidewaysQuery extends Query {
+
   final Query baseQuery;
-  final Collector drillDownCollector;
-  final Collector[] drillSidewaysCollectors;
+
+  // The caller must either directly provide FacetsCollectors for the drill down and sideways
+  // collecting, or provide FacetsCollectorManagers used to create the FacetsCollectors. If
+  // this query will be executed concurrently, FacetsCollectorManagers should be used to ensure
+  // multiple threads aren't collecting into the same FacetsCollector.
+  final FacetsCollector drillDownCollector;
+  final FacetsCollector[] drillSidewaysCollectors;
+  final FacetsCollectorManager drillDownCollectorManager;
+  final FacetsCollectorManager[] drillSidewaysCollectorManagers;
+  final List<FacetsCollector> managedDrillDownCollectors;
+  final List<FacetsCollector[]> managedDrillSidewaysCollectors;
+
   final Query[] drillDownQueries;
+
   final boolean scoreSubDocsAtOnce;
 
+  /**
+   * Construct a new {@code DrillSidewaysQuery} that will directly use the provided {@link
+   * FacetsCollector}s. Use this if you're certain that the query will not be executed concurrently.

Review comment:
       Can we give users an idea of the tradeoff here? Naive user will like to execute concurrently if it goes faster, but would prefer single-threaded execution ... if they want to provide a custom collector? Some other reason?

##########
File path: lucene/facet/src/java/org/apache/lucene/facet/DrillSidewaysQuery.java
##########
@@ -142,7 +212,19 @@ public BulkScorer bulkScorer(LeafReaderContext context) throws IOException {
                 new ConstantScoreScorer(drillDowns[dim], 0f, scoreMode, DocIdSetIterator.empty());
           }
 
-          dims[dim] = new DrillSidewaysScorer.DocsAndCost(scorer, drillSidewaysCollectors[dim]);
+          // If the caller directly provided FacetsCollectors to use for the sideways dimensions,

Review comment:
       Could we simplify the API here by eliminating the two different calling methods and instead always requiring a CollectorManager?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org