Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/09/16 09:57:38 UTC

[GitHub] [lucene-solr] jpountz commented on a change in pull request #1866: LUCENE-9523: Speed up query shapes for geometries that generate multiple points

jpountz commented on a change in pull request #1866:
URL: https://github.com/apache/lucene-solr/pull/1866#discussion_r489311556



##########
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##########
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, final Weight weight, fin
         final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
         return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
       }
-      final DocIdSetBuilder docIdSetBuilder = new DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-      values.intersect(getSparseVisitor(query, docIdSetBuilder));
-      final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-      return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+      if (values.getDocCount() << 2 < values.size()) {
+        // we use a dense structure so we can skip already visited documents
+        final FixedBitSet result = new FixedBitSet(reader.maxDoc());
+        final long[] cost = new long[]{reader.maxDoc()};

Review comment:
       We just need one long?
   ```suggestion
           final long[] cost = new long[]{1};
   ```

##########
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##########
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, final Weight weight, fin
         final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
         return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
       }
-      final DocIdSetBuilder docIdSetBuilder = new DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-      values.intersect(getSparseVisitor(query, docIdSetBuilder));
-      final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-      return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+      if (values.getDocCount() << 2 < values.size()) {

Review comment:
       I think we should be careful with overflows here; maybe divide instead of multiplying, i.e.:
   ```suggestion
         if (values.getDocCount() < (values.size() >>> 2)) {
   ```
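
To make the overflow concern concrete, here is a minimal standalone sketch. It assumes `getDocCount()` returns an `int` and `size()` returns a `long`, so the shift in the original condition happens in 32-bit arithmetic before the comparison widens it. The class and method names (`DenseThreshold`, `multiplyCheck`, `divideCheck`) are hypothetical and not part of the PR.

```java
public class DenseThreshold {

    // Mirrors the original condition: docCount is shifted as an int,
    // so values of 2^29 or more wrap around before being compared.
    static boolean multiplyCheck(int docCount, long size) {
        return docCount << 2 < size;
    }

    // Mirrors the suggested condition: the long side is divided instead,
    // so no intermediate value can overflow.
    static boolean divideCheck(int docCount, long size) {
        return docCount < (size >>> 2);
    }

    public static void main(String[] args) {
        int docCount = 1 << 29; // 536,870,912 documents (illustrative value)
        long size = 1L << 30;   // 1,073,741,824 points (illustrative value)

        // 4 * docCount is really 2^31, which is larger than size,
        // but the int shift wraps to a negative number.
        System.out.println("multiply: " + multiplyCheck(docCount, size));
        System.out.println("divide:   " + divideCheck(docCount, size));
    }
}
```

With these inputs the shifted form reports that the dense path applies even though four times the doc count exceeds the point count, while the divided form does not. The two forms can also differ by rounding when `size` is not a multiple of 4, which should not matter for a heuristic threshold.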

##########
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##########
@@ -265,10 +265,20 @@ private Scorer getSparseScorer(final LeafReader reader, final Weight weight, fin
         final DocIdSetIterator iterator = new BitSetIterator(result, cost[0]);
         return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
       }
-      final DocIdSetBuilder docIdSetBuilder = new DocIdSetBuilder(reader.maxDoc(), values, query.getField());
-      values.intersect(getSparseVisitor(query, docIdSetBuilder));
-      final DocIdSetIterator iterator = docIdSetBuilder.build().iterator();
-      return new ConstantScoreScorer(weight, boost, scoreMode, iterator);
+      if (values.getDocCount() << 2 < values.size()) {
+        // we use a dense structure so we can skip already visited documents
+        final FixedBitSet result = new FixedBitSet(reader.maxDoc());

Review comment:
       I wonder if we should use SparseFixedBitSet to avoid allocating so much memory at once.
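
As a rough illustration of the allocation in question, the sketch below estimates the backing storage of a dense bitset that keeps one bit per document in a `long[]`. It is back-of-the-envelope arithmetic, not Lucene code: the class and method names are hypothetical, and object headers and other bookkeeping are ignored.

```java
public class DenseBitSetFootprint {

    // Number of 64-bit words needed to hold numBits bits, rounded up.
    static long denseWords(int numBits) {
        return ((long) numBits + 63) >>> 6;
    }

    // Approximate bytes allocated up front for the backing long[] array.
    static long denseBytes(int numBits) {
        return denseWords(numBits) * Long.BYTES;
    }

    public static void main(String[] args) {
        // Illustrative segment sizes, not measurements from the PR.
        int[] maxDocs = {1_000_000, 100_000_000, Integer.MAX_VALUE};
        for (int maxDoc : maxDocs) {
            System.out.println(maxDoc + " docs -> " + denseBytes(maxDoc) + " bytes");
        }
    }
}
```

The dense cost depends only on `maxDoc`, not on how many documents match. A sparse structure that allocates blocks lazily, as I understand `SparseFixedBitSet` to do, would instead grow with the number of matching documents.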

##########
File path: lucene/core/src/java/org/apache/lucene/document/ShapeQuery.java
##########
@@ -340,7 +351,8 @@ public Relation compare(byte[] minTriangle, byte[] maxTriangle) {
     };
   }
 
-  /** create a visitor that adds documents that match the query using a sparse bitset. (Used by INTERSECT) */
+  /** create a visitor that adds documents that match the query using a sparse bitset. (Used by INTERSECT
+   * when the number of points <= 4 * number of docs ) */

Review comment:
       ```suggestion
      * when the number of docs <= 4 * number of points ) */
   ```



