You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/12/14 12:06:02 UTC

[GitHub] [lucene] jpountz commented on a diff in pull request #12017: Aggressive `count` in BooleanWeight

jpountz commented on code in PR #12017:
URL: https://github.com/apache/lucene/pull/12017#discussion_r1048383541


##########
lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java:
##########
@@ -470,14 +470,19 @@ private int reqCount(LeafReaderContext context) throws IOException {
   private int optCount(LeafReaderContext context, Occur occur) throws IOException {
     final int numDocs = context.reader().numDocs();
     int optCount = 0;
+    boolean unknownCount = false;
     for (WeightedBooleanClause weightedClause : weightedClauses) {
       if (weightedClause.clause.getOccur() != occur) {
         continue;
       }
       int count = weightedClause.weight.count(context);
-      if (count == -1 || count == numDocs) {
-        // If any of the clauses has a number of matches that is unknown, the number of matches of
-        // the disjunction is unknown.
+      if (count == -1) {
+        // If one clause has a number of matches that is unknown, let's be more aggressive to check
+        // whether remain clauses could match all docs.
+        unknownCount = true;
+        continue;

Review Comment:
   use an `else if` instead of `continue` for consistency with how other cases are handled?



##########
lucene/core/src/java/org/apache/lucene/search/BooleanWeight.java:
##########
@@ -492,7 +497,10 @@ private int optCount(LeafReaderContext context, Occur occur) throws IOException
         return -1;

Review Comment:
   Maybe this line should also set `unkwownCount = true` instead of returning `-1` right away?



##########
lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java:
##########
@@ -1015,6 +1015,80 @@ public void testDisjunctionRandomClausesMatchesCount() throws Exception {
     }
   }
 
+  public void testAggressiveMatchCount() throws IOException {

Review Comment:
   Fold these tests into the existing `testDisjunctionMatchesCount` test?



##########
lucene/core/src/test/org/apache/lucene/search/TestBooleanQuery.java:
##########
@@ -1015,6 +1015,80 @@ public void testDisjunctionRandomClausesMatchesCount() throws Exception {
     }
   }
 
+  public void testAggressiveMatchCount() throws IOException {
+    Directory dir = newDirectory();
+    IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig());
+    Document doc = new Document();
+    LongPoint longPoint = new LongPoint("long", 3L, 4L, 5L);
+    doc.add(longPoint);
+    StringField stringField = new StringField("string", "abc", Store.NO);
+    doc.add(stringField);
+    writer.addDocument(doc);
+    longPoint.setLongValues(10L, 11L, 12L);
+    stringField.setStringValue("xyz");
+    writer.addDocument(doc);
+
+    IndexReader reader = DirectoryReader.open(writer);
+    writer.close();
+    IndexSearcher searcher = new IndexSearcher(reader);
+
+    long[] lower = new long[] {4L, 5L, 6L};
+    long[] upper = new long[] {9L, 10L, 11L};
+    Query unknownCountQuery = LongPoint.newRangeQuery("long", lower, upper);
+    assert reader.leaves().size() == 1;
+    assert searcher
+            .createWeight(unknownCountQuery, ScoreMode.COMPLETE, 1f)
+            .count(reader.leaves().get(0))
+        == -1;
+
+    Query query =
+        new BooleanQuery.Builder()
+            .add(new TermQuery(new Term("string", "xyz")), Occur.MUST)
+            .add(unknownCountQuery, Occur.MUST_NOT)
+            .add(new MatchAllDocsQuery(), Occur.MUST_NOT)
+            .build();
+    Weight weight = searcher.createWeight(query, ScoreMode.COMPLETE, 1f);
+    // count of the first MUST_NOT clause is unknown, but the second MUST_NOT clause matches all
+    // docs
+    assertEquals(0, weight.count(reader.leaves().get(0)));
+
+    query =
+        new BooleanQuery.Builder()
+            .add(new TermQuery(new Term("string", "xyz")), Occur.MUST)
+            .add(unknownCountQuery, Occur.MUST_NOT)
+            .add(new TermQuery(new Term("string", "abc")), Occur.MUST_NOT)
+            .build();
+    weight = searcher.createWeight(query, ScoreMode.COMPLETE, 1f);
+    // count of the first MUST_NOT clause is unknown, though the second MUST_NOT clause matche one
+    // doc, we can't figure out the number of
+    // docs
+    assertEquals(-1, weight.count(reader.leaves().get(0)));
+
+    // test pure disConjunction

Review Comment:
   ```suggestion
       // test pure disjunction
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org