You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/16 09:58:38 UTC

[GitHub] [lucene] pquentin opened a new pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

pquentin opened a new pull request #445:
URL: https://github.com/apache/lucene/pull/445


   # Description
   
   Now that we require all documents to use the same features (LUCENE-9334), implement `Weight#count()` to return `docCount` if either terms or points are indexed.
   
   # Solution
   
   After instructions from @jpountz, I've added `Weight#count()` that returns the document count if points are indexed.
   
   # Tests
   
   I've adapted `TestTermQuery#testQueryMatchesCount()` into `TestDocValuesFieldExistsQuery`. The test correctly exercises the `searcher.count` case, but `weight.count` returns -1 as `getPointDimensionCount` always returns 0. Why isn't my field indexed?
   
   # Checklist
   
   - [ ] Fix current test
   - [ ] Test terms in addition to points
   - [ ] Validate that this improves performance in a synthetic benchmark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751228998



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,52 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new NumericDocValuesfield("long", i));
+        doc.add(new StringField("string", "value", Store.NO));
+        doc.add(new SortedDocValuesField("string", new BytesRef("value"));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    assertSameCount(reader, searcher, "long", numMatchingDocs);
+    assertSameCount(reader, searcher, "string", numMatchingDocs);
+
+    // Test that we can't count in O(1) when there are deleted documents
+    w.deleteDocuments(LongPoint.newRangeQuery("long", 0L, 10L));
+    DirectoryReader reader2 = w.getReader();
+    final IndexSearcher searcher2 = new IndexSearcher(reader2);
+    final Query testQuery = new DocValuesFieldExistsQuery("long");
+    final Weight weight2 = searcher2.createWeight(testQuery, ScoreMode.COMPLETE, 1);
+    assertEquals(weight2.count(reader2.leaves().get(0)), -1);

Review comment:
       For what it's worth this is flaky, because `hasDeletions` does not always return true. Say I start with 15 docs, then remove 4, the reader sometimes thinks `maxDoc == numDoc == 11` somehow. So `hasDeletions` returns False, and I'm not getting -1 as expected. How can I fix this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751199436



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));

Review comment:
       Since this query is supposed to match documents that have doc values, we need to index doc values too.
   
   ```suggestion
           doc.add(new NumericDocValuesfield("long", i));
           doc.add(new StringField("string", "value", Store.NO));
           doc.add(new SortedDocValuesField("string", new BytesRef("value"));
   ```

##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    assertSameCount(reader, searcher, "long", numMatchingDocs);
+    assertSameCount(reader, searcher, "string", numMatchingDocs);

Review comment:
       Can you also test the case when a field doesn't exist?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r750954007



##########
File path: lucene/core/src/java/org/apache/lucene/search/DocValuesFieldExistsQuery.java
##########
@@ -74,6 +74,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
         return new ConstantScoreScorer(this, score(), scoreMode, iterator);
       }
 
+      @Override
+      public int count(LeafReaderContext context) throws IOException {
+        final LeafReader reader = context.reader();
+        FieldInfo fieldInfo = reader.getFieldInfos().fieldInfo(field);
+        if (fieldInfo == null || fieldInfo.getPointDimensionCount() == 0) {

Review comment:
       Thanks, I've done that (and tested it)!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751606167



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,52 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new NumericDocValuesfield("long", i));
+        doc.add(new StringField("string", "value", Store.NO));
+        doc.add(new SortedDocValuesField("string", new BytesRef("value"));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    assertSameCount(reader, searcher, "long", numMatchingDocs);
+    assertSameCount(reader, searcher, "string", numMatchingDocs);
+
+    // Test that we can't count in O(1) when there are deleted documents
+    w.deleteDocuments(LongPoint.newRangeQuery("long", 0L, 10L));
+    DirectoryReader reader2 = w.getReader();
+    final IndexSearcher searcher2 = new IndexSearcher(reader2);
+    final Query testQuery = new DocValuesFieldExistsQuery("long");
+    final Weight weight2 = searcher2.createWeight(testQuery, ScoreMode.COMPLETE, 1);
+    assertEquals(weight2.count(reader2.leaves().get(0)), -1);

Review comment:
       Thanks, that fixed it. Now I can run the test 1000 times without any failures.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751243685



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,52 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new NumericDocValuesfield("long", i));
+        doc.add(new StringField("string", "value", Store.NO));
+        doc.add(new SortedDocValuesField("string", new BytesRef("value"));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    assertSameCount(reader, searcher, "long", numMatchingDocs);
+    assertSameCount(reader, searcher, "string", numMatchingDocs);
+
+    // Test that we can't count in O(1) when there are deleted documents
+    w.deleteDocuments(LongPoint.newRangeQuery("long", 0L, 10L));
+    DirectoryReader reader2 = w.getReader();
+    final IndexSearcher searcher2 = new IndexSearcher(reader2);
+    final Query testQuery = new DocValuesFieldExistsQuery("long");
+    final Weight weight2 = searcher2.createWeight(testQuery, ScoreMode.COMPLETE, 1);
+    assertEquals(weight2.count(reader2.leaves().get(0)), -1);

Review comment:
       I suspect that it's because the test picks a merge policy that decides that the index has too many deletions and that the segment should get rewritten.You could do `w.w.getConfig().setMergePolicy(NoMergePolicy.INSTANCE)` after the `forcemerge` to prevent the call to `deleteTerms` from triggering a merge.

##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));

Review comment:
       Year this made me wonder if we should update your new count impl to do this:
   
   ```
   if (fieldInfo == null || fieldInfo.getDocValuesType() == DocValuesType.NONE) {
     return 0; // the field doesn't index doc values
   }
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r750295570



##########
File path: lucene/core/src/java/org/apache/lucene/search/DocValuesFieldExistsQuery.java
##########
@@ -74,6 +74,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
         return new ConstantScoreScorer(this, score(), scoreMode, iterator);
       }
 
+      @Override
+      public int count(LeafReaderContext context) throws IOException {
+        final LeafReader reader = context.reader();
+        FieldInfo fieldInfo = reader.getFieldInfos().fieldInfo(field);
+        if (fieldInfo == null || fieldInfo.getPointDimensionCount() == 0) {
+          return -1;

Review comment:
       Nit: In these cases we like to fall back to the default implementation, this way we don't have to know about what the contract is for the case when the count cannot be computed.
   ```suggestion
             return super.count(context);
   ```

##########
File path: lucene/core/src/java/org/apache/lucene/search/DocValuesFieldExistsQuery.java
##########
@@ -74,6 +74,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
         return new ConstantScoreScorer(this, score(), scoreMode, iterator);
       }
 
+      @Override
+      public int count(LeafReaderContext context) throws IOException {
+        final LeafReader reader = context.reader();
+        FieldInfo fieldInfo = reader.getFieldInfos().fieldInfo(field);
+        if (fieldInfo == null || fieldInfo.getPointDimensionCount() == 0) {

Review comment:
       You also need to take care of the case when there are deletions on the segment, see how `TermWeight` handles it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751226805



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));

Review comment:
       Thanks, this makes sense, but why isn't the test failing without those?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751234461



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    assertSameCount(reader, searcher, "long", numMatchingDocs);
+    assertSameCount(reader, searcher, "string", numMatchingDocs);

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on pull request #445:
URL: https://github.com/apache/lucene/pull/445#issuecomment-973896793


   Thank you @jpountz! Done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz merged pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz merged pull request #445:
URL: https://github.com/apache/lucene/pull/445


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751011311



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,42 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new NumericDocValuesField("f", i));
+        doc.add(new LongPoint("f", i));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    Query testQuery = new DocValuesFieldExistsQuery("f");
+    assertEquals(searcher.count(testQuery), numMatchingDocs);
+    final Weight weight = searcher.createWeight(testQuery, ScoreMode.COMPLETE, 1);
+    assertEquals(weight.count(reader.leaves().get(0)), numMatchingDocs);
+
+    // Test that we can't count in O(1) when there are deleted documents
+    w.deleteDocuments(LongPoint.newRangeQuery("f", 0L, 10L));
+    w.commit();

Review comment:
       FWIW there is no need to commit, the call to `getReader` below will make sure to make changes visible.

##########
File path: lucene/core/src/java/org/apache/lucene/search/DocValuesFieldExistsQuery.java
##########
@@ -74,6 +74,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
         return new ConstantScoreScorer(this, score(), scoreMode, iterator);
       }
 
+      @Override
+      public int count(LeafReaderContext context) throws IOException {
+        final LeafReader reader = context.reader();
+        final FieldInfo fieldInfo = reader.getFieldInfos().fieldInfo(field);
+        if (reader.hasDeletions() || fieldInfo == null || fieldInfo.getPointDimensionCount() == 0) {

Review comment:
       To handle terms, you need to check than `fieldInfo.getIndexOptions()` to know if the field is indexed. And then you can do `reader.terms(field).getDocCount()`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r750955229



##########
File path: lucene/core/src/java/org/apache/lucene/search/DocValuesFieldExistsQuery.java
##########
@@ -74,6 +74,16 @@ public Scorer scorer(LeafReaderContext context) throws IOException {
         return new ConstantScoreScorer(this, score(), scoreMode, iterator);
       }
 
+      @Override
+      public int count(LeafReaderContext context) throws IOException {
+        final LeafReader reader = context.reader();
+        final FieldInfo fieldInfo = reader.getFieldInfos().fieldInfo(field);
+        if (reader.hasDeletions() || fieldInfo == null || fieldInfo.getPointDimensionCount() == 0) {

Review comment:
       This only handles points, right? How can I handle fields, as requested in the JIRA issue?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r750953697



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -17,17 +17,17 @@
 package org.apache.lucene.search;
 
 import java.io.IOException;
-import org.apache.lucene.document.Document;
+import org.apache.lucene.document.*;

Review comment:
       Thanks, I was wondering about this: it was done automatically by Intellij IDEA and I've seen a few star imports in the code base. I reverted it and asked the IDE to stop doing it. It would be nice if spotless prevented that.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r753067107



##########
File path: lucene/CHANGES.txt
##########
@@ -33,7 +33,10 @@ Other
 
 API Changes
 ---------------------
-(No changes)
+
+* LUCENE-10085: Added Weight#count on DocValuesFieldExistsQuery to speed up the query if terms or
+  points are indexed.
+  (Quentin Pradet, Adrien Grand)

Review comment:
       Oops, I modeled this after the original Weight#count, but of course that was an actual new API. Thanks, fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751102338



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,42 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new NumericDocValuesField("f", i));
+        doc.add(new LongPoint("f", i));
+        numMatchingDocs++;
+      }
+      w.addDocument(doc);
+    }
+    w.forceMerge(1);
+
+    DirectoryReader reader = w.getReader();
+    final IndexSearcher searcher = new IndexSearcher(reader);
+
+    Query testQuery = new DocValuesFieldExistsQuery("f");
+    assertEquals(searcher.count(testQuery), numMatchingDocs);
+    final Weight weight = searcher.createWeight(testQuery, ScoreMode.COMPLETE, 1);
+    assertEquals(weight.count(reader.leaves().get(0)), numMatchingDocs);
+
+    // Test that we can't count in O(1) when there are deleted documents
+    w.deleteDocuments(LongPoint.newRangeQuery("f", 0L, 10L));
+    w.commit();

Review comment:
       Thanks, fixed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751606710



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));

Review comment:
       Indeed, with this change the doc values are needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r753016736



##########
File path: lucene/CHANGES.txt
##########
@@ -33,7 +33,10 @@ Other
 
 API Changes
 ---------------------
-(No changes)
+
+* LUCENE-10085: Added Weight#count on DocValuesFieldExistsQuery to speed up the query if terms or
+  points are indexed.
+  (Quentin Pradet, Adrien Grand)

Review comment:
       IMO it belongs more to `New Features`, or possible `Optimizations` since it is not introducing a new API but implementing an existing API on an existing query.

##########
File path: lucene/CHANGES.txt
##########
@@ -33,7 +33,10 @@ Other
 
 API Changes
 ---------------------
-(No changes)
+
+* LUCENE-10085: Added Weight#count on DocValuesFieldExistsQuery to speed up the query if terms or
+  points are indexed.
+  (Quentin Pradet, Adrien Grand)

Review comment:
       IMO it belongs more to `New Features`, or possibly `Optimizations` since it is not introducing a new API but implementing an existing API on an existing query.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
jpountz commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r750644758



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -17,17 +17,17 @@
 package org.apache.lucene.search;
 
 import java.io.IOException;
-import org.apache.lucene.document.Document;
+import org.apache.lucene.document.*;

Review comment:
       We prefer to avoid star imports, which might put more classes than desirable in the local namespace.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on pull request #445:
URL: https://github.com/apache/lucene/pull/445#issuecomment-974322287


   @jpountz Thanks a lot for your help throughout the week! Your help was invaluable, both privately and on GitHub/JIRA. I basically transcribed what you said to me, and it worked. :) I still learned a lot about Lucene in the process. The gradle tooling is great and really easy to use.
   
   :bowing_man: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] pquentin commented on a change in pull request #445: LUCENE-10085: Implement Weight#count on DocValuesFieldExistsQuery

Posted by GitBox <gi...@apache.org>.
pquentin commented on a change in pull request #445:
URL: https://github.com/apache/lucene/pull/445#discussion_r751606710



##########
File path: lucene/core/src/test/org/apache/lucene/search/TestDocValuesFieldExistsQuery.java
##########
@@ -206,6 +210,50 @@ public void testFieldExistsButNoDocsHaveField() throws IOException {
     dir.close();
   }
 
+  public void testQueryMatchesCount() throws IOException {
+    Directory dir = newDirectory();
+    RandomIndexWriter w = new RandomIndexWriter(random(), dir);
+
+    int randomNumDocs = TestUtil.nextInt(random(), 10, 100);
+    int numMatchingDocs = 0;
+
+    for (int i = 0; i < randomNumDocs; i++) {
+      Document doc = new Document();
+      if (random().nextBoolean()) {
+        doc.add(new LongPoint("long", i));
+        doc.add(new StringField("string", "value", Store.NO));

Review comment:
       Indeed, with this change the doc values are now needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org