Posted to commits@lucene.apache.org by jp...@apache.org on 2023/02/27 18:05:32 UTC

[lucene] branch branch_9x updated (8e08036e411 -> 5a32665b106)

This is an automated email from the ASF dual-hosted git repository.

jpountz pushed a change to branch branch_9x
in repository https://gitbox.apache.org/repos/asf/lucene.git


    from 8e08036e411 Minor vector search matching doc optimizations (#12152)
     new ba32642e66a Remove LogMergePolicy's boundary at the floor level. (#12113)
     new 5a32665b106 Lazily resolve ordinals when merging. (#12170)

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../org/apache/lucene/codecs/DocValuesConsumer.java  | 20 ++++++++++----------
 .../java/org/apache/lucene/index/LogMergePolicy.java |  5 -----
 2 files changed, 10 insertions(+), 15 deletions(-)


[lucene] 02/02: Lazily resolve ordinals when merging. (#12170)

Posted by jp...@apache.org.

jpountz pushed a commit to branch branch_9x
in repository https://gitbox.apache.org/repos/asf/lucene.git

commit 5a32665b1062636d707b30cde12f076505f4db51
Author: Adrien Grand <jp...@gmail.com>
AuthorDate: Mon Feb 27 17:38:59 2023 +0100

    Lazily resolve ordinals when merging. (#12170)
    
    The default implementation of merging doc values resolves the ordinal of a
    document in `nextDoc()`. But sometimes, doc values iterators are consumed
    without retrieving ordinals, e.g. to write the set of documents that have a
    value, so this may be wasteful.
    
    With this change, ordinals get resolved lazily upon `ordValue()`.
---
 .../org/apache/lucene/codecs/DocValuesConsumer.java  | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java b/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java
index 8795d52fe1b..4b494a042e2 100644
--- a/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java
+++ b/lucene/core/src/java/org/apache/lucene/codecs/DocValuesConsumer.java
@@ -709,7 +709,7 @@ public abstract class DocValuesConsumer implements Closeable {
 
     return new SortedDocValues() {
       private int docID = -1;
-      private int ord;
+      private SortedDocValuesSub current;
 
       @Override
       public int docID() {
@@ -718,20 +718,20 @@ public abstract class DocValuesConsumer implements Closeable {
 
       @Override
       public int nextDoc() throws IOException {
-        SortedDocValuesSub sub = docIDMerger.next();
-        if (sub == null) {
-          return docID = NO_MORE_DOCS;
+        current = docIDMerger.next();
+        if (current == null) {
+          docID = NO_MORE_DOCS;
+        } else {
+          docID = current.mappedDocID;
         }
-        int subOrd = sub.values.ordValue();
-        assert subOrd != -1;
-        ord = (int) sub.map.get(subOrd);
-        docID = sub.mappedDocID;
         return docID;
       }
 
       @Override
-      public int ordValue() {
-        return ord;
+      public int ordValue() throws IOException {
+        int subOrd = current.values.ordValue();
+        assert subOrd != -1;
+        return (int) current.map.get(subOrd);
       }
 
       @Override
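
To illustrate the difference between the two versions of `nextDoc()` above, here is a minimal, self-contained sketch. It does not use Lucene classes: `CountingSource`, `EagerIterator`, `LazyIterator` and the `lookups` counter are hypothetical names introduced only for this example, and the "ordinal map" is a plain `IntUnaryOperator`. It assumes a consumer that only walks documents (for instance to record which documents have a value) and never asks for ordinals.

```java
import java.util.function.IntUnaryOperator;

public final class LazyOrdinalSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical stand-in for one segment's ordinals plus its ordinal map. */
  static final class CountingSource {
    private final int[] segmentOrds;
    private final IntUnaryOperator ordMap;
    int lookups = 0; // how many times an ordinal was actually resolved

    CountingSource(int[] segmentOrds, IntUnaryOperator ordMap) {
      this.segmentOrds = segmentOrds;
      this.ordMap = ordMap;
    }

    int size() {
      return segmentOrds.length;
    }

    int resolve(int index) {
      lookups++;
      return ordMap.applyAsInt(segmentOrds[index]);
    }
  }

  /** Mirrors the removed code: the ordinal is resolved inside nextDoc(). */
  static final class EagerIterator {
    private final CountingSource source;
    private int index = -1;
    private int ord;

    EagerIterator(CountingSource source) {
      this.source = source;
    }

    int nextDoc() {
      if (index + 1 >= source.size()) {
        return NO_MORE_DOCS;
      }
      index++;
      ord = source.resolve(index); // paid for every document
      return index;
    }

    int ordValue() {
      return ord;
    }
  }

  /** Mirrors the added code: nextDoc() only advances, ordValue() resolves. */
  static final class LazyIterator {
    private final CountingSource source;
    private int index = -1;

    LazyIterator(CountingSource source) {
      this.source = source;
    }

    int nextDoc() {
      if (index + 1 >= source.size()) {
        return NO_MORE_DOCS;
      }
      index++;
      return index;
    }

    int ordValue() {
      return source.resolve(index); // paid only when the caller asks
    }
  }

  public static void main(String[] args) {
    CountingSource eagerSource = new CountingSource(new int[] {2, 0, 1}, ord -> ord + 10);
    CountingSource lazySource = new CountingSource(new int[] {2, 0, 1}, ord -> ord + 10);
    EagerIterator eager = new EagerIterator(eagerSource);
    LazyIterator lazy = new LazyIterator(lazySource);
    // Consume both iterators without ever calling ordValue().
    while (eager.nextDoc() != NO_MORE_DOCS) {}
    while (lazy.nextDoc() != NO_MORE_DOCS) {}
    System.out.println("eager lookups: " + eagerSource.lookups);
    System.out.println("lazy lookups: " + lazySource.lookups);
  }
}
```

In this sketch the eager iterator performs one lookup per document even though no ordinal is ever read, while the lazy iterator performs none. The trade-off, at least in this simplified model, is that the lazy variant repeats the lookup if `ordValue()` is called more than once on the same document.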


[lucene] 01/02: Remove LogMergePolicy's boundary at the floor level. (#12113)

Posted by jp...@apache.org.

jpountz pushed a commit to branch branch_9x
in repository https://gitbox.apache.org/repos/asf/lucene.git

commit ba32642e66a6fac64ee32148e1f145fc1e19946d
Author: Adrien Grand <jp...@gmail.com>
AuthorDate: Mon Feb 27 17:38:34 2023 +0100

    Remove LogMergePolicy's boundary at the floor level. (#12113)
    
    `LogMergePolicy` has this boundary at the floor level that prevents merging
    segments above the minimum segment size with segments below this size. I cannot
    see a benefit from doing this, and no tests fail if I remove it, while this
    boundary has the downside of not running merges that seem legit to me. Should
    we remove this boundary check?
---
 lucene/core/src/java/org/apache/lucene/index/LogMergePolicy.java | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/lucene/core/src/java/org/apache/lucene/index/LogMergePolicy.java b/lucene/core/src/java/org/apache/lucene/index/LogMergePolicy.java
index f78833acded..f7c5011d9c6 100644
--- a/lucene/core/src/java/org/apache/lucene/index/LogMergePolicy.java
+++ b/lucene/core/src/java/org/apache/lucene/index/LogMergePolicy.java
@@ -554,11 +554,6 @@ public abstract class LogMergePolicy extends MergePolicy {
         // With a merge factor of 10, this means that the biggest segment and the smallest segment
         // that take part of a merge have a size difference of at most 5.6x.
         levelBottom = (float) (maxLevel - LEVEL_LOG_SPAN);
-
-        // Force a boundary at the level floor
-        if (levelBottom < levelFloor && maxLevel >= levelFloor) {
-          levelBottom = levelFloor;
-        }
       } else {
         // For segments below the floor size, we allow more unbalanced merges, but still somewhat
         // balanced to avoid running into O(n^2) merging.
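
The effect of the removed check can be sketched in isolation. The snippet below is not the `LogMergePolicy` code itself: `LevelBottomSketch` and its two methods are hypothetical names, only the branch touched by the hunk above is modelled, and `LEVEL_LOG_SPAN` is assumed to be 0.75, which is consistent with the "5.6x" figure in the comment for a merge factor of 10 (10^0.75 is roughly 5.6). Levels are treated as plain floats.

```java
public final class LevelBottomSketch {
  // Assumed value; matches the 5.6x size ratio quoted for a merge factor of 10.
  static final double LEVEL_LOG_SPAN = 0.75;

  /** Behaviour before the change: clamp the bottom of the level to the floor. */
  static float withFloorBoundary(float maxLevel, float levelFloor) {
    float levelBottom = (float) (maxLevel - LEVEL_LOG_SPAN);
    // Force a boundary at the level floor (the check removed by this commit).
    if (levelBottom < levelFloor && maxLevel >= levelFloor) {
      levelBottom = levelFloor;
    }
    return levelBottom;
  }

  /** Behaviour after the change: the level may extend below the floor. */
  static float withoutFloorBoundary(float maxLevel) {
    return (float) (maxLevel - LEVEL_LOG_SPAN);
  }

  public static void main(String[] args) {
    float levelFloor = 6.0f; // hypothetical floor level
    float maxLevel = 6.5f; // hypothetical level of the largest remaining segment
    System.out.println("with boundary: " + withFloorBoundary(maxLevel, levelFloor));
    System.out.println("without boundary: " + withoutFloorBoundary(maxLevel));
  }
}
```

With these hypothetical inputs, the old logic sets `levelBottom` to the floor (6.0), so a segment at level 5.9 could not join a merge with the segment at level 6.5. Without the check, `levelBottom` is 5.75 and that segment becomes eligible, which is the kind of merge the commit message describes as previously being skipped.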