You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/06/08 09:39:51 UTC

[GitHub] [lucene] jpountz commented on a diff in pull request #935: LUCENE-10599: Improve LogMergePolicy's handling of maxMergeSize.

jpountz commented on code in PR #935:
URL: https://github.com/apache/lucene/pull/935#discussion_r892145811


##########
lucene/core/src/java/org/apache/lucene/index/LogMergePolicy.java:
##########
@@ -568,23 +568,41 @@ public MergeSpecification findMerges(
       // Finally, record all merges that are viable at this level:
       int end = start + mergeFactor;
       while (end <= 1 + upto) {
-        boolean anyTooLarge = false;
         boolean anyMerging = false;
+        long mergeSize = 0;
+        long mergeDocs = 0;
         for (int i = start; i < end; i++) {
           final SegmentInfoAndLevel segLevel = levels.get(i);
           final SegmentCommitInfo info = segLevel.info;
-          anyTooLarge |=
-              (size(info, mergeContext) >= maxMergeSize
-                  || sizeDocs(info, mergeContext) >= maxMergeDocs);
           if (mergingSegments.contains(info)) {
             anyMerging = true;
             break;
           }
+          long segmentSize = size(info, mergeContext);
+          long segmentDocs = sizeDocs(info, mergeContext);
+          if (mergeSize + segmentSize > maxMergeSize || mergeDocs + segmentDocs > maxMergeDocs) {
+            // This merge is full, stop adding more segments to it
+            if (i == start) {
+              // This segment alone is too large, return a singleton merge
+              if (verbose(mergeContext)) {
+                message(
+                    "    " + i + " is larger than the max merge size/docs; ignoring", mergeContext);
+              }
+              end = i + 1;
+            } else {
+              // Previous segments are under the max merge size, return them
+              end = i;
+            }
+            break;
+          }
+          mergeSize += segmentSize;
+          mergeDocs += segmentDocs;
         }
 
-        if (anyMerging) {
-          // skip
-        } else if (!anyTooLarge) {
+        if (anyMerging || end - start <= 1) {
+          // skip: there is an ongoing merge at the current level or the computed merge has a single
+          // segment and this merge policy doesn't do singleton merges
+        } else {

Review Comment:
   Correct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org