Posted to commits@lucene.apache.org by jp...@apache.org on 2022/05/20 08:12:48 UTC

[lucene] branch branch_9x updated: LUCENE-10574: Keep allowing unbalanced merges if they would reclaim lots of deletes. (#905)

This is an automated email from the ASF dual-hosted git repository.

jpountz pushed a commit to branch branch_9x
in repository https://gitbox.apache.org/repos/asf/lucene.git


The following commit(s) were added to refs/heads/branch_9x by this push:
     new e56f7a6336b LUCENE-10574: Keep allowing unbalanced merges if they would reclaim lots of deletes. (#905)
e56f7a6336b is described below

commit e56f7a6336b26adbf2a180a588240ffb75c47ff4
Author: Adrien Grand <jp...@gmail.com>
AuthorDate: Fri May 20 10:06:38 2022 +0200

    LUCENE-10574: Keep allowing unbalanced merges if they would reclaim lots of deletes. (#905)
    
    `TestTieredMergePolicy` caught this special case: if a segment has lots of
    deletes, we should still allow unbalanced merges.
---
 .../core/src/java/org/apache/lucene/index/TieredMergePolicy.java   | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java b/lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java
index d974148b4e4..0a98760b7ef 100644
--- a/lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java
+++ b/lucene/core/src/java/org/apache/lucene/index/TieredMergePolicy.java
@@ -536,11 +536,16 @@ public class TieredMergePolicy extends MergePolicy {
         SegmentSizeAndDocs maxCandidateSegmentSize = segInfosSizes.get(candidate.get(0));
         if (hitTooLarge == false
             && mergeType == MERGE_TYPE.NATURAL
-            && bytesThisMerge < maxCandidateSegmentSize.sizeInBytes * 1.5) {
+            && bytesThisMerge < maxCandidateSegmentSize.sizeInBytes * 1.5
+            && maxCandidateSegmentSize.delCount
+                < maxCandidateSegmentSize.maxDoc * deletesPctAllowed / 100) {
           // Ignore any merge where the resulting segment is not at least 50% larger than the
           // biggest input segment.
           // Otherwise we could run into pathological O(N^2) merging where merges keep rewriting
           // again and again the biggest input segment into a segment that is barely bigger.
+          // The only exception we make is when the merge would reclaim lots of deletes in the
+          // biggest segment. This is important for cases when lots of documents get deleted at once
+          // without introducing new segments of a similar size for instance.
           continue;
         }
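
To make the new condition easier to reason about in isolation, here is a minimal standalone sketch of the skip decision. It is not the code in TieredMergePolicy: the class UnbalancedMergeCheck, the SegmentStats holder, and the method shouldSkipUnbalancedMerge are hypothetical names introduced for illustration, and the hitTooLarge and mergeType checks from the real condition are left out.

```java
// Illustrative sketch only; these names are hypothetical, not Lucene API.
final class UnbalancedMergeCheck {

  /** Hypothetical stand-in for the per-segment stats of the biggest input segment. */
  static final class SegmentStats {
    final long sizeInBytes;
    final int maxDoc;
    final int delCount;

    SegmentStats(long sizeInBytes, int maxDoc, int delCount) {
      this.sizeInBytes = sizeInBytes;
      this.maxDoc = maxDoc;
      this.delCount = delCount;
    }
  }

  /**
   * Returns true when a candidate merge should be ignored: the merged segment
   * would be less than 50% larger than its biggest input, and that input does
   * not carry enough deletes to be worth rewriting on its own.
   */
  static boolean shouldSkipUnbalancedMerge(
      long bytesThisMerge, SegmentStats biggest, double deletesPctAllowed) {
    // Unbalanced: the result is barely bigger than the biggest input segment.
    boolean barelyBigger = bytesThisMerge < biggest.sizeInBytes * 1.5;
    // Few deletes: the biggest segment is below the allowed delete percentage.
    boolean fewDeletes = biggest.delCount < biggest.maxDoc * deletesPctAllowed / 100;
    return barelyBigger && fewDeletes;
  }
}
```

With an assumed deletesPctAllowed of 20, a 1200-byte merge whose biggest input is 1000 bytes is skipped when that input has 10 deleted documents out of 100, but is kept when it has 50 deleted out of 100, since the merge would then reclaim a large share of the segment.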