Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2010/10/15 14:46:32 UTC

[jira] Commented: (LUCENE-2701) Factor maxMergeSize into findMergesForOptimize in LogMergePolicy

    [ https://issues.apache.org/jira/browse/LUCENE-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921329#action_12921329 ] 

Michael McCandless commented on LUCENE-2701:
--------------------------------------------

Patch looks good!

Maybe rename OneMerge.totalSize -> totalSizeInBytes?  Hmm, does anyone
actually call this new method?

Maybe note somewhere that now optimize (when there's a maxMergeDocs/MB
constraint) is able to merge fewer than mergeFactor segments at a
time?

This code is a bit confusing:

{noformat}
       if (last - start - 1 > 1) {
         // there is more than 1 segment to the right of this one.
         spec.add(new OneMerge(infos.range(start + 1, last), useCompoundFile));
       } else if (start != last - 1 && !isOptimized(infos.info(start + 1))) {
         spec.add(new OneMerge(infos.range(start + 1, last), useCompoundFile));
       }
{noformat}

Both if clauses do the same thing, right?  (I.e., they merge the chunk
of segs to the right.)  Maybe add a comment explaining the 2nd one?  (I
think it's for the case where there's exactly 1 segment to our right but
it's not optimized, e.g. its CFS setting differs?)  Or maybe consolidate
into a single if?
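
To make the "single if" idea concrete, here is a rough sketch, not the
actual patch. It reduces both branches to a boolean "should we merge the
segments to the right?" and assumes start < last. The class and method
names are hypothetical, and the IntPredicate is only a stand-in for
isOptimized(infos.info(i)):

```java
import java.util.function.IntPredicate;

// Hypothetical stand-alone sketch; not part of LogMergePolicy.
class MergeRightSketch {

    // The two-branch form from the patch, reduced to a boolean decision.
    static boolean originalForm(int start, int last, IntPredicate isOptimized) {
        if (last - start - 1 > 1) {
            // there is more than 1 segment to the right of this one.
            return true;
        } else if (start != last - 1 && !isOptimized.test(start + 1)) {
            return true;
        }
        return false;
    }

    // Consolidated form. Assumes start < last, so segsToRight >= 0.
    static boolean consolidated(int start, int last, IntPredicate isOptimized) {
        int segsToRight = last - start - 1;
        // Merge if there are several segments to the right, or exactly one
        // that is not yet optimized.
        return segsToRight > 1
            || (segsToRight == 1 && !isOptimized.test(start + 1));
    }

    public static void main(String[] args) {
        IntPredicate noneOptimized = i -> false;
        // Compare both forms over a small grid of (start, last) pairs.
        for (int start = 0; start < 3; start++) {
            for (int last = start + 1; last <= start + 3; last++) {
                System.out.println("start=" + start + " last=" + last
                    + " original=" + originalForm(start, last, noneOptimized)
                    + " consolidated=" + consolidated(start, last, noneOptimized));
            }
        }
    }
}
```

In the real code the body of the single if would be the one
spec.add(new OneMerge(infos.range(start + 1, last), useCompoundFile)) call.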


> Factor maxMergeSize into findMergesForOptimize in LogMergePolicy
> ----------------------------------------------------------------
>
>                 Key: LUCENE-2701
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2701
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1, 4.0
>
>         Attachments: LUCENE-2701.patch, LUCENE-2701.patch
>
>
> LogMergePolicy allows you to specify a maxMergeSize in MB, which is taken into consideration in regular merges, yet ignored by findMergesForOptimize. I think it'd be good if we take that into consideration even when optimizing. This will allow the caller to specify two constraints: maxNumSegments and maxMergeMB. Obviously both may not be satisfied, and therefore we will guarantee that if there is any segment above the threshold, the threshold constraint takes precedence and therefore you may end up w/ <maxNumSegments (if it's not 1) after optimize. Otherwise, maxNumSegments is taken into consideration.
> As part of this change, I plan to change some methods to protected (from private) and members as well. I realized that if one wishes to implement his own LMP extension, he needs to either put it under o.a.l.index or copy some code over to his impl.
> I'll attach a patch shortly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

