Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2011/06/19 00:45:46 UTC

Optimize taking two steps and extra disk space

I've noticed something odd in Solr 3.2 when it does an optimize.  One of 
my shards (freshly built via DIH full-import) had 37 segments, totalling 
17.38GB of disk space.  13 of those segments were results of merges 
during initial import, the other 24 were untouched after creation.  
Starting at _0, the final segment before optimizing is _co.  The 
mergefactor on the index is 35, chosen because it makes merged segments 
line up nicely on "z" boundaries.
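For readers puzzled by the "z" remark: Lucene segment names are the index's segment counter rendered in base 36 (digits 0-9, then a-z). With a mergeFactor of 35, the 35 flushed segments take counters 0-34 (_0.._y), and the merge that follows takes counter 35, i.e. _z. A small Python sketch of the naming (the helper is illustrative, not Lucene's actual code):

```python
# Render a segment counter the way Lucene names segment files: base 36.
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def segment_name(counter: int) -> str:
    """Return the base-36 segment name for a given segment counter."""
    if counter == 0:
        return "_0"
    digits = []
    while counter:
        counter, rem = divmod(counter, 36)
        digits.append(DIGITS[rem])
    return "_" + "".join(reversed(digits))

print(segment_name(34))   # _y  (last segment of the first flush batch)
print(segment_name(35))   # _z  (the first merge lands on a "z" boundary)
print(segment_name(456))  # _co (the final pre-optimize segment in this thread)
```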

The optimization process created a _cp segment of 14.4GB, followed by a 
_cq segment at the final 17.27GB size, so at the peak, it took 49GB of 
disk space to hold the index.

Is there any way to make it do the optimize in one pass?  Is there a 
compelling reason why it does it this way?

Thanks,
Shawn


Re: Optimize taking two steps and extra disk space

Posted by Michael McCandless <lu...@mikemccandless.com>.
OK that sounds like a good solution!

You can also have CMS limit how many merges are allowed to run at
once, if your IO system has trouble w/ that much concurrency.
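In solrconfig.xml that might look roughly like the fragment below (a sketch only: whether nested parameters are applied to the scheduler depends on the Solr version, and the values shown are examples, not defaults):

```xml
<mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
  <!-- how many merges may be pending before incoming threads stall -->
  <int name="maxMergeCount">4</int>
  <!-- how many merge threads may actually run at once -->
  <int name="maxThreadCount">1</int>
</mergeScheduler>
```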

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 20, 2011 at 6:29 PM, Shawn Heisey <so...@elyograg.org> wrote:
> On 6/20/2011 3:18 PM, Michael McCandless wrote:
>>
>> With segmentsPerTier at 35 you will easily cross 70 segs in the index...
>> If you want optimize to run in a single merge, I would lower
>> segmentsPerTier and mergeAtOnce (maybe back to the 10 default), and set
>> your maxMergeAtOnceExplicit to 70 or higher...
>>
>> Lower mergeAtOnce means merges run more frequently but for shorter
>> time, and, your searching should be faster (than 35/35) since there
>> are fewer segments to visit.
>
> Thanks again for more detailed information.  There is method to my madness,
> which I will now try to explain.
>
> With a value of 10, the reindex involves enough merges that there are
> many second level merges, and a third-level merge.  I was running into
> situations on my development platform (with its slow disks) where there were
> three merges happening at the same time, which caused all indexing activity
> to cease for several minutes.  This in turn would cause JDBC to time out and
> drop the connection to the database, which caused DIH to fail and roll back
> the entire import about two hours (two-thirds of the way) in.
>
> With a mergeFactor of 35, there are no second level merges, and no
> third-level merges.  I can do a complete reindex successfully even on a
> system with slow disks.
>
> In production, one shard (out of six) is optimized every day to eliminate
> deleted documents.  When I have to reindex everything, I will typically go
> through and manually optimize each shard in turn after it's done.  This is
> the point where I discovered this two-pass problem.
>
> I don't want to do a full-import with optimize=true, because all six large
> shards build at the same time in a Xen environment.  The I/O storm that
> results from three optimizes happening on each host at the same time and
> then replicating to similar Xen hosts is very bad.
>
> I have now set maxMergeAtOnceExplicit to 105.  I think that is probably
> enough, given that I currently do not experience any second level
> merges.  When my index gets big enough, I will increase the ram buffer.  By
> then I will probably have more memory, so the first-level merges can still
> happen entirely from I/O cache.
>
> Shawn
>
>

Re: Optimize taking two steps and extra disk space

Posted by Shawn Heisey <so...@elyograg.org>.
On 6/20/2011 3:18 PM, Michael McCandless wrote:
> With segmentsPerTier at 35 you will easily cross 70 segs in the index...
> If you want optimize to run in a single merge, I would lower
> segmentsPerTier and mergeAtOnce (maybe back to the 10 default), and set
> your maxMergeAtOnceExplicit to 70 or higher...
>
> Lower mergeAtOnce means merges run more frequently but for shorter
> time, and, your searching should be faster (than 35/35) since there
> are fewer segments to visit.

Thanks again for more detailed information.  There is method to my 
madness, which I will now try to explain.

With a value of 10, the reindex involves enough merges that there are 
many second level merges, and a third-level merge.  I was running into 
situations on my development platform (with its slow disks) where there 
were three merges happening at the same time, which caused all indexing 
activity to cease for several minutes.  This in turn would cause JDBC to 
time out and drop the connection to the database, which caused DIH to 
fail and roll back the entire import about two hours (two-thirds of the way) in.

With a mergeFactor of 35, there are no second level merges, and no 
third-level merges.  I can do a complete reindex successfully even on a 
system with slow disks.

In production, one shard (out of six) is optimized every day to 
eliminate deleted documents.  When I have to reindex everything, I will 
typically go through and manually optimize each shard in turn after it's 
done.  This is the point where I discovered this two-pass problem.

I don't want to do a full-import with optimize=true, because all six 
large shards build at the same time in a Xen environment.  The I/O storm 
that results from three optimizes happening on each host at the same 
time and then replicating to similar Xen hosts is very bad.

I have now set maxMergeAtOnceExplicit to 105.  I think that is probably 
enough, given that I currently do not experience any second level 
merges.  When my index gets big enough, I will increase the ram buffer.  
By then I will probably have more memory, so the first-level merges can 
still happen entirely from I/O cache.

Shawn


Re: Optimize taking two steps and extra disk space

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Mon, Jun 20, 2011 at 4:00 PM, Shawn Heisey <so...@elyograg.org> wrote:
> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>
>> Actually, TieredMP has two different params (different from the
>> previous default LogMP):
>>
>>   * segmentsPerTier controls how many segments you can tolerate in the
>> index (bigger number means more segments)
>>
>>   * maxMergeAtOnce says how many segments can be merged at a time for
>> "normal" (not optimize) merging
>>
>> For back-compat, mergeFactor maps to both of these, but it's better to
>> set them directly eg:
>>
>>     <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>>       <int name="maxMergeAtOnce">10</int>
>>       <int name="segmentsPerTier">20</int>
>>     </mergePolicy>
>>
>> (and then remove your mergeFactor setting under indexDefaults)
>>
>> You should always have maxMergeAtOnce <= segmentsPerTier else too much
>> merging will happen.
>>
>> If you set segmentsPerTier to 35 then this can easily exceed 70
>> segments, so your optimize will again need more than one merge.  Note
>> that if you make the maxMergeAtOnce/Explicit too large then 1) you
>> risk running out of file handles (if you don't use compound file), and
>> 2) merge performance likely gets worse as the OS is forced to splinter
>> its IO cache across more files (I suspect) and so more seeking will
>> happen.
>
> Thanks much for the information!
>
> I've set my server up so that the user running the index has a soft limit of
> 4096 files and a hard limit of 6144 files, and /proc/sys/fs/file-max is
> 48409, so I should be OK on file handles.  The index is almost twice as big
> as available memory, so I'm not really worried about the I/O cache.  I've
> sized my mergeFactor and ramBufferSizeMB so that the individual merges during
> indexing happen entirely from the I/O cache, which is the point where I
> really care about it.  There's nothing I can do about the optimize without
> spending a LOT of money.
>
> I will remove mergeFactor, set maxMergeAtOnce and segmentsPerTier to 35, and
> maxMergeAtOnceExplicit to 70.  If I ever run into a situation where it gets
> beyond 70 segments at any one time, I've probably got bigger problems than
> the number of passes my optimize takes, so I'll think about it then. :)
>  Does that sound reasonable?

With segmentsPerTier at 35 you will easily cross 70 segs in the index...

If you want optimize to run in a single merge, I would lower
segmentsPerTier and mergeAtOnce (maybe back to the 10 default), and set
your maxMergeAtOnceExplicit to 70 or higher...

Lower mergeAtOnce means merges run more frequently but for shorter
time, and, your searching should be faster (than 35/35) since there
are fewer segments to visit.
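Expressed as a solrconfig.xml fragment, that suggestion would look something like this (a sketch using the values discussed in this thread, assuming the <mergePolicy> parameters are honored by your Solr version):

```xml
<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
  <!-- high enough that an optimize can merge all segments in one pass -->
  <int name="maxMergeAtOnceExplicit">70</int>
</mergePolicy>
```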

Mike McCandless

http://blog.mikemccandless.com

Re: Optimize taking two steps and extra disk space

Posted by Shawn Heisey <so...@elyograg.org>.
On 6/20/2011 12:31 PM, Michael McCandless wrote:
> Actually, TieredMP has two different params (different from the
> previous default LogMP):
>
>    * segmentsPerTier controls how many segments you can tolerate in the
> index (bigger number means more segments)
>
>    * maxMergeAtOnce says how many segments can be merged at a time for
> "normal" (not optimize) merging
>
> For back-compat, mergeFactor maps to both of these, but it's better to
> set them directly eg:
>
>      <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>        <int name="maxMergeAtOnce">10</int>
>        <int name="segmentsPerTier">20</int>
>      </mergePolicy>
>
> (and then remove your mergeFactor setting under indexDefaults)
>
> You should always have maxMergeAtOnce <= segmentsPerTier else too much
> merging will happen.
>
> If you set segmentsPerTier to 35 then this can easily exceed 70
> segments, so your optimize will again need more than one merge.  Note
> that if you make the maxMergeAtOnce/Explicit too large then 1) you
> risk running out of file handles (if you don't use compound file), and
> 2) merge performance likely gets worse as the OS is forced to splinter
> its IO cache across more files (I suspect) and so more seeking will
> happen.

Thanks much for the information!

I've set my server up so that the user running the index has a soft 
limit of 4096 files and a hard limit of 6144 files, and 
/proc/sys/fs/file-max is 48409, so I should be OK on file handles.  The 
index is almost twice as big as available memory, so I'm not really 
worried about the I/O cache.  I've sized my mergeFactor and 
ramBufferSizeMB so that the individual merges during indexing happen 
entirely from the I/O cache, which is the point where I really care 
about it.  There's nothing I can do about the optimize without spending 
a LOT of money.
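A quick way to sanity-check those limits on the host (a Python sketch; the resource module is Unix-only, and /proc/sys/fs/file-max exists only on Linux):

```python
# Report the per-process open-file limits and, on Linux, the system-wide cap.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")

try:
    with open("/proc/sys/fs/file-max") as f:
        print("system-wide file-max:", f.read().strip())
except FileNotFoundError:
    pass  # not Linux; no /proc interface for this
```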

I will remove mergeFactor, set maxMergeAtOnce and segmentsPerTier to 35, 
and maxMergeAtOnceExplicit to 70.  If I ever run into a situation where 
it gets beyond 70 segments at any one time, I've probably got bigger 
problems than the number of passes my optimize takes, so I'll think 
about it then. :)  Does that sound reasonable?

Shawn


Re: Optimize taking two steps and extra disk space

Posted by Shawn Heisey <so...@elyograg.org>.
On 6/21/2011 9:09 AM, Robert Muir wrote:
> the problem is that before
> https://issues.apache.org/jira/browse/SOLR-2567, Solr invoked the
> TieredMergePolicy "setters" *before* it tried to apply these 'global'
> mergeFactor etc params.
>
> So, even if you set them explicitly inside the<mergePolicy>, they
> would then get clobbered by these 'global' params / defaults / etc.
>
> I fixed this order in SOLR-2567 so that the settings inside the
> <mergePolicy>  *always* take precedence, e.g. they are applied last.
>
> So, I think it might be difficult/impossible to configure this MP with
> 3.2 due to this.

That seems to be confirmed by my infostream.  It's using 
LogByteSizeMergePolicy whether I have mergeFactor configured or not.  
The patch for SOLR-2567 applies with fuzz, but the result won't compile.

Unless I can find a way to patch 3.2 to allow using and configuring TMP, 
I guess I'll just have to live with a two-pass optimize.  It only adds a 
few minutes to the process, and I currently have the disk space 
available, so it's not the end of the world.  I am seeing enough 
improvements coming in 3.3 that I will have to lobby for upgrading to it 
a couple of weeks after it gets released.  It won't come out in time for 
this cycle.

Thanks,
Shawn


Re: Optimize taking two steps and extra disk space

Posted by Robert Muir <rc...@gmail.com>.
the problem is that before
https://issues.apache.org/jira/browse/SOLR-2567, Solr invoked the
TieredMergePolicy "setters" *before* it tried to apply these 'global'
mergeFactor etc params.

So, even if you set them explicitly inside the <mergePolicy>, they
would then get clobbered by these 'global' params / defaults / etc.

I fixed this order in SOLR-2567 so that the settings inside the
<mergePolicy> *always* take precedence, e.g. they are applied last.

So, I think it might be difficult/impossible to configure this MP with
3.2 due to this.

On Tue, Jun 21, 2011 at 10:58 AM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> On Tue, Jun 21, 2011 at 9:42 AM, Shawn Heisey <so...@elyograg.org> wrote:
>> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>>
>>> For back-compat, mergeFactor maps to both of these, but it's better to
>>> set them directly eg:
>>>
>>>     <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>>>       <int name="maxMergeAtOnce">10</int>
>>>       <int name="segmentsPerTier">20</int>
>>>     </mergePolicy>
>>>
>>> (and then remove your mergeFactor setting under indexDefaults)
>>
>> When I did this and ran a reindex, it merged once it reached 10 segments,
>> despite what I had defined in the mergePolicy.  This is Solr 3.2 with the
>> patch from SOLR-1972 applied.  I've included the config snippet below into
>> solrconfig.xml using xinclude via another file.  I had to put mergeFactor
>> back in to make it work right.  I haven't checked yet to see whether an
>> optimize takes one pass.  That will be later today.
>>
>> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>> <int name="maxMergeAtOnce">35</int>
>> <int name="segmentsPerTier">35</int>
>> <int name="maxMergeAtOnceExplicit">105</int>
>> </mergePolicy>
>
> Hmm something strange is going on.
>
> In Solr 3.2, if you attempt to use mergeFactor and useCompoundFile
> inside indexDefaults (and outside the mergePolicy), when your
> mergePolicy is TMP, you should see a warning like this:
>
>  Use of compound file format or mergefactor cannot be configured if
> merge policy is not an instance of LogMergePolicy. The configured
> policy's defaults will be used.
>
> And it shouldn't "work".  But, using the "right" params inside your
> mergePolicy section ought to work (though, I don't think this is well
> tested...).  I'm not sure why you're seeing the opposite of what I'd
> expect...
>
> I wonder if you're actually really getting the TMP?  Can you turn on
> verbose IndexWriter infoStream and post the output?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>

Re: Optimize taking two steps and extra disk space

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, Jun 21, 2011 at 9:42 AM, Shawn Heisey <so...@elyograg.org> wrote:
> On 6/20/2011 12:31 PM, Michael McCandless wrote:
>>
>> For back-compat, mergeFactor maps to both of these, but it's better to
>> set them directly eg:
>>
>>     <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>>       <int name="maxMergeAtOnce">10</int>
>>       <int name="segmentsPerTier">20</int>
>>     </mergePolicy>
>>
>> (and then remove your mergeFactor setting under indexDefaults)
>
> When I did this and ran a reindex, it merged once it reached 10 segments,
> despite what I had defined in the mergePolicy.  This is Solr 3.2 with the
> patch from SOLR-1972 applied.  I've included the config snippet below into
> solrconfig.xml using xinclude via another file.  I had to put mergeFactor
> back in to make it work right.  I haven't checked yet to see whether an
> optimize takes one pass.  That will be later today.
>
> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
> <int name="maxMergeAtOnce">35</int>
> <int name="segmentsPerTier">35</int>
> <int name="maxMergeAtOnceExplicit">105</int>
> </mergePolicy>

Hmm something strange is going on.

In Solr 3.2, if you attempt to use mergeFactor and useCompoundFile
inside indexDefaults (and outside the mergePolicy), when your
mergePolicy is TMP, you should see a warning like this:

  Use of compound file format or mergefactor cannot be configured if
merge policy is not an instance of LogMergePolicy. The configured
policy's defaults will be used.

And it shouldn't "work".  But, using the "right" params inside your
mergePolicy section ought to work (though, I don't think this is well
tested...).  I'm not sure why you're seeing the opposite of what I'd
expect...

I wonder if you're actually really getting the TMP?  Can you turn on
verbose IndexWriter infoStream and post the output?
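In Solr 3.x the infoStream can be turned on from the <indexDefaults> section of solrconfig.xml (the file name is an example):

```xml
<!-- inside <indexDefaults>: write verbose IndexWriter debug output to a file -->
<infoStream file="INFOSTREAM.txt">true</infoStream>
```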

Mike McCandless

http://blog.mikemccandless.com

Re: Optimize taking two steps and extra disk space

Posted by Shawn Heisey <so...@elyograg.org>.
On 6/20/2011 12:31 PM, Michael McCandless wrote:
> For back-compat, mergeFactor maps to both of these, but it's better to
> set them directly eg:
>
>      <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>        <int name="maxMergeAtOnce">10</int>
>        <int name="segmentsPerTier">20</int>
>      </mergePolicy>
>
> (and then remove your mergeFactor setting under indexDefaults)

When I did this and ran a reindex, it merged once it reached 10 
segments, despite what I had defined in the mergePolicy.  This is Solr 
3.2 with the patch from SOLR-1972 applied.  I've included the config 
snippet below into solrconfig.xml using xinclude via another file.  I 
had to put mergeFactor back in to make it work right.  I haven't checked 
yet to see whether an optimize takes one pass.  That will be later today.

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">35</int>
  <int name="segmentsPerTier">35</int>
  <int name="maxMergeAtOnceExplicit">105</int>
</mergePolicy>

Shawn


Re: Optimize taking two steps and extra disk space

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Sun, Jun 19, 2011 at 12:35 PM, Shawn Heisey <so...@elyograg.org> wrote:
> On 6/19/2011 7:32 AM, Michael McCandless wrote:
>>
>> With LogXMergePolicy (the default before 3.2), optimize respects
>> mergeFactor, so it's doing 2 steps because you have 37 segments but 35
>> mergeFactor.
>>
>> With TieredMergePolicy (default on 3.2 and after), there is now a
>> separate merge factor used for optimize (maxMergeAtOnceExplicit)... so
>> you could eg set this factor higher and more often get a single merge
>> for the optimize.
>
> This makes sense.  The default for maxMergeAtOnceExplicit is 30 according to
> LUCENE-854, so it merges the first 30 segments, then it goes back and merges
> the new one plus the other 7 that remain.  To counteract this behavior, I've
> put this in my solrconfig.xml, to test next week.
>
> <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
> <int name="maxMergeAtOnceExplicit">70</int>
> </mergePolicy>
>
> I figure that twice the mergeFactor (35) will likely cover every possible
> outcome.  Is that a correct thought?

Actually, TieredMP has two different params (different from the
previous default LogMP):

  * segmentsPerTier controls how many segments you can tolerate in the
index (bigger number means more segments)

  * maxMergeAtOnce says how many segments can be merged at a time for
"normal" (not optimize) merging

For back-compat, mergeFactor maps to both of these, but it's better to
set them directly eg:

    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">10</int>
      <int name="segmentsPerTier">20</int>
    </mergePolicy>

(and then remove your mergeFactor setting under indexDefaults)

You should always have maxMergeAtOnce <= segmentsPerTier else too much
merging will happen.

If you set segmentsPerTier to 35 then this can easily exceed 70
segments, so your optimize will again need more than one merge.  Note
that if you make the maxMergeAtOnce/Explicit too large then 1) you
risk running out of file handles (if you don't use compound file), and
2) merge performance likely gets worse as the OS is forced to splinter
its IO cache across more files (I suspect) and so more seeking will
happen.

Mike McCandless

http://blog.mikemccandless.com

Re: Optimize taking two steps and extra disk space

Posted by Shawn Heisey <so...@elyograg.org>.
On 6/19/2011 7:32 AM, Michael McCandless wrote:
> With LogXMergePolicy (the default before 3.2), optimize respects
> mergeFactor, so it's doing 2 steps because you have 37 segments but 35
> mergeFactor.
>
> With TieredMergePolicy (default on 3.2 and after), there is now a
> separate merge factor used for optimize (maxMergeAtOnceExplicit)... so
> you could eg set this factor higher and more often get a single merge
> for the optimize.

This makes sense.  The default for maxMergeAtOnceExplicit is 30 
according to LUCENE-854, so it merges the first 30 segments, then it 
goes back and merges the new one plus the other 7 that remain.  To 
counteract this behavior, I've put this in my solrconfig.xml, to test 
next week.

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnceExplicit">70</int>
</mergePolicy>
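The pass-counting behind that reasoning can be modelled in a few lines (a sketch of the arithmetic only, not of Lucene's actual merge selection):

```python
def optimize_passes(segments: int, max_merge_at_once_explicit: int) -> int:
    """Count the merge passes an optimize needs if each pass merges at most
    max_merge_at_once_explicit segments into one new segment."""
    passes = 0
    while segments > 1:
        merged = min(segments, max_merge_at_once_explicit)
        segments = segments - merged + 1  # merged inputs collapse to one output
        passes += 1
    return passes

# 37 segments with the default explicit factor of 30:
# pass 1 merges 30 -> 1 (8 segments remain), pass 2 merges those 8 -> 1.
print(optimize_passes(37, 30))  # 2
# Raising the factor to 70 covers all 37 segments in a single pass.
print(optimize_passes(37, 70))  # 1
```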

I figure that twice the mergeFactor (35) will likely cover every possible 
outcome.  Is that a correct thought?

Thanks,
Shawn


Re: Optimize taking two steps and extra disk space

Posted by Michael McCandless <lu...@mikemccandless.com>.
With LogXMergePolicy (the default before 3.2), optimize respects
mergeFactor, so it's doing 2 steps because you have 37 segments but 35
mergeFactor.

With TieredMergePolicy (default on 3.2 and after), there is now a
separate merge factor used for optimize (maxMergeAtOnceExplicit)... so
you could eg set this factor higher and more often get a single merge
for the optimize.

Mike McCandless

http://blog.mikemccandless.com

On Sat, Jun 18, 2011 at 6:45 PM, Shawn Heisey <so...@elyograg.org> wrote:
> I've noticed something odd in Solr 3.2 when it does an optimize.  One of my
> shards (freshly built via DIH full-import) had 37 segments, totalling
> 17.38GB of disk space.  13 of those segments were results of merges during
> initial import, the other 24 were untouched after creation.  Starting at _0,
> the final segment before optimizing is _co.  The mergefactor on the index is
> 35, chosen because it makes merged segments line up nicely on "z"
> boundaries.
>
> The optimization process created a _cp segment of 14.4GB, followed by a _cq
> segment at the final 17.27GB size, so at the peak, it took 49GB of disk
> space to hold the index.
>
> Is there any way to make it do the optimize in one pass?  Is there a
> compelling reason why it does it this way?
>
> Thanks,
> Shawn
>
>