Posted to solr-user@lucene.apache.org by Zheng Lin Edwin Yeo <ed...@gmail.com> on 2019/01/28 17:14:47 UTC
Number of segments in collection is more than what is set in TieredMergePolicyFactory
Hi,
We have the following TieredMergePolicyFactory configuration in our
solrconfig.xml:
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
<int name="maxMergeAtOnce">10</int>
<int name="maxMergeAtOnceExplicit">10</int>
<int name="segmentsPerTier">10</int>
<int name="floorSegmentMB">10</int>
<int name="maxMergedSegmentMB">5120</int>
<double name="noCFSRatio">0.1</double>
<int name="maxCFSSegmentSizeMB">2048</int>
<double name="forceMergeDeletesPctAllowed">10.0</double>
</mergePolicyFactory>
However, when we index data into the collection, the number of segments we
get does not match what we configured.
For example, our collection size is 13.7 GB. With the above
TieredMergePolicyFactory configuration, we would expect to have 3 segments
(since 13.7 / 5 = 2.74, which rounds up to 3). Instead, we are getting 24
segments in our collection, as shown in the screenshot at the link below.
https://drive.google.com/file/d/1hjIQVk_L2Bn9MYOmCdf2wKD_f_D2DNV6/view?usp=sharing
What could be the reason that the segments are not merged down to 3,
each about 5 GB in size?
Regards,
Edwin
Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory
Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi Shawn,
Thank you for the explanation.
Regards,
Edwin
On Wed, 30 Jan 2019 at 15:18, Shawn Heisey <ap...@elyograg.org> wrote:
> On 1/28/2019 10:14 AM, Zheng Lin Edwin Yeo wrote:
> > We have the following TieredMergePolicyFactory configuration in our
> > solrconfig,xml
> >
> > <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
> > <int name="maxMergeAtOnce">10</int>
> > <int name="maxMergeAtOnceExplicit">10</int>
> > <int name="segmentsPerTier">10</int>
>
> These three settings are the really important ones. Except for
> maxMergeAtOnceExplicit, you have these at the default settings. The
> default for maxMergeAtOnceExplicit is 30 ... and you shouldn't lower it
> without a really good reason. It mostly comes into play during an
> optimize ... when you lower it, optimizes may take longer than normal.
> It won't be able to merge as many segments at the same time, so the
> number of passes required to complete the optimize could increase.
>
> The most important setting here is segmentsPerTier ... this does not
> mean you will never have more than 10 total segments, it means that at
> each tier, Lucene will try to keep the number of segments below 10.
> With a large index, you are likely to have 3 or 4 tiers, possibly more.
>
> On an index where I spent a lot of time, my settings were, respective to
> yours, 35, 105, and 35. I often had more than 100 segments in those
> indexes. It was behaving correctly.
>
> > What could be the reason that it is not able to merge the segments to 3,
> > with each of the segment size to be 5 GB?
>
> It is working as designed, just not as you expected.
>
> Thanks,
> Shawn
>
Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory
Posted by Shawn Heisey <ap...@elyograg.org>.
On 1/28/2019 10:14 AM, Zheng Lin Edwin Yeo wrote:
> We have the following TieredMergePolicyFactory configuration in our
> solrconfig,xml
>
> <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
> <int name="maxMergeAtOnce">10</int>
> <int name="maxMergeAtOnceExplicit">10</int>
> <int name="segmentsPerTier">10</int>
These three settings are the really important ones. Except for
maxMergeAtOnceExplicit, you have these at the default settings. The
default for maxMergeAtOnceExplicit is 30 ... and you shouldn't lower it
without a really good reason. It mostly comes into play during an
optimize ... when you lower it, optimizes may take longer than normal.
It won't be able to merge as many segments at the same time, so the
number of passes required to complete the optimize could increase.
The most important setting here is segmentsPerTier ... this does not
mean you will never have more than 10 total segments, it means that at
each tier, Lucene will try to keep the number of segments below 10.
With a large index, you are likely to have 3 or 4 tiers, possibly more.
On an index where I spent a lot of time, my settings were, respective to
yours, 35, 105, and 35. I often had more than 100 segments in those
indexes. It was behaving correctly.
> What could be the reason that it is not able to merge the segments to 3,
> with each of the segment size to be 5 GB?
It is working as designed, just not as you expected.
Thanks,
Shawn
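[Editorial illustration, not part of Shawn's message.] The tiering described above can be made concrete with a rough calculation. The sketch below is a simplified model of how a tiered merge policy might budget segments per tier; it is not Lucene's actual code. The function name and the assumption that the smallest tier starts at floorSegmentMB are illustrative only. Under those assumptions, a 13.7 GB index with the settings from this thread gets a budget of roughly 31 segments, so 24 segments would be within what the policy tolerates.

```python
import math


def allowed_segment_count(total_mb, floor_mb=10, segments_per_tier=10,
                          max_merge_at_once=10, max_merged_mb=5120):
    """Estimate how many segments a tiered policy tolerates before merging.

    Hypothetical helper: a simplified model, assuming the smallest tier
    starts at floor_mb and each higher tier is max_merge_at_once times
    larger, capped at max_merged_mb. Requires max_merge_at_once > 1.
    """
    level_mb = floor_mb          # segment size of the current tier
    remaining_mb = total_mb      # index size not yet assigned to a tier
    allowed = 0
    while True:
        count_at_level = remaining_mb / level_mb
        # Last tier: either the rest fits in fewer than segments_per_tier
        # segments, or the tier size has reached the configured maximum.
        if count_at_level < segments_per_tier or level_mb == max_merged_mb:
            return allowed + math.ceil(count_at_level)
        # Otherwise this tier may hold up to segments_per_tier segments.
        allowed += segments_per_tier
        remaining_mb -= segments_per_tier * level_mb
        level_mb = min(max_merged_mb, level_mb * max_merge_at_once)


# 13.7 GB index with the settings quoted in this thread.
print(allowed_segment_count(13.7 * 1024))  # prints 31
```

In this model the tiers are 10 MB, 100 MB, 1000 MB and 5120 MB. The first three may each hold up to 10 segments, and the remainder fits in one segment at the largest tier, which is why the budget is far above 13.7 / 5.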
Re: Number of segments in collection is more than what is set in TieredMergePolicyFactory
Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi,
Does anyone have any insights on this?
Thank you in advance.
Regards,
Edwin