Posted to java-user@lucene.apache.org by Jürgen Albert <j....@data-in-motion.biz> on 2015/01/16 14:57:02 UTC

forceMerge(1) grows index and does not shrink back

Hi,

because we have constant updates on our index, we can't really close the
index from time to time. Therefore we decided to trigger forceMerge
when the traffic is lowest, to clean up.

On our development laptops (Windows and Linux) it works as expected, but
on the real servers we see some weird behaviour.

Scenario:

We create a fresh index and populate it. This results in an index with a
size of 2 GB. If we trigger forceMerge(1) and a commit() afterwards for
this index, the index grows over the next 10 minutes to 6 GB and does
not shrink back. During the whole process no reader is opened on the index.
If I try the same stunt with the same data on my Windows laptop, it does
nothing at all and finishes after a few ms.

Any ideas?

Technical details:
We use an MMapDirectory, and the server runs Debian 7 (kernel 3.2) in a KVM.
The file system is ext4.

Thx,

Jürgen Albert.

-- 
Jürgen Albert
Managing Director

Data In Motion UG (haftungsbeschränkt)

Kahlaische Str. 4
07745 Jena

Mobile: 0157-72521634
E-Mail: j.albert@datainmotion.de
Web: www.datainmotion.de

XING:   https://www.xing.com/profile/Juergen_Albert5

Legal

Jena HBR 507027
VAT ID: DE274553639
Tax no.: 162/107/04586


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: forceMerge(1) grows index and does not shrink back

Posted by Ian Lea <ia...@gmail.com>.
Unclosed readers can definitely cause problems with index size, by
preventing the deletion of merged-away segments.  lsof can be useful
for diagnosing that.
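
Where lsof is not at hand, a rough in-process check is possible on Linux. The following is only a sketch: the class and method names are made up for illustration, and it assumes the code runs inside the same JVM that holds the index open. It relies on the Linux convention that entries in /proc/self/fd and /proc/self/maps referring to unlinked files end with " (deleted)". With MMapDirectory, mapped files may show up in /proc/self/maps rather than /proc/self/fd, so both are scanned.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, Linux only: lists unlinked files this JVM still holds.
class OpenDeletedFiles {
    private static final String DELETED_SUFFIX = " (deleted)";

    /** Returns the file path if the entry names an unlinked file, else null. */
    static String deletedPath(String entry) {
        int start = entry.indexOf('/');
        if (start < 0 || !entry.endsWith(DELETED_SUFFIX)) {
            return null;
        }
        return entry.substring(start, entry.length() - DELETED_SUFFIX.length());
    }

    /** Scans open descriptors and memory mappings for unlinked files under the given prefix. */
    static Set<String> scan(String indexDirPrefix) throws IOException {
        Set<String> found = new TreeSet<>();
        Path fdDir = Paths.get("/proc/self/fd");
        if (Files.isDirectory(fdDir)) {
            try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
                for (Path fd : fds) {
                    try {
                        collect(Files.readSymbolicLink(fd).toString(), indexDirPrefix, found);
                    } catch (IOException e) {
                        // Descriptor was closed while scanning; skip it.
                    }
                }
            }
        }
        Path maps = Paths.get("/proc/self/maps");
        if (Files.isReadable(maps)) {
            for (String line : Files.readAllLines(maps, StandardCharsets.ISO_8859_1)) {
                collect(line, indexDirPrefix, found);
            }
        }
        return found;
    }

    private static void collect(String entry, String prefix, Set<String> found) {
        String path = deletedPath(entry);
        if (path != null && path.startsWith(prefix)) {
            found.add(path);
        }
    }

    public static void main(String[] args) throws IOException {
        String prefix = args.length > 0 ? args[0] : "/";
        for (String path : scan(prefix)) {
            System.out.println(path);
        }
    }
}
```

A non-empty result for the index directory would suggest that something in the process, for example a reader that was never closed, is still holding on to merged-away segment files.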

As to the rest, I for one have lost track of what problems you've got
with which of your indexes.  I suggest you remove the forceMerge call,
double check for unclosed readers or anything else hanging on to index
files, then post a new message if you've still got problems.


--
Ian.


Re: forceMerge(1) grows index and does not shrink back

Posted by Jürgen Albert <j....@data-in-motion.biz>.
Hi,

On 19.01.2015 at 14:13, Uwe Schindler wrote:
> Hi,
>
>> we use 4.8.1. We know that the javadoc advises against it. As I wrote, the
>> deletion of old documents (that appear during an update) would be done
>> while closing the writer.
> This is not true. The merge policy continuously merges segments that contain deletions. The problem you might have is the following:
> If you call forceMerge(1) for the first time, your index is reduced from a well-distributed multi-segment index to one single, large segment. If you then apply deletes, they are applied against this large segment. Newly added documents are added to new segments. Those new segments are small, so they are merged with preference. The deletions in the huge single segment are very unlikely to be merged away, because Lucene only touches this segment as a last resort. So the problem starts when you call forceMerge for the first time!
>
> If you don't call forceMerge and continuously index, your deletions will be removed quite fast. This is especially true if the deletions are well-distributed over the whole index! There are tons of instances with Elasticsearch and Lucene doing this all the time. They never ever close their writer. Be sure to use TieredMergePolicy (the default), because this one prefers segments that have many deletions. The old LogMergePolicy does not respect deletes and should no longer be used, unless you rely on a specific index order of your documents.
We use the default, which is the TieredMergePolicy as far as I can see.
If what you write is true, I wonder why our index started growing in the
first place. We have 2 indices: the bigger one receives an update
on every document every couple of days, and in the smaller one every
document is updated randomly over a period of roughly 3 minutes. After a
couple of days, the indices had grown to 12 GB each (the bigger one started
with 2 GB and the smaller one with a couple of megabytes). This should
not happen if the MergePolicy works as intended. Can unclosed readers
cause such a problem? We use a SearcherManager to avoid this, but the
possibility always remains.

On the other hand we have the case I initially described. We have a
fresh index that we populate. No reader is opened and no additional
updates have been made. Therefore I see no reason why forceMerge triples
the size of the index at all.


RE: forceMerge(1) grows index and does not shrink back

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

> we use 4.8.1. We know that the javadoc advises against it. As I wrote, the
> deletion of old documents (that appear during an update) would be done
> while closing the writer.

This is not true. The merge policy continuously merges segments that contain deletions. The problem you might have is the following:
If you call forceMerge(1) for the first time, your index is reduced from a well-distributed multi-segment index to one single, large segment. If you then apply deletes, they are applied against this large segment. Newly added documents are added to new segments. Those new segments are small, so they are merged with preference. The deletions in the huge single segment are very unlikely to be merged away, because Lucene only touches this segment as a last resort. So the problem starts when you call forceMerge for the first time!

If you don't call forceMerge and continuously index, your deletions will be removed quite fast. This is especially true if the deletions are well-distributed over the whole index! There are tons of instances with Elasticsearch and Lucene doing this all the time. They never ever close their writer. Be sure to use TieredMergePolicy (the default), because this one prefers segments that have many deletions. The old LogMergePolicy does not respect deletes and should no longer be used, unless you rely on a specific index order of your documents.
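
The effect described above can be illustrated with a toy model. This is not Lucene code and it does not reproduce TieredMergePolicy's selection logic; `Segment`, `update` and `merge` are hypothetical names, and a segment is reduced to two document counts. The sketch only shows that a deleted document keeps occupying space until the segment holding it is rewritten by a merge.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: segments are reduced to counts of live and deleted documents.
class DeleteReclaimToyModel {
    static final class Segment {
        int live;
        int deleted;

        Segment(int live) {
            this.live = live;
        }

        /** Documents still stored in this segment, whether live or marked deleted. */
        int total() {
            return live + deleted;
        }
    }

    /** An update marks the old copy as deleted in place and adds the new copy elsewhere. */
    static void update(Segment holdsOldCopy, Segment receivesNewCopy) {
        holdsOldCopy.live--;
        holdsOldCopy.deleted++;
        receivesNewCopy.live++;
    }

    /** A merge writes a new segment that contains only the live documents of its inputs. */
    static Segment merge(List<Segment> inputs) {
        int live = 0;
        for (Segment s : inputs) {
            live += s.live;
        }
        return new Segment(live);
    }

    public static void main(String[] args) {
        Segment big = new Segment(1000); // stands in for the result of an earlier forceMerge(1)
        Segment fresh = new Segment(0);  // small segment that receives the updated copies
        for (int i = 0; i < 400; i++) {
            update(big, fresh);
        }
        // prints 1400: the 400 deleted copies still sit in the big segment
        System.out.println("stored docs before merge: " + (big.total() + fresh.total()));
        // prints 1000: deleted copies are dropped once the big segment is rewritten
        System.out.println("stored docs after merge:  " + merge(Arrays.asList(big, fresh)).total());
    }
}
```

As long as only the small segments are merged among themselves, the big segment is never rewritten and its deleted copies stay on disk.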

> Unfortunately we can't close the writer and we
> chose the force merge as the alternative with less effort. Could
> forceMergeDeletes serve our purpose here?

It could, but it has the same problem as above. The only difference from forceMerge is that it only merges segments which have deletions.


Re: forceMerge(1) grows index and does not shrink back

Posted by Jürgen Albert <j....@data-in-motion.biz>.
Hi,

we use 4.8.1. We know that the javadoc advises against it. As I wrote,
the deletion of old documents (that appear during an update) would be
done while closing the writer. Unfortunately we can't close the writer
and we chose the force merge as the alternative with less effort. Could
forceMergeDeletes serve our purpose here?

I will take a look into it with lsof, but I'm pretty sure the files
will be held by some Java process.

Jürgen.


Re: forceMerge(1) grows index and does not shrink back

Posted by Ian Lea <ia...@gmail.com>.
Do you need to call forceMerge(1) at all?  The javadoc, certainly for
recent versions of Lucene, advises against it.  What version of Lucene
are you running?

It might be helpful to run lsof against the index directory
before/during/after the merge to see what files are coming or going,
or if there are any marked as deleted but still present.  That would
imply that something, somewhere, was holding on to the files.
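
To put numbers next to the lsof output, the size of the index directory can be logged at the same points in time. A minimal sketch using only the JDK follows; the class and method names are hypothetical, and it assumes the index lives in a single flat directory. It only counts files that are still listed in the directory, so space held by files that were deleted but are still open is not included.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: sums the sizes of the files directly inside a directory.
class IndexDirSize {
    static long sizeInBytes(Path dir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                try {
                    if (Files.isRegularFile(file)) {
                        total += Files.size(file);
                    }
                } catch (NoSuchFileException e) {
                    // File was removed between listing and stat, e.g. by a running merge.
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(indexDir + ": " + sizeInBytes(indexDir) + " bytes");
    }
}
```

Calling `sizeInBytes` before forceMerge(1), periodically while it runs, and after the commit() would show when the growth happens and whether any of it is given back.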


--
Ian.

