You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by sol myr <so...@gmail.com> on 2011/06/15 14:59:50 UTC

Large index merging/optimization?

Hi,

Our Lucene index grew to about 4 GB .
Unfortunately it brought up a performance problem of slow file merging.
We have:
1. A writer thread: once an Hour it looks for modified documents, and
updates the Lucene index.
Usually there are only few modifications, but sometimes we switch the
entire content and re-index everything.

2. The default Lucene Merge thread (ConcurrentMergeScheduler)

Usually it works great. But every several hours the
'ConcurrentMergeScheduler' thread gets stuck (for hours - I'm guessing
it got to the point where it needs to merge large files).
During this, our Writer thread is stuck (waiting on a lock), so users
will see stale data.

My questions please:

1. Is there any configuration that would either speed up file merging,
or allow IndexWriter to write simultaneously?

2. And when do I call 'optimize'?
Won't it be another very operation, that holds the 'write' lock and
prevents updates?

Thanks:)

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Large index merging/optimization?

Posted by Ian Lea <ia...@gmail.com>.
Waits of several hours on a 4Gb index sounds very unlikely.  Are you
sure there isn't something else going on that is blocking things?
What version of lucene?  Decent, error-free, hardware?

As for optimize, I'd skip it altogether, or schedule it occasionally
when there is no or low activity on the index.


--
Ian.


On Wed, Jun 15, 2011 at 1:59 PM, sol myr <so...@gmail.com> wrote:
> Hi,
>
> Our Lucene index grew to about 4 GB .
> Unfortunately it brought up a performance problem of slow file merging.
> We have:
> 1. A writer thread: once an Hour it looks for modified documents, and
> updates the Lucene index.
> Usually there are only few modifications, but sometimes we switch the
> entire content and re-index everything.
>
> 2. The default Lucene Merge thread (ConcurrentMergeScheduler)
>
> Usually it works great. But every several hours the
> 'ConcurrentMergeScheduler' thread gets stuck (for hours - I'm guessing
> it got to the point where it needs to merge large files).
> During this, our Writer thread is stuck (waiting on a lock), so users
> will see stale data.
>
> My questions please:
>
> 1. Is there any configuration that would either speed up file merging,
> or allow IndexWriter to write simultaneously?
>
> 2. And when do I call 'optimize'?
> Won't it be another very operation, that holds the 'write' lock and
> prevents updates?
>
> Thanks:)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org