Posted to java-user@lucene.apache.org by aurora <au...@gmail.com> on 2004/12/21 19:32:58 UTC

how often to optimize?

Right now I am incrementally adding about 100 documents to the index a day
and then optimizing after that. I find that optimize essentially rebuilds
the entire index into a single file, so the amount of disk writing is
proportional to the total index size, not to the size of the documents
incrementally added.
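
To put rough numbers on that observation, here is a toy cost model in Java.
It is not Lucene code: it only assumes, as described above, that each
optimize rewrites the whole index, and it counts documents rather than
bytes. The class and method names are made up for illustration.

```java
// Toy cost model, not Lucene code. Assumption: every optimize rewrites the
// entire index, so its write cost equals the current index size (in documents).
final class OptimizeCost {

    /** Total documents rewritten when optimize runs once after each daily batch. */
    static long docsRewrittenWithDailyOptimize(int docsPerDay, int days) {
        long total = 0;
        long indexSize = 0;
        for (int day = 1; day <= days; day++) {
            indexSize += docsPerDay; // the day's incremental additions
            total += indexSize;      // optimize rewrites everything indexed so far
        }
        return total;
    }

    public static void main(String[] args) {
        int docsPerDay = 100;
        for (int days : new int[] {30, 365}) {
            long added = (long) docsPerDay * days;
            long rewritten = docsRewrittenWithDailyOptimize(docsPerDay, days);
            System.out.println(days + " days: " + added + " docs added, "
                    + rewritten + " docs rewritten by daily optimize");
        }
    }
}
```

Under this model the rewrite cost grows with the square of the number of
days: 365 days at 100 documents a day adds 36,500 documents but rewrites
6,679,500.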

So my question is: would it be overkill to optimize every day? Is there
any guideline on how often to optimize? Every 1000 documents or more?
Every week? Is there any concern if a lot of documents are added
without optimizing?

Thanks.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: how often to optimize?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Correct.
The self-maintenance you are referring to is Lucene's periodic segment
merging.  The frequency of that can be controlled through IndexWriter's
mergeFactor.
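
As a rough illustration of why the segment count stays bounded, here is a
toy simulation in Java. It is not Lucene's implementation. It assumes
every added document starts as its own one-document segment and that
whenever the newest mergeFactor segments have the same size they are
merged into one. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of merge-factor-driven segment merging; NOT Lucene's actual code.
// Assumption: each added document becomes a new segment of size 1, and the
// newest `mergeFactor` segments are merged whenever they all have equal size.
final class MergeSimulation {
    private final int mergeFactor;
    private final List<Integer> segments = new ArrayList<>(); // sizes, oldest first

    MergeSimulation(int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        this.mergeFactor = mergeFactor;
    }

    /** Adds one document, then cascades merges until no more are possible. */
    void addDocument() {
        segments.add(1);
        while (tailIsMergeable()) {
            int merged = 0;
            for (int i = 0; i < mergeFactor; i++) {
                merged += segments.remove(segments.size() - 1);
            }
            segments.add(merged);
        }
    }

    /** True when the newest mergeFactor segments all have the same size. */
    private boolean tailIsMergeable() {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        int size = segments.get(n - 1);
        for (int i = n - mergeFactor; i < n; i++) {
            if (segments.get(i) != size) {
                return false;
            }
        }
        return true;
    }

    /** Current segment sizes, oldest first. */
    List<Integer> segments() {
        return new ArrayList<>(segments);
    }

    /** Largest segment count seen (after merges settle) while adding docs. */
    static int peakSegmentCount(int docs, int mergeFactor) {
        MergeSimulation sim = new MergeSimulation(mergeFactor);
        int peak = 0;
        for (int i = 0; i < docs; i++) {
            sim.addDocument();
            peak = Math.max(peak, sim.segments.size());
        }
        return peak;
    }
}
```

In this model the segment count grows only logarithmically with the
number of documents, and a larger mergeFactor means fewer merges but more
segments on disk at any time. A real index stores several files per
segment, so file counts will be higher than segment counts.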

Otis

--- aurora <au...@gmail.com> wrote:

> > Are unoptimized indices causing you any problems (e.g. slow searches,
> > high number of open file handles)?  If no, then you don't even need to
> > optimize until those issues become... issues.
> >
>
> OK, I have changed the process to not call optimize() at all. So far so
> good. The number of files hovers between 10 and 40 during the indexing of
> 10,000 files. It seems Lucene is doing some kind of self-maintenance to
> keep things in order.
>
> Is it right to say optimize() is a totally optional operation? I probably
> got the impression from the IndexHTML example that it is a natural step
> to end an incremental update. Since it rewrites the whole index, it might
> be overkill for many applications to do daily.




Re: how often to optimize?

Posted by aurora <au...@gmail.com>.
> Are unoptimized indices causing you any problems (e.g. slow searches,
> high number of open file handles)?  If no, then you don't even need to
> optimize until those issues become... issues.
>

OK, I have changed the process to not call optimize() at all. So far so
good. The number of files hovers between 10 and 40 during the indexing of
10,000 files. It seems Lucene is doing some kind of self-maintenance to
keep things in order.

Is it right to say optimize() is a totally optional operation? I probably
got the impression from the IndexHTML example that it is a natural step
to end an incremental update. Since it rewrites the whole index, it might
be overkill for many applications to do daily.






Re: how often to optimize?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hello,

I think some of these questions maaaay be answered in the jGuru FAQ.

> So my question is would it be an overkill to optimize everyday?

Optimizing every day is only worthwhile if lots of documents are being
added/deleted and you end up with a lot of index segments.

> Is there any guideline on how often to optimize? Every 1000 documents
> or more?

Are unoptimized indices causing you any problems (e.g. slow searches,
high number of open file handles)?  If no, then you don't even need to
optimize until those issues become... issues.
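
One way to act on this advice is to watch a symptom and optimize only
when it appears. The Java sketch below uses the number of files in the
index directory as a stand-in for file-handle pressure. It uses only
java.io.File; the class name, method names, and threshold are
hypothetical and would need tuning for a real deployment, and the
optimize call itself is left to the caller.

```java
import java.io.File;

// Hypothetical helper: decides whether an optimize pass looks worthwhile,
// using the number of files in the index directory as a rough proxy for
// open-file-handle pressure. The threshold is an assumption, not a Lucene rule.
final class OptimizeAdvisor {

    /** Counts regular files directly inside the index directory. */
    static int countIndexFiles(File indexDir) {
        File[] entries = indexDir.listFiles();
        if (entries == null) {
            throw new IllegalArgumentException("not a readable directory: " + indexDir);
        }
        int count = 0;
        for (File entry : entries) {
            if (entry.isFile()) {
                count++;
            }
        }
        return count;
    }

    /** Suggests optimizing only once the file count exceeds the chosen limit. */
    static boolean shouldOptimize(int fileCount, int maxFiles) {
        return fileCount > maxFiles;
    }
}
```

A nightly job could call shouldOptimize(countIndexFiles(dir), limit) and
run optimize only when it returns true, rather than after every batch.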

> Every week? Is there any concern if there are a lot of documents
> added without optimizing?

Possibly, see my answer above.

Otis

