Posted to java-user@lucene.apache.org by Mark Florence <mf...@airsmail.com> on 2004/07/20 19:52:03 UTC

Very slow IndexReader.open() performance

Hi -- We have a large index (~4m documents, ~14gb) that we haven't been
able to optimize for some time, because the JVM throws OutOfMemoryError
after heap usage climbs to the maximum we can give it, 2gb.

In fact, the OutOfMemory condition occurred most recently during a segment
merge operation. maxMergeDocs was set to the default, and we seem to have
worked around the problem by setting it to a lower value, currently
100,000. The index is highly interactive, so I took the hint from earlier
posts and set it to this value.
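
For reference, the writer is now set up roughly like this (the index path
is a placeholder, and I'm assuming the Lucene 1.4 API, where maxMergeDocs
is a public field on IndexWriter):

    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class WriterConfig {
        static IndexWriter openWriter() throws IOException {
            // false = append to the existing index, don't create a new one
            IndexWriter writer = new IndexWriter("/path/to/index",
                                                 new StandardAnalyzer(), false);
            writer.maxMergeDocs = 100000; // keep segments small enough to merge in 2gb
            return writer;
        }
    }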

Good news! No more OutOfMemory conditions.

Bad news: now, calling IndexReader.open() is taking 20+ seconds, and it 
is killing performance.

I followed the design pattern in another earlier post from Doug. I take a
batch of deletes, open an IndexReader, perform the deletes, then close it.
Then I take a batch of adds, open an IndexWriter, perform the adds, then
close it. Then I get a new IndexSearcher for searching.
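
Concretely, each cycle looks roughly like this (the path, deleteTerms and
newDocs are placeholders for our real values; Lucene 1.4 APIs):

    void updateCycle(Term[] deleteTerms, Document[] newDocs) throws IOException {
        // batch of deletes, by key Term
        IndexReader reader = IndexReader.open("/path/to/index");
        for (int i = 0; i < deleteTerms.length; i++) {
            reader.delete(deleteTerms[i]);
        }
        reader.close();

        // batch of adds
        IndexWriter writer = new IndexWriter("/path/to/index",
                                             new StandardAnalyzer(), false);
        for (int i = 0; i < newDocs.length; i++) {
            writer.addDocument(newDocs[i]);
        }
        writer.close();

        // the open buried in here is what now takes 20+ seconds
        IndexSearcher searcher = new IndexSearcher("/path/to/index");
    }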

But because the index is so interactive, this sequence repeats itself all
the time. 

My question is: is there a better way? Performance was fine when I could
optimize. Can I hold onto a singleton IndexReader/IndexWriter/IndexSearcher
to avoid the overhead of the open?
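
In other words, something like this hypothetical holder, where the searcher
is replaced only after an update cycle completes rather than opened for
every search:

    private IndexSearcher searcher; // long-lived, shared by all searches

    synchronized IndexSearcher getSearcher() throws IOException {
        if (searcher == null) {
            searcher = new IndexSearcher("/path/to/index");
        }
        return searcher;
    }

    // called once per update cycle, not once per search
    synchronized void refreshSearcher() throws IOException {
        if (searcher != null) {
            searcher.close(); // is closing under in-flight searches safe?
        }
        searcher = new IndexSearcher("/path/to/index");
    }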

Any help would be most gratefully received.

Mark Florence, CTO, AIRS
mflorence@airsmail.com
800-897-7714x1703




Re: Very slow IndexReader.open() performance

Posted by Doug Cutting <cu...@apache.org>.
Optimization should not require huge amounts of memory.  Can you tell us a
bit more about your configuration:  What JVM?  What OS?  How many fields?
What mergeFactor have you used?

Also, please attach the output of 'ls -l' on your index directory, as
well as the stack trace you see when OutOfMemory is thrown.

Thanks,

Doug
