You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Ori Schnaps <os...@gmail.com> on 2006/01/24 01:00:27 UTC

performance of implication an index with large number of documents.

Hi,

Apologies if this question has being asked before on this list.

I am working on an application with a Lucene index whose performance
(response time for a query) has started degrading as its size has
increase.

The index is made up of approximately 10 million documents that have
11 fields.  The average document size is less then 1k.  The index has
a total of 13 million terms.  The total index size is about 2.2 gig. 
The index is being updated relatively aggressively.  In a 24hr period
there may be any where from 500k to 3 million updates.

What I have noticed is that as the document number increased from 6
million to 10 million, the response time for a query has continually
increased from 0.5 seconds to ~2+ seconds.

I am using a Java j2sdk1.4.2_08 and Lucene 1.4.3.  The container is
tomcat and the java process is allocated 2gig heap.  The heap is
shared between the Lucene index and the end user application.

Our initial inclination is to pull the index out of the application
and load it fully into memory on a separate box.  Anyone have any
experience with performance for index of this nature and what kind of
request time should be expected from Lucene?

thanks
ori

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org