You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by David Sitsky <si...@nuix.com.au> on 2004/05/10 09:41:55 UTC

Memory requirements for optimize() on compound index high?

Hi,

I am working on an application which uses Lucene 1.3 Final which uses the 
compound index format on Win32 Sun JVM 1.4.1_02.  I have set 
maxFieldLength for the index writer to 1,000,000, as often I have to index 
potentially very large documents, which contain information that must be 
indexed.

All other index writer parameters have their default values.  The 
application loads all documents in a batch phase, and then allows the user 
to perform searchers.  Typically, no new documents are added afterwards.

Given the large size set for maxFieldLength, I have allocated 512MB of 
memory to the JVM.  For indexing 1,000,000 complex documents, with 
potentially around 30 fields each, this seems to work fine.

I have noticed that when performing an optimize() on this index at the end 
of a batch load, the memory requirements seem to be much higher.  I was 
receiving OutOfMemoryErrors for a 512MB JVM.  I increased the JVM size to 
1 GIG, and the optimize operation completed successfully.

Task manager reported a peak VM size of 810MB during the optimize() 
operation, from a newly-created JVM.  FWIW, the final index size was 11 
gigabytes - most document fields are stored in the index.

Do people have similar experiences to this when calling optimize() on a 
compound index?

Are there any ways I can reduce the amount of memory required, apart from 
making maxFieldLength smaller?

Are there any way of determining in advance the kind of memory requirements 
optimize() will require?  Its highly undesirable to receive 
OutOfMemoryErrors during optimize().  I guess the user can still search on 
an unoptimized index which is better than nothing...

-- 
Cheers,
David

This message is intended only for the named recipient.  If you are not the 
intended recipient you are notified that disclosing, copying, distributing 
or taking any action  in reliance on the contents of this information is 
strictly prohibited.


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org