You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Chuck Williams <ch...@allthingslocal.com> on 2005/04/05 05:53:54 UTC
[Fwd: Re: Time taken in Indexing when the index is already huge]
Goel, Nikhil writes (4/4/2005 7:14 PM):
>Hi,
>
>
>
>I have been using lucene-1.3.jar for quite some time and we are using another library to store the index in DB.
>
>When we started indexing the writer.optimize used to take in the range of 600-800 milliseconds to return but now our index has grown to huge proportion and its around 10 MB hence the writer.optimize is taking around 30-40 seconds and it is not acceptable for our solution. I put the timings on writer.optimize() and it's the one which takes most of this time.
>
>
>
>So I am just wondering if someone is facing the same problem in indexing the data when the index is already huge or is there another way to manage such huge index.
>
>
>
>Here is the simple code which we use to index the data.
>
>IndexWriter writer = new IndexWriter(dbDirectory, new StandardAnalyzer(), false); //Create an indexwriter
>
>writer.addDocument(doc); //doc is of type org.apache.lucene.document.Document...
>
>writer.optimize(); //optimize is called on indexwriter..This is the one which takes most of the time and is responsible for the delay.
>
>writer.close(); // indexwriter is closed
>
>
Does this code imply you are optimizing after every new document is
indexed? 10MB is actually a pretty small index. Depending on your
inflow of documents, you should be able to optimize maybe once a day,
during your application's least busy period. Your IndexSearcher can
still search your documents effectively while the index is unoptimized.
As a first step, try not optimizing at all.
Chuck
>
>
>
>
>The time taken by optimize call grows a lot when the index is of larger size. I tried to look it up on Erik Hatcher and Otis Gospodnetić <http://www.manning.com/hatcher2#author#author> book too but everywhere it says Lucene is quite scalable and don't have trouble in indexing even with huge data. Can anyone please provide some insight into this?
>
>
>
>Thanks.
>
>Nikhil
>
>
>
>
>
>
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org