Posted to java-user@lucene.apache.org by Volodymyr Bychkoviak <vb...@i-hypergrid.com> on 2005/03/01 18:19:24 UTC
Large Index managing
Hi,
Just an idea for how to manage a large index that is updated very often.
Very often there is a need to update a document in the index. To update a
document you have to delete the old document from the index and then add
the new one. In most cases this requires you to open an IndexReader, delete
the document, close the IndexReader, create an IndexWriter, add the document,
close the IndexWriter, and re-open the IndexSearcher (if the index is searched
heavily).
Profiling some applications, I found that most of the time is spent in
the IndexReader.open() method. It also creates many objects, so it adds
GC overhead as well.
The idea for optimizing this process is to use two indexes: one main index
that can be very large, and a second index that serves as a "change
buffer". We can keep one IndexReader open on the main index (and use
it for searching and for deleting old documents). The second index is small,
so we can reopen its IndexReader as often as needed.
When the second index reaches some number of documents, we can merge it into
the main index.
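A minimal sketch of the bookkeeping this implies, written as a plain Java model rather than real Lucene calls. The class BufferedIndex, its methods, and the merge threshold are made up for illustration; HashMaps stand in for the two indexes, and removing from the main map stands in for deleting through the long-lived IndexReader.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical model of "large main index + small change buffer".
 * This is NOT Lucene API; maps stand in for the two indexes.
 */
final class BufferedIndex {
    private final Map<String, String> main = new HashMap<>();   // large, rarely reopened
    private final Map<String, String> buffer = new HashMap<>(); // small, reopened often
    private final int mergeThreshold; // assumed tuning knob, not from the post
    private int mergeCount = 0;

    BufferedIndex(int mergeThreshold) {
        if (mergeThreshold < 1) {
            throw new IllegalArgumentException("mergeThreshold must be >= 1");
        }
        this.mergeThreshold = mergeThreshold;
    }

    /** Update = delete the old version from the main index, add the new one to the buffer. */
    void update(String id, String text) {
        main.remove(id);       // models the delete done via the open main reader
        buffer.put(id, text);  // also replaces an older buffered version of the same id
        if (buffer.size() >= mergeThreshold) {
            merge();
        }
    }

    /** Fold the buffer into the main index and start with an empty buffer. */
    private void merge() {
        main.putAll(buffer);
        buffer.clear();
        mergeCount++;
    }

    /** The newest version lives in the buffer, otherwise fall back to the main index. */
    String get(String id) {
        String hit = buffer.get(id);
        return hit != null ? hit : main.get(id);
    }

    int mainSize()   { return main.size(); }
    int bufferSize() { return buffer.size(); }
    int mergeCount() { return mergeCount; }
}
```

With a threshold of 3, for example, the main index is touched by a merge only once for every three distinct documents that change; all other updates only touch the small buffer.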
To search this "multi" index we could use a MultiSearcher over these two
indexes, with a little trick: the first IndexSearcher is kept the same the
whole time until the second index is merged into the main one, and the second
IndexSearcher is reopened whenever the second index changes.
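The searcher trick can be modelled the same way. The sketch below is again plain Java, not Lucene: TwoSearcherModel and all of its members are hypothetical names, a "view" is a copy of a map taken at open time (standing in for a searcher over a reader opened at that moment), and the counters only exist to show how rarely the expensive main view is reopened. It assumes that deletes made through the long-lived main reader are immediately hidden from searches on that reader.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model: a long-lived main view plus a frequently reopened buffer view. */
final class TwoSearcherModel {
    private final Map<String, String> mainDocs = new HashMap<>();
    private final Map<String, String> bufferDocs = new HashMap<>();
    private final Set<String> deletedInMain = new HashSet<>(); // deletes seen by the open main view
    private Map<String, String> mainView;   // expensive to open, kept until the next merge
    private Map<String, String> bufferView; // cheap to open, reopened on every change
    private int mainOpens = 0;
    private int bufferOpens = 0;

    TwoSearcherModel(Map<String, String> initialMainDocs) {
        mainDocs.putAll(initialMainDocs);
        reopenMain();
        reopenBuffer();
    }

    private void reopenMain() {
        mainView = new HashMap<>(mainDocs);
        deletedInMain.clear(); // a fresh view no longer contains the deleted docs
        mainOpens++;
    }

    private void reopenBuffer() {
        bufferView = new HashMap<>(bufferDocs);
        bufferOpens++;
    }

    /** Delete from the main index without reopening it; only the small view is reopened. */
    void update(String id, String text) {
        if (mainDocs.remove(id) != null) {
            deletedInMain.add(id);
        }
        bufferDocs.put(id, text);
        reopenBuffer();
    }

    /** Merge the buffer into the main index; this is the only time the main view is reopened. */
    void merge() {
        mainDocs.putAll(bufferDocs);
        bufferDocs.clear();
        reopenMain();
        reopenBuffer();
    }

    /** Search both views, like a multi-searcher, and return matching ids in sorted order. */
    List<String> search(String word) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : mainView.entrySet()) {
            if (!deletedInMain.contains(e.getKey()) && e.getValue().contains(word)) {
                hits.add(e.getKey());
            }
        }
        for (Map.Entry<String, String> e : bufferView.entrySet()) {
            if (e.getValue().contains(word)) {
                hits.add(e.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }

    int mainOpens()   { return mainOpens; }
    int bufferOpens() { return bufferOpens; }
}
```

In this model any number of updates between two merges costs one open of the main view, while the buffer view is reopened once per update.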
It is just an idea. (It has not been tested.)
Would it help to improve the speed of updating a large index and lower the
memory overhead?
Any comments?
Regards,
Volodymyr Bychkoviak