Posted to java-user@lucene.apache.org by "Wilton, Reece" <Re...@dig.com> on 2003/07/12 01:05:32 UTC

Advice on updating an index?

Hi,

I'm having a bit of trouble figuring out the logic for deleting
documents from an index.  Any advice is appreciated!

1) I created an index with an IndexWriter and then optimized it and
closed it.  Then I opened an IndexReader and deleted each document using
indexReader.delete(new Term("ID", id)).  Then I opened an IndexWriter
again and added the docs.

This worked wonderfully and was fast!  But when I'm updating an index,
I'd prefer to delete the old document and then add the new one.  I don't
want to remove all the docs and then re-add them because people who are
currently searching will get no results.

2) I created an index with an IndexWriter and then optimized it and
closed it.  Then I opened an IndexReader and IndexWriter.  
For each document:
- I delete the document using the IndexReader
- I add the document using the IndexWriter
At the end I close the reader and use the writer to optimize.

This doesn't work. :-( The IndexReader never finds anything to delete.
I presume it's because the IndexWriter has the index open.

3) I created an index with an IndexWriter and then optimized it and
closed it.  Then I opened the index with an IndexWriter.
For each document:
- I create a new IndexReader, delete the document and close the
IndexReader
- I add the document using the IndexWriter
At the end I use the writer to optimize.

This doesn't work. :-( The IndexReader never finds anything to delete.
I presume it's because the IndexWriter has the index open.

4) I created an index with an IndexWriter and then optimized it and
closed it.
For each document:
- I create a new IndexReader, delete the document and close the
IndexReader
- I create a new IndexWriter, add the document and close the IndexWriter
At the end I open the index with an IndexWriter and then optimize it and
close it.

This works!  But it is pretty slow (compared to the other three tests).
Is this the best way of doing this?

BTW, I'm using Lucene 1.3 rc1 on Windows XP with JDK 1.4.2.

Thanks,
Reece

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Advice on updating an index?

Posted by Ype Kingma <yk...@xs4all.nl>.
Reece,

On Friday 11 July 2003 16:05, Wilton, Reece wrote:
> Hi,
>
> I'm having a bit of trouble figuring out the logic for deleting
> documents from an index.  Any advice is appreciated!

<snip 75% of the experiments>

> 4) I created an index with an IndexWriter and then optimized it and
> closed it.
> For each document:
> - I create a new IndexReader, delete the document and close the
> IndexReader
> - I create a new IndexWriter, add the document and close the IndexWriter
> At the end I open the index with an IndexWriter and then optimize it and
> close it.
>
> This works!  But it is pretty slow (compared to the other three tests).
> Is this the best way of doing this?

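AFAIK, yes.
You can speed this up by working on multiple documents at a time, i.e.
use a document set: delete the whole set with one IndexReader, then add
the whole set with one IndexWriter, instead of reopening both for each
document.
Also, you don't need to close the index writer before optimizing.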
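
One reading of the suggestion above, as a minimal sketch. It assumes a toy
in-memory index in place of Lucene: ToyIndex, updateOneByOne, and
updateInBatch are hypothetical names, not Lucene API, and the session
counters only stand in for the cost of opening and closing an IndexReader
or IndexWriter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these names are Lucene classes or methods.
class BatchUpdateSketch {

    /** In-memory stand-in for an index, keyed by document ID. */
    static class ToyIndex {
        final Map<String, String> docs = new LinkedHashMap<>();
        int readerSessions = 0; // times a "reader" was opened and closed
        int writerSessions = 0; // times a "writer" was opened and closed
    }

    /** Per-document update: one reader and one writer session per document. */
    static void updateOneByOne(ToyIndex index, Map<String, String> changed) {
        for (Map.Entry<String, String> e : changed.entrySet()) {
            index.readerSessions++;                    // open reader, delete by ID, close
            index.docs.remove(e.getKey());
            index.writerSessions++;                    // open writer, add document, close
            index.docs.put(e.getKey(), e.getValue());
        }
    }

    /** Batched update: one reader session for all deletes, then one writer session for all adds. */
    static void updateInBatch(ToyIndex index, Map<String, String> changed) {
        index.readerSessions++;                        // open reader once
        for (String id : changed.keySet()) {
            index.docs.remove(id);                     // delete every old version
        }
        index.writerSessions++;                        // open writer once
        index.docs.putAll(changed);                    // add every new version
    }

    public static void main(String[] args) {
        Map<String, String> changed = new LinkedHashMap<>();
        changed.put("1", "new text one");
        changed.put("2", "new text two");
        changed.put("3", "new text three");

        ToyIndex a = new ToyIndex();
        ToyIndex b = new ToyIndex();
        updateOneByOne(a, changed);
        updateInBatch(b, changed);

        System.out.println("one by one: " + a.readerSessions + " reader, "
                + a.writerSessions + " writer sessions");
        System.out.println("batched:    " + b.readerSessions + " reader, "
                + b.writerSessions + " writer sessions");
        System.out.println("same contents: " + a.docs.equals(b.docs));
    }
}
```

For the three changed documents in main, the per-document version counts
three reader and three writer sessions, the batched version counts one of
each, and both end up with the same contents. In a real index the deletes
would go through something like the indexReader.delete(new Term("ID", id))
call quoted earlier in the thread; how large to make each batch is a
judgment call this thread does not settle.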

One variation: you might leave the IndexReader open in case you
need it for searching, but I wouldn't recommend that under Windows,
where an open file cannot be deleted from its directory.
Lucene deletes such files during later optimizations.

Kind regards,
Ype
