Posted to java-user@lucene.apache.org by Lucifer Hammer <lu...@gmail.com> on 2007/05/14 22:03:32 UTC

IndexReader.deleteDocument(int docid) equivalent in 2.1 IndexWriter?

I noticed that the API for Lucene 2.1+ includes a deleteDocuments(Term)
method in the IndexWriter.  I'd love to be able to change my application to
use it (we're constantly updating docs, which means opening/closing the
writer/reader each time we update a doc). I use complex queries to determine
which docs to delete.  Pre-2.1, I've just executed a search and then looped
through hits.id() as the parameter to the IndexReader.deleteDocument().  I'm
trying to figure out how to do the same for Lucene 2.1.  Is there a way to
specify the internal docid as a term, so that I can call
IndexWriter.deleteDocuments(internalId)?  Otherwise, I was thinking of using
a hit collector to fetch just our external UID.

Thanks for any guidance.

L

Re: IndexReader.deleteDocument(int docid) equivalent in 2.1 IndexWriter?

Posted by Doron Cohen <DO...@il.ibm.com>.
"Lucifer Hammer" <lu...@gmail.com> wrote on 14/05/2007 13:03:32:

> I noticed that the API for Lucene 2.1+ includes a deleteDocuments(Term)
> method in the IndexWriter.  I'd love to be able to change my application
> to use it (we're constantly updating docs, which means opening/closing the
> writer/reader each time we update a doc). I use complex queries to
> determine which docs to delete.  Pre-2.1, I've just executed a search and
> then looped through hits.id() as the parameter to the
> IndexReader.deleteDocument().  I'm trying to figure out how to do the same
> for Lucene 2.1.  Is there a way to specify the internal docid as a term,
> so that I can call IndexWriter.deleteDocuments(internalId)?

This cannot work, because there is no guarantee that the internal ids seen
by the searcher are the same as those seen by the writer: each merge
operation might change the internal ids.

Note that when you used the same IndexReader for both search and delete,
the internal ids used for the delete were guaranteed to match those seen
by the search.
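
To make the mismatch concrete, here is a minimal toy sketch. It is not
Lucene code: the class DocIdShiftSketch and the method mergeAway are
hypothetical names, and the model simply assumes that a document's internal
id is its position in a list and that a merge drops a deleted slot and
renumbers the survivors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only (assumption: internal id == position in the list).
// It does not use or reproduce any Lucene class.
class DocIdShiftSketch {

    // Simulates a merge that drops one deleted slot; every document
    // stored after that slot moves down by one position.
    static List<String> mergeAway(List<String> docs, int deletedId) {
        List<String> merged = new ArrayList<>(docs);
        merged.remove(deletedId); // List.remove(int index)
        return merged;
    }

    public static void main(String[] args) {
        // What an already-open searcher sees.
        List<String> searcherView = Arrays.asList("uid-a", "uid-b", "uid-c");
        int hitId = searcherView.indexOf("uid-c");

        // What the writer sees after a merge removed slot 0.
        List<String> writerView = mergeAway(searcherView, 0);

        System.out.println("searcher id of uid-c: " + hitId);
        System.out.println("writer id of uid-c:   " + writerView.indexOf("uid-c"));
    }
}
```

In this model the position recorded at search time no longer points at the
same document on the writer side, while the external UID still does.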

> Otherwise, I was thinking of using
> a hit collector to fetch just our external UID.

This would work. If it does not slow things down (it means a new field,
and one more field to fetch during the search-for-delete), then it seems
the better way to me.

Note, however, that since you search in order to find the docs to delete,
this does not remove the need to reopen readers: a reader must be reopened
before it can find docs to delete among those that were just added.
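
The search-then-delete-by-UID flow, and the reopen caveat, can be sketched
with another toy model. Again this is not Lucene code: UidDeleteSketch and
all of its methods are hypothetical stand-ins (deleteByUid for a delete on a
unique-key term, reopenReader for reopening the reader, searchUids for a
search that collects only the external UID), and the "index" is just a map
from UID to text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: the writer side and the reader snapshot are plain maps
// from external UID to document text. No Lucene class is used.
class UidDeleteSketch {
    private final Map<String, String> live = new LinkedHashMap<>(); // writer side
    private Map<String, String> snapshot = new LinkedHashMap<>();   // reader side

    void add(String uid, String body) { live.put(uid, body); }

    // Stand-in for deleting by a unique-key term: the document is
    // addressed by its UID, never by its position.
    void deleteByUid(String uid) { live.remove(uid); }

    // Stand-in for reopening the reader: only now do the writer's
    // changes become visible to searches.
    void reopenReader() { snapshot = new LinkedHashMap<>(live); }

    // Stand-in for a search that collects just the external UID of each hit.
    List<String> searchUids(String word) {
        List<String> uids = new ArrayList<>();
        for (Map.Entry<String, String> e : snapshot.entrySet()) {
            if (e.getValue().contains(word)) uids.add(e.getKey());
        }
        return uids;
    }

    int liveCount() { return live.size(); }

    public static void main(String[] args) {
        UidDeleteSketch index = new UidDeleteSketch();
        index.add("uid-1", "stale report");
        index.add("uid-2", "fresh report");
        index.reopenReader();
        index.add("uid-3", "stale draft"); // added after the reader was opened

        // First pass: the reader snapshot does not contain uid-3 yet.
        for (String uid : index.searchUids("stale")) index.deleteByUid(uid);
        System.out.println("after first pass:  " + index.liveCount());

        // Second pass, after reopening: uid-3 is now found and deleted.
        index.reopenReader();
        for (String uid : index.searchUids("stale")) index.deleteByUid(uid);
        System.out.println("after second pass: " + index.liveCount());
    }
}
```

The first pass misses the document added after the reader was opened; it is
only found once the reader has been reopened.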

Btw, see also IndexWriter.updateDocument(Term, Document), which deletes
the docs containing the term and then adds the new document.


