Posted to general@lucene.apache.org by João Rodrigues <an...@gmail.com> on 2008/02/01 17:41:48 UTC
Checking if a given document is indexed
Hello all,
I'm using pylucene to index documents and I'm interested in checking if a
given document from the list A (that is going to be indexed) is already
indexed. Can I do it?
Thanks in advance,
João Rodrigues
Re: Checking if a given document is indexed
Posted by Chris Hostetter <ho...@fucit.org>.
: I'm using pylucene to index documents and I'm interested in checking if a
: given document from the list A (that is going to be indexed) is already
: indexed. Can I do it?
FYI: PyLucene is not an Apache project; it has its own mailing lists
and documentation that you may want to consult...
http://pylucene.osafoundation.org/
I personally don't know anything about PyLucene, and I have no idea what
features it may add -- but that said, assuming it is a simple wrapper
around the Lucene-Java APIs and adds no extra functionality, you'll
need some way to identify a document to determine if it is in the index or
not ... essentially: you search for each document you have by some criteria
to see if it's there.
If your document space allows for a "unique key" on each document, just
make sure that key is indexed ... if not, then compute something appropriate
(e.g., an MD5 sum) and use that.
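
A rough sketch of the MD5-key idea, in plain Python with no Lucene calls at
all: the set "indexed_keys" is only a stand-in for whatever lookup you do
against the real index (for instance a search on the field that stores the
key), and the names document_key and split_new_and_indexed are made up for
illustration. It also assumes that identical content means "the same
document", which may not be true for your data.

```python
import hashlib


def document_key(text):
    """Return the hex MD5 digest of the document text, used as its unique key."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def split_new_and_indexed(documents, indexed_keys):
    """Separate documents whose key is already known from those that are new.

    indexed_keys is a hypothetical stand-in for a lookup against the index.
    """
    new_docs = []
    already_indexed = []
    for doc in documents:
        if document_key(doc) in indexed_keys:
            already_indexed.append(doc)
        else:
            new_docs.append(doc)
    return new_docs, already_indexed


# Pretend "first document" was indexed earlier and its key was stored.
indexed_keys = {document_key("first document")}

list_a = ["first document", "second document"]
new_docs, already_indexed = split_new_and_indexed(list_a, indexed_keys)

print("to index:", new_docs)
print("already indexed:", already_indexed)
```

With a real index you would store the key in a field when adding each
document, and replace the set membership test with a search on that field.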
-Hoss