Posted to general@lucene.apache.org by João Rodrigues <an...@gmail.com> on 2008/02/01 17:41:48 UTC

Checking if a given document is indexed

Hello all,

I'm using pylucene to index documents and I'm interested in checking if a
given document from the list A (that is going to be indexed) is already
indexed. Can I do it?

Thanks in advance,

João Rodrigues

Re: Checking if a given document is indexed

Posted by Chris Hostetter <ho...@fucit.org>.
: I'm using pylucene to index documents and I'm interested in checking if a
: given document from the list A (that is going to be indexed) is already
: indexed. Can I do it?

FYI: PyLucene is not an Apache project; it has its own mailing lists 
and documentation that you may want to consult...
	http://pylucene.osafoundation.org/

I personally don't know anything about PyLucene, and I have no idea what 
features it may add. That said, assuming it is a simple wrapper around 
the Lucene-Java APIs and adds no extra functionality, you'll need some 
way to identify a document in order to determine whether it is in the 
index or not. Essentially: you search for each document you have by some 
criterion to see if it's there.

If your document space allows for a "unique key" on each document, just 
make sure that key is indexed. If not, then compute something appropriate 
(e.g. an MD5 sum of the content) and use that as the key.
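
A minimal sketch of the MD5 idea, in Python since the question is about 
PyLucene. It uses only the standard hashlib module and makes no PyLucene 
calls. The names document_key, index_if_new and indexed_keys are 
hypothetical, and the set is only a stand-in for the real index lookup.

```python
import hashlib


def document_key(text):
    """Return the hex MD5 digest of the document text, to use as its key."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def index_if_new(text, indexed_keys):
    """Record the document unless its key is already known.

    indexed_keys is a plain set standing in for the index (an assumption
    for illustration). Returns True if the document was new.
    """
    key = document_key(text)
    if key in indexed_keys:
        return False  # already indexed, skip it
    indexed_keys.add(key)  # a real setup would add the document to the index here
    return True


if __name__ == "__main__":
    indexed_keys = set()
    for doc in ["first document", "second document", "first document"]:
        status = "indexed" if index_if_new(doc, indexed_keys) else "skipped"
        print(status, document_key(doc))
```

Against a real index, the membership test would presumably become a 
search for the key, for example an exact-match term query on a field 
that holds the key untokenized. How PyLucene exposes that is not 
something I can confirm.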



-Hoss