Posted to java-user@lucene.apache.org by hu andy <an...@gmail.com> on 2006/04/28 11:56:59 UTC

Ask for a better solution for the case

Hi, I have an application that needs to mark retrieved documents that have
been read, so that next time I don't need to read the marked documents again.

    I have an idea: add a particular field to the indexed
document. But as Lucene has no update method, I would have to delete that
document and add it again. I think that seems a little stupid. Or I could use
a database to handle the marking, but how would the database relate
to the Lucene index, especially when I want to retrieve documents that I have
read? Maybe there is a better idea.

    Any suggestion will be greatly appreciated.

Re: Ask for a better solution for the case

Posted by Erick Erickson <er...@gmail.com>.
This one's fairly wild, I'm interested to see what the gurus think...

You could create a bitset and mark each retrieved document by setting the
bit at the position given by its Lucene document id. Persist this bitset
(assuming you need it across sessions). Be careful: I wouldn't persist it
via toString(); persist it as a binary entity. It depends on how many
docs we're talking about, I guess...

Anyway, let's say you have accumulated one of these. Create a filter from
the inverse of the persisted bitset (every bit flipped, so that only unread
documents pass), and pass that filter on to subsequent searches. When the
search comes back, set the bits for the returned documents in your
(persisted) bitset and save it away. Repeat as needed...

I have no idea if this would help in your particular situation... And any
time your index changed, any persisted bitsets would be invalid, since
Lucene document ids can shift when the index changes.

Anyway, it may even work. See the Filters in Lucene for what filters are all
about.
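
A minimal sketch of the bitset bookkeeping described above, using only
java.util.BitSet and assuming Java 7 or later for toByteArray()/valueOf().
The class name ReadMarks and its methods are hypothetical, not part of
Lucene. Where the bytes are stored (file, database) and how the "unread"
bits get wrapped in a Lucene Filter depend on your setup and Lucene version,
so neither is shown:

```java
import java.util.BitSet;

// Hypothetical helper (not a Lucene class): tracks which doc ids were read.
final class ReadMarks {
    private final BitSet read;
    private final int maxDoc; // number of doc ids in the index, assumed known

    ReadMarks(int maxDoc) {
        this(new BitSet(maxDoc), maxDoc);
    }

    private ReadMarks(BitSet read, int maxDoc) {
        this.read = read;
        this.maxDoc = maxDoc;
    }

    // Record that the document with this Lucene doc id has been read.
    void markRead(int docId) {
        read.set(docId);
    }

    // Inverse of the "read" bits over [0, maxDoc): set for every unread doc.
    BitSet unreadBits() {
        BitSet unread = new BitSet(maxDoc);
        unread.set(0, maxDoc);
        unread.andNot(read);
        return unread;
    }

    // Binary form for persistence (rather than the toString() text form).
    byte[] toBytes() {
        return read.toByteArray();
    }

    // Rebuild the marks from bytes previously produced by toBytes().
    static ReadMarks fromBytes(byte[] data, int maxDoc) {
        return new ReadMarks(BitSet.valueOf(data), maxDoc);
    }
}
```

After each search you would call markRead() for every hit shown to the user,
then store toBytes() again. The stored bytes are only meaningful for the
index state they were built against.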

Erick

Re: Ask for a better solution for the case

Posted by Doug Cutting <cu...@apache.org>.
hu andy wrote:
> Hi, I have an application that needs to mark retrieved documents that have
> been read, so that next time I don't need to read the marked documents again.

You could mark the documents as deleted, then later clear deletions.  So 
long as you don't close the IndexReader, the deletions will never be 
flushed to disk.

Doug
