Posted to dev@lucene.apache.org by Jason Rutherglen <ja...@gmail.com> on 2008/06/22 15:09:21 UTC

ReaderCommit

Is there a reason ReaderCommit in DirectoryIndexReader.getIndexCommit() does
not support delete?  Is it proper behavior to use SnapshotDeletionPolicy and
then keep the IndexCommit around?  What is the difference where the former
does not support delete?

Re: ReaderCommit

Posted by Michael McCandless <lu...@mikemccandless.com>.
Jason Rutherglen wrote:

> For Ocean I created a workaround where the IndexCommits from  
> IndexDeletionPolicy are saved in a map in order to achieve deleting  
> based on the IndexReader.  It would be more straightforward to  
> delete from the IndexCommit in IndexReader.

It seems like we are mixing up deleting a whole commit point, vs  
deleting individual documents?  Or does Ocean somehow decide to delete  
a whole commit point based on which documents have been deleted?

> I realize people want to get away from IndexReader performing
> updates; however, for my use case (realtime search), updating from
> IndexReader makes sense, mainly for obtaining the doc ids of
> deletions.  With IndexWriter managing the merges it would seem
> difficult to expose doc numbers, but perhaps there is a way.

IndexWriter can now delete by query, but it sounds like that's not  
sufficient for Ocean?

Under the hood, IndexWriter has the infrastructure to hold pending
deleted docIDs and update these docIDs when a merge is committed.  I.e.,
previously we forced a flush of all pending deletes on every
flush/merge, but now we buffer the docIDs across flushes/merges.  This
means IndexWriter *could* delete by docID; however, none of this is
exposed publicly.
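
To make the remapping idea concrete, here is a minimal sketch in plain Java. It is not IndexWriter's internal code; the class name, method name, and the simplification that one merge covers the whole docID space are all assumptions made for illustration. A merge drops documents that were already deleted, so every surviving docID shifts down by the number of dropped documents below it, and buffered deletes have to be shifted the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class PendingDeleteRemapSketch {

    /**
     * Remaps buffered docIDs from pre-merge numbering to post-merge numbering.
     *
     * @param pending   docIDs buffered for deletion, in pre-merge numbering
     * @param compacted sorted docIDs that the merge dropped (already-deleted docs)
     * @return the pending docIDs expressed in post-merge numbering
     */
    static List<Integer> remap(List<Integer> pending, int[] compacted) {
        List<Integer> remapped = new ArrayList<>();
        for (int docID : pending) {
            int pos = Arrays.binarySearch(compacted, docID);
            if (pos >= 0) {
                continue; // the merge already removed this document
            }
            // For a miss, binarySearch returns -(insertionPoint) - 1, and the
            // insertion point equals the number of dropped docIDs below docID.
            int droppedBelow = -(pos + 1);
            remapped.add(docID - droppedBelow);
        }
        return remapped;
    }

    public static void main(String[] args) {
        // Docs 2, 3 and 5 are dropped by the merge; 1, 4 and 7 are pending deletes.
        System.out.println(remap(Arrays.asList(1, 4, 7), new int[] {2, 3, 5}));
    }
}
```

In this model the survivors 0, 1, 4, 6, 7 are renumbered 0 through 4, so the pending deletes 1, 4, 7 become 1, 2, 4.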

Also, this doesn't solve the problem of how you would get the docIDs  
to delete in the first place (ie one must still use a separate  
IndexReader for that).

I'm not sure this helps you (Ocean) since you presumably need to flush  
deletes very quickly to have realtime search...

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: ReaderCommit

Posted by Jason Rutherglen <ja...@gmail.com>.
For Ocean I created a workaround where the IndexCommits from
IndexDeletionPolicy are saved in a map in order to achieve deleting based on
the IndexReader.  It would be more straightforward to delete from the
IndexCommit in IndexReader.  I realize people want to get away from
IndexReader performing updates; however, for my use case (realtime search),
updating from IndexReader makes sense, mainly for obtaining the doc ids of
deletions.  With IndexWriter managing the merges it would seem difficult to
expose doc numbers, but perhaps there is a way.
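
A rough sketch of what such a map-based workaround might look like, in plain Java. This is not Ocean's code: the class names, the CommitHandle stand-in, and the choice of the commit name (for example a segments file name) as the key are all assumptions for illustration. The deletion policy would hand the registry each list of commits it is shown, and code holding a reader would later look up the matching commit by name.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types come from Lucene or Ocean.
final class CommitRegistrySketch {

    /** Stand-in for a commit point that can be asked to delete itself. */
    static final class CommitHandle {
        final String name;
        private final Runnable onDelete;

        CommitHandle(String name, Runnable onDelete) {
            this.name = name;
            this.onDelete = onDelete;
        }

        void delete() {
            onDelete.run();
        }
    }

    private final Map<String, CommitHandle> byName = new HashMap<>();

    /** Replaces the registry contents with the commits currently known. */
    synchronized void onCommits(List<CommitHandle> commits) {
        byName.clear();
        for (CommitHandle commit : commits) {
            byName.put(commit.name, commit);
        }
    }

    /** Deletes the commit with the given name; returns false if it is unknown. */
    synchronized boolean deleteFor(String commitName) {
        CommitHandle commit = byName.remove(commitName);
        if (commit == null) {
            return false;
        }
        commit.delete();
        return true;
    }
}
```

Whether a real commit point may safely be deleted outside the deletion policy's own callbacks is a separate question that this sketch does not address.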

On Sun, Jun 22, 2008 at 3:30 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> Jason Rutherglen wrote:
>
>> Is there a reason ReaderCommit in DirectoryIndexReader.getIndexCommit()
>> does not support delete?
>
>
> I think we are getting away from using IndexReader to make changes to the
> index, so, I didn't want to enable deleting a commit point from IndexReader.
>
>> Is it proper behavior to use SnapshotDeletionPolicy and then keep the
>> IndexCommit around?
>
>
> Probably not.  If you intend to keep the commit around indefinitely,
> you should use a better-matched deletion policy, one that keeps track, over
> time, of what should not be deleted.  E.g., SnapshotDeletionPolicy does not
> save to disk which commit points are still held open.
>
>> What is the difference where the former does not support delete?
>
> Not sure what you're asking here?
>
> Mike
>

Re: ReaderCommit

Posted by Michael McCandless <lu...@mikemccandless.com>.
Jason Rutherglen wrote:

> Is there a reason ReaderCommit in  
> DirectoryIndexReader.getIndexCommit() does not support delete?

I think we are getting away from using IndexReader to make changes to  
the index, so, I didn't want to enable deleting a commit point from  
IndexReader.

> Is it proper behavior to use SnapshotDeletionPolicy and then keep
> the IndexCommit around?

Probably not.  If you intend to keep the commit around
indefinitely, you should use a better-matched deletion policy, one that
keeps track, over time, of what should not be deleted.  E.g.,
SnapshotDeletionPolicy does not save to disk which commit points are
still held open.
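
As a rough illustration of "keeping track over time", here is a plain-Java sketch of the bookkeeping such a policy might do. It is not a Lucene class; the names, the one-name-per-line file format, and the rule of always keeping the newest commit are assumptions for illustration. The point is only that the set of held commits is written to disk on every change, so it survives a restart, and the policy consults it when deciding what to delete.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Lucene.
final class PersistentHoldsSketch {
    private final Path holdsFile;
    private final TreeSet<String> held = new TreeSet<>();

    /** Loads previously held commit names, if the file already exists. */
    PersistentHoldsSketch(Path holdsFile) {
        this.holdsFile = holdsFile;
        try {
            if (Files.exists(holdsFile)) {
                for (String line : Files.readAllLines(holdsFile, StandardCharsets.UTF_8)) {
                    if (!line.isEmpty()) {
                        held.add(line);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized void hold(String commitName) {
        held.add(commitName);
        save();
    }

    synchronized void release(String commitName) {
        held.remove(commitName);
        save();
    }

    /**
     * @param commits commit names ordered oldest to newest
     * @return names that are neither held nor the newest commit
     */
    synchronized List<String> deletable(List<String> commits) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < commits.size() - 1; i++) {
            if (!held.contains(commits.get(i))) {
                result.add(commits.get(i));
            }
        }
        return result;
    }

    // Rewrites the whole file; a sturdier version would write to a temporary
    // file and rename it so a crash cannot leave a partial list behind.
    private void save() {
        try {
            Files.write(holdsFile, held, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A deletion policy built on this would call deletable(...) from its commit callbacks and delete only the commits whose names are returned.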

> What is the difference where the former does not support delete?

Not sure what you're asking here?

Mike
