Posted to java-user@lucene.apache.org by Simon Wistow <si...@thegestalt.org> on 2008/02/27 09:42:58 UTC

Atomicity and AutoCommit

I currently have a setup that indexes into RAM and then periodically 
merges that into a disk based index. 

Searches are done from the disk based index and deletes are handled by 
keeping a list of deleted documents, filtering out search results and 
applying the deletes to the index at merge time.

All this was done to make sure that we didn't corrupt the index (which 
we'd seen happen a few times when the indexing machine failed for 
whatever reason). With this scheme if the machine fails then all that's 
lost is the RAM index and the list of deletes. We then just simply play 
back all actions since the last merge and we're back to where we 
started.
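
For readers following along, here is a minimal sketch of the scheme described above, under loud assumptions: plain in-memory maps stand in for the disk and RAM indexes, no Lucene API is used, and every name (BufferedIndexSketch, add, delete, search, merge, crash) is hypothetical. It only illustrates the bookkeeping: searches read the merged state, the delete list filters results, and a crash loses only what has happened since the last merge.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the RAM-buffer-plus-delete-list scheme.
// Plain maps play the role of the disk and RAM indexes; no Lucene API is used.
final class BufferedIndexSketch {
    private final Map<String, String> disk = new LinkedHashMap<>(); // merged, searchable state
    private final Map<String, String> ram = new LinkedHashMap<>();  // adds since the last merge
    private final Set<String> pendingDeletes = new HashSet<>();     // deletes since the last merge

    void add(String id, String text) {
        ram.put(id, text);
    }

    void delete(String id) {
        ram.remove(id);         // drop any copy that has not been merged yet
        pendingDeletes.add(id); // hide the merged copy until the next merge
    }

    // Searches see only the merged state, minus documents on the delete list.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : disk.entrySet()) {
            if (!pendingDeletes.contains(e.getKey()) && e.getValue().contains(term)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    // Periodic merge: apply the deletes, then fold the RAM buffer into the merged state.
    void merge() {
        disk.keySet().removeAll(pendingDeletes);
        disk.putAll(ram);
        pendingDeletes.clear();
        ram.clear();
    }

    // Simulated crash: everything since the last merge is lost and has to be replayed.
    void crash() {
        ram.clear();
        pendingDeletes.clear();
    }
}
```

After crash(), replaying the same add and delete calls made since the last merge brings the object back to the state it was in before the failure, which is the recovery step the message describes.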

However, it occurred to me that this might all be redundant now with 
Lucene 2.3 (it's possible it might have always been redundant, come to 
think of it). Should I just open a disk-based index with 
autoCommit=false and then periodically commit the changes by close()ing 
and then re-open()ing the disk index? Is that atomic, i.e. is there a 
situation using this whereby the index could become corrupted?

Thanks,

Simon



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Re: Atomicity and AutoCommit

Posted by Michael McCandless <lu...@mikemccandless.com>.

Simon Wistow wrote:

> On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:
>>
>> When you previously saw corruption was it due to an OS or machine
>> crash (or power cord got pulled)?  If so, you were likely hitting
>> LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
>> at some point) but is not fixed in 2.3.
>
> Yes - it's power outages and other unnatural events (sysadmins
> accidentally kill -9ing the process) that caused it.

OK, a power outage can definitely cause corruption.  This is a
long-standing issue (LUCENE-1044) that was only recently uncovered and
is now fixed in 2.4.  But I believe kill -9 should not cause corruption.

BTW hot backups, as of 2.3, are now very easy.  Just use
SnapshotDeletionPolicy when creating your writer.  Making frequent
backups is a good safeguard too...

> What are the chances of me backporting the fix to 2.3 or should I just
> wait for 2.4?

It unfortunately was a fairly large change; I'm not sure how cleanly
the patch will apply to 2.3.  Maybe try trunk (but beware: the index
format changed with LUCENE-1044 to add an integrity checksum to
the end of the segments_N file)...
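
To illustrate the general idea of an integrity checksum at the end of a file, here is a minimal sketch in plain Java. It is not Lucene's segments_N layout or code: the class name, the 8-byte trailer, and the choice of CRC32 are all assumptions made for the example. It shows only why a trailing checksum lets a reader detect a torn or truncated write.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative only: this is NOT Lucene's segments_N format, just the general idea
// of ending a file with a checksum so a torn or truncated write can be detected.
final class ChecksumTrailerSketch {
    private static final int TRAILER_BYTES = Long.BYTES;

    // Returns the payload followed by an 8-byte CRC32 of the payload.
    static byte[] seal(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + TRAILER_BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // True only if the trailing checksum matches the bytes that precede it.
    static boolean isIntact(byte[] sealed) {
        if (sealed.length < TRAILER_BYTES) {
            return false; // too short to even hold a trailer
        }
        int payloadLength = sealed.length - TRAILER_BYTES;
        CRC32 crc = new CRC32();
        crc.update(sealed, 0, payloadLength);
        long stored = ByteBuffer.wrap(sealed, payloadLength, TRAILER_BYTES).getLong();
        return stored == crc.getValue();
    }
}
```

A reader would call isIntact on the bytes it loaded and fall back to an older copy when the check fails.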

> Come 2.4 is my buffering to RAM redundant?

Well, as Mark said, if your IO system does not lie on fsync, then
buffering to RAM is redundant.  If it does lie, you still have a risk
of corruption, so buffering to RAM probably reduces (but doesn't
eliminate) the risk.
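
As a rough illustration of what "not lying on fsync" protects, here is a generic write, sync, rename sketch in plain Java. It is not taken from Lucene's source; the class name, the ".tmp" suffix, and the commit method are assumptions for the example. FileChannel.force(true) is the call that asks the OS to push the bytes to the device, and the whole pattern is only as good as the storage's honesty about that call.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Generic write-sync-rename pattern; not taken from Lucene's source.
final class DurableCommitSketch {
    // Writes data under a temporary name, forces it to the device, then renames it
    // into place so readers see either no file (or the old one) or the complete new one.
    static void commit(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // The fsync step: asks the OS to flush data and metadata. If the drive
            // acknowledges this without persisting, nothing here can detect it.
            ch.force(true);
        }
        // Whether an existing target is replaced is platform-specific for ATOMIC_MOVE.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Note that this sketch does not sync the containing directory, so on some filesystems the rename itself may not survive a power loss; a production implementation would need to handle that as well.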

Also, as of 2.3, manually buffering to RAMDirectory should no longer
give a big performance win over just giving that RAM to the
IndexWriter as its buffer instead.

Mike


Re: Atomicity and AutoCommit

Posted by Mark Miller <ma...@gmail.com>.

You need to make sure your storage does not lie in response to an fsync 
command. If it does (most commercial stuff does), you cannot guarantee no 
corruption. Search Google for "your harddrive lies to you" or something.

It shouldn't be that hard to take the patch from the issue and apply it 
to a checked-out version of 2.3, right? I don't think it relies on other 
2.4 stuff, as there isn't much of it yet.

Simon Wistow wrote:
> On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:
>   
>> When you previously saw corruption was it due to an OS or machine
>> crash (or power cord got pulled)?  If so, you were likely hitting
>> LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
>> at some point) but is not fixed in 2.3.
>>     
>
> Yes - it's power outages and other unnatural events (sysadmins 
> accidentally kill -9ing the process) that caused it.
>
> What are the chances of me backporting the fix to 2.3 or should I just 
> wait for 2.4?
>
> Come 2.4 is my buffering to RAM redundant?
>
> Thanks,
>
> Simon

Re: Atomicity and AutoCommit

Posted by Simon Wistow <si...@thegestalt.org>.

On Wed, Feb 27, 2008 at 09:38:55AM -0500, Michael McCandless said:
> 
> When you previously saw corruption was it due to an OS or machine
> crash (or power cord got pulled)?  If so, you were likely hitting
> LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
> at some point) but is not fixed in 2.3.

Yes - it's power outages and other unnatural events (sysadmins 
accidentally kill -9ing the process) that caused it.

What are the chances of me backporting the fix to 2.3 or should I just 
wait for 2.4?

Come 2.4 is my buffering to RAM redundant?

Thanks,

Simon




Re: Atomicity and AutoCommit

Posted by Michael McCandless <lu...@mikemccandless.com>.

When you previously saw corruption was it due to an OS or machine
crash (or power cord got pulled)?  If so, you were likely hitting
LUCENE-1044, which is fixed on the trunk version of Lucene (to be 2.4
at some point) but is not fixed in 2.3.

If that is what you were hitting, then unfortunately neither buffering
updates into RAM nor using autoCommit=false in 2.3 will fully protect
you from this issue.  Both approaches should, however, reduce your
chance of hitting LUCENE-1044, since both reduce the frequency of
commits to the index.

Mike

Simon Wistow wrote:

> I currently have a setup that indexes into RAM and then periodically
> merges that into a disk based index.
>
> Searches are done from the disk based index and deletes are handled by
> keeping a list of deleted documents, filtering out search results and
> applying the deletes to the index at merge time.
>
> All this was done to make sure that we didn't corrupt the index (which
> we'd seen happen a few times when the indexing machine failed for
> whatever reason). With this scheme if the machine fails then all that's
> lost is the RAM index and the list of deletes. We then just simply play
> back all actions since the last merge and we're back to where we
> started.
>
> However, it occurred to me that this might all be redundant now with
> Lucene 2.3 (it's possible it might have always been redundant, come to
> think of it). Should I just open a disk-based index with
> autoCommit=false and then periodically commit the changes by close()ing
> and then re-open()ing the disk index? Is that atomic, i.e. is there a
> situation using this whereby the index could become corrupted?
>
> Thanks,
>
> Simon