Posted to dev@lucene.apache.org by Earwin Burrfoot <ea...@gmail.com> on 2010/04/06 16:11:53 UTC

Getting fsync out of the loop

So, I want to pump my IndexWriter hard and fast with documents.

Removing fsync from FSDirectory helps. But for that I pay with the
possibility of index corruption, not only if my node suddenly loses
power/kernel panics, but also if it runs out of disk space (which
happens more frequently).

I invented the following solution:
We write a special deletion policy that resembles SnapshotDeletionPolicy.
At all times it holds on to the "current synced commit" and preserves it.
Once every N minutes a special thread takes the latest commit, syncs it,
and nominates it as the "current synced commit". The previous one gets
deleted.

Now we are disaster-proof, and fsync happens asynchronously from the
indexing threads. We pay for this with somewhat bigger transient disk
usage, and probably losing a few minutes' worth of updates in case of
a crash, but that's acceptable.
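
Roughly, in code the policy could look something like this (untested,
typed from memory against the 3.x API: IndexDeletionPolicy,
IndexCommit.getFileNames(), Directory.sync(String); the FSDirectory
wrapper that skips fsync is assumed and not shown):

import java.io.IOException;
import java.util.List;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexDeletionPolicy;

public class SyncedCommitDeletionPolicy implements IndexDeletionPolicy {
  private String syncedSegmentsFile;   // last commit known to be on disk
  private String pendingSegmentsFile;  // commit currently being fsynced
  private IndexCommit latestCommit;    // newest commit seen so far

  public synchronized void onInit(List<? extends IndexCommit> commits)
      throws IOException {
    onCommit(commits);
  }

  public synchronized void onCommit(List<? extends IndexCommit> commits)
      throws IOException {
    latestCommit = commits.get(commits.size() - 1);
    for (IndexCommit c : commits) {
      if (c == latestCommit) continue;                 // always keep the newest
      String name = c.getSegmentsFileName();
      if (name.equals(syncedSegmentsFile)) continue;   // keep the synced one
      if (name.equals(pendingSegmentsFile)) continue;  // ... and the one being synced
      c.delete();                                      // everything else goes
    }
  }

  // Called from a dedicated thread every N minutes: fsync the files of the
  // newest commit, then promote it to "current synced commit". The commit
  // that was synced before it is released at the next onCommit() call.
  public void syncLatest() throws IOException {
    IndexCommit candidate;
    synchronized (this) {
      candidate = latestCommit;
      if (candidate == null
          || candidate.getSegmentsFileName().equals(syncedSegmentsFile)) {
        return;                                        // nothing new to sync
      }
      pendingSegmentsFile = candidate.getSegmentsFileName();
    }
    for (String file : candidate.getFileNames()) {
      candidate.getDirectory().sync(file);             // the only real fsync
    }
    synchronized (this) {
      syncedSegmentsFile = pendingSegmentsFile;
      pendingSegmentsFile = null;
    }
  }
}

The indexing threads keep calling IW.commit() as often as they like (cheap,
since the wrapped Directory skips fsync), and a single background thread
calls syncLatest() on a fixed schedule.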

How does this sound?

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Shai Erera <se...@gmail.com>.
How often is fsync called? If it's just during calls to commit, then is that
that expensive? I mean, how often do you call commit?

If that's that expensive (do you have some numbers to share?) then I think
that'd be a neat idea. Though "losing a few minutes' worth of updates" may
sometimes be unrecoverable, depending on the scenario, but I guess for those
cases the 'standard way' should be used.

What if your background thread simply committed every couple of minutes?
What's the difference between taking the snapshot (which means you had to
call commit previously) and committing it, versus having iw.commit called
by a background merge?

Shai

On Tue, Apr 6, 2010 at 5:11 PM, Earwin Burrfoot <ea...@gmail.com> wrote:

> So, I want to pump my IndexWriter hard and fast with documents.
>
> Removing fsync from FSDirectory helps. But for that I pay with possibility
> of
> index corruption, not only if my node suddenly loses
> power/kernel panics, but also if it
> runs out of disk space (which happens more frequently).
>
> I invented the following solution:
> We write a special deletion policy that resembles SnapshotDeletionPolicy.
> At all times it takes hold of "current synced commit" and preserves
> it. Once every N minutes
> a special thread takes latest commit, syncs it and nominates as
> "current synced commit". The
> previous one gets deleted.
>
> Now we are disaster-proof, and do fsync asynchronously from indexing
> threads. We pay for this with
> somewhat bigger transient disk usage, and probably losing a few
> minutes worth of updates in
> case of a crash, but that's acceptable.
>
> How does this sound?
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Getting fsync out of the loop

Posted by Earwin Burrfoot <ea...@gmail.com>.
I don't have the system at hand now, but if I remember right fsync
took like 100-200ms.

2010/4/7 Shai Erera <se...@gmail.com>:
> Earwin - do you have some numbers to share on the running time of the
> indexing application? You've mentioned that if you take out fsync into a BG
> thread, the running time improves, but I'm curious to know by how much.
>
> Shai
>
> On Wed, Apr 7, 2010 at 2:26 AM, Earwin Burrfoot <ea...@gmail.com> wrote:
>>
>> > Running out of disk space with fsync disabled won't lead to corruption.
>> > Even kill -9 the JRE process with fsync disabled won't corrupt.
>> > In these cases index just falls back to last successful commit.
>> >
>> > It's "only" power loss / OS / machine crash where you need fsync to
>> > avoid possible corruption (corruption may not even occur w/o fsync if
>> > you "get lucky").
>>
>> Sorry to disappoint you, but running out of disk space is worse than kill
>> -9.
>> You can write down the file (to cache in fact), close it, all without
>> getting any
>> exceptions. And then it won't get flushed to disk because the disk is
>> full.
>> This can happen to segments file (and old one is deleted with default
>> deletion
>> policy). This can happen to fat freq/prox files mentioned in segments file
>> (and yeah, the old segments file is deleted, so no falling back).
>>
>> > What if your background thread simply committed every couple of minutes?
>> > What's the difference between taking the snapshot (which means you had
>> > to call commit previously) and commit it, to call iw.commit by a
>> > background merge?
>> --
>> > But: why do you need to commit so often?
>> To see stuff on reopen? Yes, I know about NRT.
>>
>> > You've reinvented autocommit=true!
>> ?? I'm doing regular commits, syncing down every Nth of it.
>>
>> > Doesn't this just BG the syncing?  Ie you could make a dedicated
>> > thread to do this.
>> Yes, exactly, this BGs the syncing to a dedicated thread. Threads
>> doing indexation/merging can continue unhampered.
>>
>> > One possible win with this approach is.... the cost of fsync should go
>> > way down the longer you wait after writing bytes to the file and
>> > before calling fsync.  This is because typically OS write caches
>> > expire by time (eg 30 seconds) so if you wait long enough the bytes
>> > will already at least be delivered to the IO system (but the IO system
>> > can do further caching which could still take time).  On Windows at
>> > least I definitely noticed this effect -- wait some before fsync'ing
>> > and it's net/net much less costly.
>> Yup. In fact you can just hold on to the latest commit for N seconds,
>> then switch to the new latest commit.
>> OS will fsync everything for you.
>>
>>
>> I'm just playing around with stupid idea. I'd like to have NRT
>> look-alike without binding readers and writers. :)
>> Right now it's probably best for me to save my time and cut over to
>> current NRT.
>> But. An important lesson was learnt - no fsyncing blows up your index
>> on out-of-disk-space.
>>
>> --
>> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
>> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
>> ICQ: 104465785
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>



-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Shai Erera <se...@gmail.com>.
Earwin - do you have some numbers to share on the running time of the
indexing application? You've mentioned that if you take out fsync into a BG
thread, the running time improves, but I'm curious to know by how much.

Shai

On Wed, Apr 7, 2010 at 2:26 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> > Running out of disk space with fsync disabled won't lead to corruption.
> > Even kill -9 the JRE process with fsync disabled won't corrupt.
> > In these cases index just falls back to last successful commit.
> >
> > It's "only" power loss / OS / machine crash where you need fsync to
> > avoid possible corruption (corruption may not even occur w/o fsync if
> > you "get lucky").
>
> Sorry to disappoint you, but running out of disk space is worse than kill
> -9.
> You can write down the file (to cache in fact), close it, all without
> getting any
> exceptions. And then it won't get flushed to disk because the disk is full.
> This can happen to segments file (and old one is deleted with default
> deletion
> policy). This can happen to fat freq/prox files mentioned in segments file
> (and yeah, the old segments file is deleted, so no falling back).
>
> > What if your background thread simply committed every couple of minutes?
> > What's the difference between taking the snapshot (which means you had
> > to call commit previously) and commit it, to call iw.commit by a
> background merge?
> --
> > But: why do you need to commit so often?
> To see stuff on reopen? Yes, I know about NRT.
>
> > You've reinvented autocommit=true!
> ?? I'm doing regular commits, syncing down every Nth of it.
>
> > Doesn't this just BG the syncing?  Ie you could make a dedicated
> > thread to do this.
> Yes, exactly, this BGs the syncing to a dedicated thread. Threads
> doing indexation/merging can continue unhampered.
>
> > One possible win with this approach is.... the cost of fsync should go
> > way down the longer you wait after writing bytes to the file and
> > before calling fsync.  This is because typically OS write caches
> > expire by time (eg 30 seconds) so if you wait long enough the bytes
> > will already at least be delivered to the IO system (but the IO system
> > can do further caching which could still take time).  On Windows at
> > least I definitely noticed this effect -- wait some before fsync'ing
> > and it's net/net much less costly.
> Yup. In fact you can just hold on to the latest commit for N seconds,
> then switch to the new latest commit.
> OS will fsync everything for you.
>
>
> I'm just playing around with stupid idea. I'd like to have NRT
> look-alike without binding readers and writers. :)
> Right now it's probably best for me to save my time and cut over to current
> NRT.
> But. An important lesson was learnt - no fsyncing blows up your index
> on out-of-disk-space.
>
> --
> Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
> Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
> ICQ: 104465785
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

Re: Getting fsync out of the loop

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Thu, Apr 8, 2010 at 6:21 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>> But, IW doesn't let you "hold on to" checkpoints... only to commits.
>>
>> Ie SnapshotDP will only "see" actual commit/close calls, not
>> intermediate checkpoints like a random segment merge completing, a
>> flush happening, etc.
>>
>> Or... maybe you would in fact call commit frequently from the main
>> threads (but with fsync disabled), and then your DP holds onto these
>> "fake commits", periodically picking one of them to do the "real"
>> fsync'ing?
> Yeah, that's exactly what I tried to describe in my initial post :)

Ahh ok then it makes more sense.  But still you shouldn't commit that
often (even with fake fsync) since it must flush the segment.

>>>>> I'm just playing around with stupid idea. I'd like to have NRT
>>>>> look-alike without binding readers and writers. :)
>>>> I see... well binding durability & visibility will always be costly.
>>>> This is why Lucene decouples them (by making NRT readers available).
>>> My experiments do the same, essentially.
>>> But after I understood that to perform deletions IW has to load term indexes
>>> anyway, I'm almost ready to give up and go for intertwined IW/IR mess :)
>> Hey if you really think it's a mess, post a patch that cleans it up :)
> Uh oh. Let me finish current one, first.

Heh, yes :)

> Second - I don't know yet how
> this should look like.
> Something along the lines of deletions/norms writers being extracted
> from segment reader
> and reader pool being made external to IW??

Yeah, reader pool should be pulled out of IW, and I think IW should be
split into "that which manages the segment infos", "that which
adds/deletes docs", and "the rest" (merging, addIndexes*)?  (There's
an issue open for this refactoring...).

I'm not sure about deletions/norms writers being extracted from SR....
I think delete ops would still go through IW?

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Earwin Burrfoot <ea...@gmail.com>.
> But, IW doesn't let you "hold on to" checkpoints... only to commits.
>
> Ie SnapshotDP will only "see" actual commit/close calls, not
> intermediate checkpoints like a random segment merge completing, a
> flush happening, etc.
>
> Or... maybe you would in fact call commit frequently from the main
> threads (but with fsync disabled), and then your DP holds onto these
> "fake commits", periodically picking one of them to do the "real"
> fsync'ing?
Yeah, that's exactly what I tried to describe in my initial post :)

>>>> I'm just playing around with stupid idea. I'd like to have NRT
>>>> look-alike without binding readers and writers. :)
>>> I see... well binding durability & visibility will always be costly.
>>> This is why Lucene decouples them (by making NRT readers available).
>> My experiments do the same, essentially.
>> But after I understood that to perform deletions IW has to load term indexes
>> anyway, I'm almost ready to give up and go for intertwined IW/IR mess :)
> Hey if you really think it's a mess, post a patch that cleans it up :)
Uh oh. Let me finish the current one first. Second, I don't know yet what
this should look like.
Something along the lines of the deletions/norms writers being extracted
from the segment reader, and the reader pool being made external to IW??

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Wed, Apr 7, 2010 at 3:27 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>> No, this doesn't make sense.  The OS detects a disk full on accepting
>> the write into the write cache, not [later] on flushing the write
>> cache to disk.  If the OS accepts the write, then disk is not full (ie
>> flushing the cache will succeed, unless some other not-disk-full
>> problem happens).
>>
>> Hmmm, at least, normally.  What OS/IO system were you on when you saw
>> corruption due to disk full when fsync is disabled?
>>
>> I'm still skeptical that disk full even with fsync disabled can lead
>> to corruption.... I'd like to see some concrete proof :)
>
> Linux 2.6.30-1-amd64, ext3, simple scsi drive

Hmmmmm.  Linux should detect disk full on the initial write.

> I checked with our resident DB brainiac, he says such things are possible.
>
> Okay, I'm not 100% sure this is the cause of my corruptions. It just happened
> that when the index got corrupted, disk space was also used up - several times.
> I had that silent-fail-to-write theory and checked it up with some knowledgeable
> people. Even if they are right, I can be mistaken and the root cause
> is different.

OK... if you get a more concrete case where disk full causes
corruption when you disable fsync, please post details back.  From
what I understand this should never happen.

>> You're mixing up terminology a bit here -- you can't "hold on to the
>> latest commit then switch to it".  A commit (as sent to the deletion
>> policy) means a *real* commit (ie, IW.commit or IW.close was called).
>> So I think your BG thread would simply be calling IW.commit every N
>> seconds?
> Under "hold on to" I meant - keep from being deleted, like SnapshotDP does.

But, IW doesn't let you "hold on to" checkpoints... only to commits.

Ie SnapshotDP will only "see" actual commit/close calls, not
intermediate checkpoints like a random segment merge completing, a
flush happening, etc.

Or... maybe you would in fact call commit frequently from the main
threads (but with fsync disabled), and then your DP holds onto these
"fake commits", periodically picking one of them to do the "real"
fsync'ing?

>>> I'm just playing around with stupid idea. I'd like to have NRT
>>> look-alike without binding readers and writers. :)
>> I see... well binding durability & visibility will always be costly.
>> This is why Lucene decouples them (by making NRT readers available).
> My experiments do the same, essentially.
>
> But after I understood that to perform deletions IW has to load term indexes
> anyway, I'm almost ready to give up and go for intertwined IW/IR mess :)

Hey if you really think it's a mess, post a patch that cleans it up :)

>> BTW, if you know your OS/IO system always persists cached writes w/in
>> N seconds, a safe way to avoid fsync is to use a by-time expiring
>> deletion policy.  Ie, a commit stays alive as long as its age is less
>> than X... DP's unit test has such a policy.  But you better really
>> know for sure that the OS/IO system guarantee that :)
> Yeah. I thought of it, but it is even more shady :)

I agree.  And even if you know you're on Linux, and that your pdflush
flushes after X seconds, you still have the IO system to contend with.

Best to stick with fsync, commit only for safety as needed by the app,
and use NRT for fast visibility.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Earwin Burrfoot <ea...@gmail.com>.
> No, this doesn't make sense.  The OS detects a disk full on accepting
> the write into the write cache, not [later] on flushing the write
> cache to disk.  If the OS accepts the write, then disk is not full (ie
> flushing the cache will succeed, unless some other not-disk-full
> problem happens).
>
> Hmmm, at least, normally.  What OS/IO system were you on when you saw
> corruption due to disk full when fsync is disabled?
>
> I'm still skeptical that disk full even with fsync disabled can lead
> to corruption.... I'd like to see some concrete proof :)

Linux 2.6.30-1-amd64, ext3, a simple SCSI drive.
I checked with our resident DB brainiac; he says such things are possible.

Okay, I'm not 100% sure this is the cause of my corruptions. It just happened
that when the index got corrupted, disk space was also used up - several times.
I had that silent-fail-to-write theory and checked it with some knowledgeable
people. Even if they are right, I may be mistaken and the root cause may be
different.

> You're mixing up terminology a bit here -- you can't "hold on to the
> latest commit then switch to it".  A commit (as sent to the deletion
> policy) means a *real* commit (ie, IW.commit or IW.close was called).
> So I think your BG thread would simply be calling IW.commit every N
> seconds?
Under "hold on to" I meant - keep from being deleted, like SnapshotDP does.

>> I'm just playing around with stupid idea. I'd like to have NRT
>> look-alike without binding readers and writers. :)
> I see... well binding durability & visibility will always be costly.
> This is why Lucene decouples them (by making NRT readers available).
My experiments do the same, essentially.

But after I understood that to perform deletions IW has to load term indexes
anyway, I'm almost ready to give up and go for intertwined IW/IR mess :)

> BTW, if you know your OS/IO system always persists cached writes w/in
> N seconds, a safe way to avoid fsync is to use a by-time expiring
> deletion policy.  Ie, a commit stays alive as long as its age is less
> than X... DP's unit test has such a policy.  But you better really
> know for sure that the OS/IO system guarantee that :)
Yeah. I thought of it, but it is even more shady :)


-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, Apr 6, 2010 at 7:26 PM, Earwin Burrfoot <ea...@gmail.com> wrote:
>> Running out of disk space with fsync disabled won't lead to corruption.
>> Even kill -9 the JRE process with fsync disabled won't corrupt.
>> In these cases index just falls back to last successful commit.
>>
>> It's "only" power loss / OS / machine crash where you need fsync to
>> avoid possible corruption (corruption may not even occur w/o fsync if
>> you "get lucky").
>
> Sorry to disappoint you, but running out of disk space is worse than kill -9.
> You can write down the file (to cache in fact), close it, all without
> getting any
> exceptions. And then it won't get flushed to disk because the disk is full.
> This can happen to segments file (and old one is deleted with default deletion
> policy). This can happen to fat freq/prox files mentioned in segments file
> (and yeah, the old segments file is deleted, so no falling back).

No, this doesn't make sense.  The OS detects a disk full on accepting
the write into the write cache, not [later] on flushing the write
cache to disk.  If the OS accepts the write, then disk is not full (ie
flushing the cache will succeed, unless some other not-disk-full
problem happens).

Hmmm, at least, normally.  What OS/IO system were you on when you saw
corruption due to disk full when fsync is disabled?

>> What if your background thread simply committed every couple of minutes?
>> What's the difference between taking the snapshot (which means you had
>> to call commit previously) and commit it, to call iw.commit by a background merge?
> --
>> But: why do you need to commit so often?
> To see stuff on reopen? Yes, I know about NRT.
>
>> You've reinvented autocommit=true!
> ?? I'm doing regular commits, syncing down every Nth of it.
>
>> Doesn't this just BG the syncing?  Ie you could make a dedicated
>> thread to do this.
>
> Yes, exactly, this BGs the syncing to a dedicated thread. Threads
> doing indexation/merging can continue unhampered.

OK.  Or you can index with N+1 threads, and each indexer thread does
the commit if it's time...

>> One possible win with this approach is.... the cost of fsync should go
>> way down the longer you wait after writing bytes to the file and
>> before calling fsync.  This is because typically OS write caches
>> expire by time (eg 30 seconds) so if you wait long enough the bytes
>> will already at least be delivered to the IO system (but the IO system
>> can do further caching which could still take time).  On Windows at
>> least I definitely noticed this effect -- wait some before fsync'ing
>> and it's net/net much less costly.
>
> Yup. In fact you can just hold on to the latest commit for N seconds,
> then switch to the new latest commit.
> OS will fsync everything for you.

You're mixing up terminology a bit here -- you can't "hold on to the
latest commit then switch to it".  A commit (as sent to the deletion
policy) means a *real* commit (ie, IW.commit or IW.close was called).
So I think your BG thread would simply be calling IW.commit every N
seconds?

> I'm just playing around with stupid idea. I'd like to have NRT
> look-alike without binding readers and writers. :)

I see... well binding durability & visibility will always be costly.
This is why Lucene decouples them (by making NRT readers available).

> Right now it's probably best for me to save my time and cut over to current NRT.
> But. An important lesson was learnt - no fsyncing blows up your index
> on out-of-disk-space.

I'm still skeptical that disk full even with fsync disabled can lead
to corruption.... I'd like to see some concrete proof :)

BTW, if you know your OS/IO system always persists cached writes w/in
N seconds, a safe way to avoid fsync is to use a by-time expiring
deletion policy.  Ie, a commit stays alive as long as its age is less
than X... DP's unit test has such a policy.  But you better really
know for sure that the OS/IO system guarantee that :)
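
Something like this untested sketch, along the lines of the policy in the
deletion-policy unit test (using the segments file's mtime via
Directory.fileModified() as the commit's age is an assumption):

import java.io.IOException;
import java.util.List;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexDeletionPolicy;

public class ExpireByTimeDeletionPolicy implements IndexDeletionPolicy {
  private final long maxAgeMillis;   // X: how long a commit stays alive

  public ExpireByTimeDeletionPolicy(long maxAgeMillis) {
    this.maxAgeMillis = maxAgeMillis;
  }

  public void onInit(List<? extends IndexCommit> commits) throws IOException {
    onCommit(commits);
  }

  public void onCommit(List<? extends IndexCommit> commits) throws IOException {
    IndexCommit newest = commits.get(commits.size() - 1);
    long now = System.currentTimeMillis();
    for (IndexCommit c : commits) {
      if (c == newest) continue;   // never delete the newest commit
      // approximate the commit's age by its segments file's modification time
      long modified = c.getDirectory().fileModified(c.getSegmentsFileName());
      if (now - modified > maxAgeMillis) {
        c.delete();
      }
    }
  }
}

X would have to sit comfortably above whatever write-back interval you are
trusting, and as said above, you'd better be sure the OS/IO system actually
honours it.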

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Earwin Burrfoot <ea...@gmail.com>.
> Running out of disk space with fsync disabled won't lead to corruption.
> Even kill -9 the JRE process with fsync disabled won't corrupt.
> In these cases index just falls back to last successful commit.
>
> It's "only" power loss / OS / machine crash where you need fsync to
> avoid possible corruption (corruption may not even occur w/o fsync if
> you "get lucky").

Sorry to disappoint you, but running out of disk space is worse than kill -9.
You can write the file out (to the cache, in fact) and close it, all without
getting any exceptions. And then it never gets flushed to disk, because the
disk is full.
This can happen to the segments file (and the old one is deleted with the
default deletion policy). It can happen to the fat freq/prox files referenced
by the segments file (and yeah, the old segments file is deleted, so there is
no falling back).

> What if your background thread simply committed every couple of minutes?
> What's the difference between taking the snapshot (which means you had
> to call commit previously) and commit it, to call iw.commit by a background merge?
--
> But: why do you need to commit so often?
To see stuff on reopen? Yes, I know about NRT.

> You've reinvented autocommit=true!
?? I'm doing regular commits, and syncing only every Nth one.

> Doesn't this just BG the syncing?  Ie you could make a dedicated
> thread to do this.
Yes, exactly, this BGs the syncing to a dedicated thread. Threads
doing indexing/merging can continue unhampered.

> One possible win with this approach is.... the cost of fsync should go
> way down the longer you wait after writing bytes to the file and
> before calling fsync.  This is because typically OS write caches
> expire by time (eg 30 seconds) so if you wait long enough the bytes
> will already at least be delivered to the IO system (but the IO system
> can do further caching which could still take time).  On Windows at
> least I definitely noticed this effect -- wait some before fsync'ing
> and it's net/net much less costly.
Yup. In fact you can just hold on to the latest commit for N seconds,
then switch to the new latest commit.
The OS will fsync everything for you.


I'm just playing around with a stupid idea. I'd like to have an NRT
look-alike without binding readers and writers. :)
Right now it's probably best for me to save my time and cut over to the
current NRT.
But. An important lesson was learnt - without fsyncing, your index can blow
up on out-of-disk-space.

-- 
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Getting fsync out of the loop

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, Apr 6, 2010 at 10:11 AM, Earwin Burrfoot <ea...@gmail.com> wrote:

> So, I want to pump my IndexWriter hard and fast with documents.

Nice.

> Removing fsync from FSDirectory helps. But for that I pay with possibility of
> index corruption, not only if my node suddenly loses
> power/kernel panics, but also if it
> runs out of disk space (which happens more frequently).

Running out of disk space with fsync disabled won't lead to corruption.

Even kill -9 the JRE process with fsync disabled won't corrupt.

In these cases the index just falls back to the last successful commit.

It's "only" power loss / OS / machine crash where you need fsync to
avoid possible corruption (corruption may not even occur w/o fsync if
you "get lucky").

But: why do you need to commit so often?

> I invented the following solution:
> We write a special deletion policy that resembles SnapshotDeletionPolicy.
> At all times it takes hold of "current synced commit" and preserves
> it. Once every N minutes
> a special thread takes latest commit, syncs it and nominates as
> "current synced commit". The
> previous one gets deleted.
>
> Now we are disaster-proof, and do fsync asynchronously from indexing
> threads. We pay for this with
> somewhat bigger transient disk usage, and probably losing a few
> minutes worth of updates in
> case of a crash, but that's acceptable.
>
> How does this sound?

You've reinvented autocommit=true!

Doesn't this just BG the syncing?  Ie you could make a dedicated
thread to do this.

One possible win with this approach is.... the cost of fsync should go
way down the longer you wait after writing bytes to the file and
before calling fsync.  This is because typically OS write caches
expire by time (eg 30 seconds) so if you wait long enough the bytes
will already at least be delivered to the IO system (but the IO system
can do further caching which could still take time).  On Windows at
least I definitely noticed this effect -- wait some before fsync'ing
and it's net/net much less costly.
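
A toy way to see this effect (untested sketch, plain java.io, nothing
Lucene-specific; absolute numbers will vary a lot by OS, filesystem and
drive):

import java.io.File;
import java.io.FileOutputStream;

public class FsyncDelayTest {
  public static void main(String[] args) throws Exception {
    byte[] data = new byte[32 * 1024 * 1024];            // 32 MB of zeros

    long immediate = writeThenSync("immediate.bin", data, 0);
    long delayed = writeThenSync("delayed.bin", data, 30 * 1000);
    System.out.println("fsync right after write: " + immediate + " ms");
    System.out.println("fsync after 30s wait:    " + delayed + " ms");
  }

  private static long writeThenSync(String name, byte[] data, long waitMillis)
      throws Exception {
    FileOutputStream out = new FileOutputStream(name);
    out.write(data);                                     // lands in the OS write cache
    if (waitMillis > 0) {
      Thread.sleep(waitMillis);                          // give the cache time to expire
    }
    long start = System.currentTimeMillis();
    out.getFD().sync();                                  // the fsync being measured
    long elapsed = System.currentTimeMillis() - start;
    out.close();
    new File(name).delete();                             // clean up the scratch file
    return elapsed;
  }
}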

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org