Posted to dev@couchdb.apache.org by Adam Kocoloski <ad...@gmail.com> on 2010/04/12 04:54:26 UTC
ensuring an update_seq is used at most once
Currently a DB update_seq can be reused if there's a power failure before the header is sync'ed to disk. This adds some extra complexity and overhead to the replicator, which must confirm before saving a checkpoint that the source update_seq it is recording will not be reused later. It does this by issuing an ensure_full_commit call to the source DB, which may be a pretty expensive operation if the source has a constant write load.
Should we try to fix that? One way to do so would be to start at a significantly higher update_seq than the committed one whenever the DB is opened after an "unclean" shutdown; that is, one where the DB header is not the last term stored in the file. Although, I suppose that's not an ironclad test for data loss -- it might be the case that none of the lost updates were written to the file. I suppose we could "bump" the update_seq on every startup.
Adam
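A minimal sketch of the "bump on open" idea from the message above, in Python rather than CouchDB's Erlang. It is a toy model, not CouchDB code: the function names, the header_is_last_term flag, and the BUMP value are all assumptions made for illustration.

```python
# Toy model of "bump the update_seq on open". All names here are
# hypothetical; BUMP is an arbitrary illustrative value.
BUMP = 1000


def seq_on_open(committed_update_seq, header_is_last_term,
                bump=BUMP, always_bump=False):
    """Return the update_seq a DB would resume from when its file is opened."""
    # "Unclean" here means the header is not the last term in the file.
    unclean = not header_is_last_term
    if always_bump or unclean:
        return committed_update_seq + bump
    return committed_update_seq


def reuse_possible(lost_updates, bump=BUMP):
    """True if a seq handed out before the crash could be handed out again."""
    # Seqs C+1 .. C+lost_updates were issued; after the bump the next
    # seq issued is C+bump+1, so reuse needs lost_updates > bump.
    return lost_updates > bump
```

With a committed seq of 100 and an unclean open, seq_on_open(100, False) resumes at 1100, so lost seqs such as 101..105 are not handed out again. reuse_possible makes the remaining gap explicit: if more than BUMP updates were lost, the bump alone does not prevent reuse.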
Re: ensuring an update_seq is used at most once
Posted by Adam Kocoloski <ko...@apache.org>.
Yep. Db#db.update_seq is not always the same as Db#db.committed_update_seq when delayed_commits are on.
Adam
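To make the gap between the two counters concrete, here is a small Python sketch. It is a toy model under stated assumptions, not the replicator's actual code: SourceDb, write, and checkpoint are hypothetical names, while update_seq, committed_update_seq, and ensure_full_commit are borrowed from this thread.

```python
class SourceDb:
    """Toy source DB with delayed commits (hypothetical model)."""

    def __init__(self):
        self.update_seq = 0            # advances on every write
        self.committed_update_seq = 0  # advances only on a full commit

    def write(self):
        # The update is visible (and replicable) before it is committed.
        self.update_seq += 1
        return self.update_seq

    def ensure_full_commit(self):
        # Stands in for syncing the header to disk.
        self.committed_update_seq = self.update_seq


def checkpoint(db, seq_seen):
    """Return a source seq that is safe to record as a checkpoint."""
    # Force a commit first, so seq_seen cannot be reused after a crash.
    db.ensure_full_commit()
    assert seq_seen <= db.committed_update_seq
    return seq_seen
```

After three calls to write(), update_seq is 3 while committed_update_seq is still 0; that window is where a replicator can read updates that a power failure would discard.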
On Apr 12, 2010, at 8:54 AM, Paul Davis wrote:
> An idle curiosity, is it ever possible to replicate something that has
> been written to disk before a header is flushed?
Re: ensuring an update_seq is used at most once
Posted by Paul Davis <pa...@gmail.com>.
An idle curiosity, is it ever possible to replicate something that has
been written to disk before a header is flushed?
On Mon, Apr 12, 2010 at 8:46 AM, Adam Kocoloski <ko...@apache.org> wrote:
> Yep, your analysis is dead-on, and is a more complete solution than what I propose. Best,
>
> Adam
Re: ensuring an update_seq is used at most once
Posted by Adam Kocoloski <ko...@apache.org>.
Yep, your analysis is dead-on, and is a more complete solution than what I propose. Best,
Adam
On Apr 12, 2010, at 4:51 AM, Robert Newson wrote:
> Would it be safer to have a low- and high- watermark for the
> update_seq in memory? What I mean is that the db writer will never
> write out an update_seq that is N higher than the last committed one;
> if it is forced to do so, to permit a write, it then fsync's and
> resets high_seq to last_committed_seq. This way you can genuinely
> ensure that you don't reuse an update_seq. In practice we could allow
> a large delta, one that is larger than the number of fsyncs we expect
> to manage in the commit interval.
>
> Your idea to just bump the update_seq "significantly" mostly pans out
> (I know a system that does precisely this) but it would be a data loss
> scenario when it doesn't pan out.
>
> B.
Re: ensuring an update_seq is used at most once
Posted by Robert Newson <ro...@gmail.com>.
Would it be safer to have a low and a high watermark for the
update_seq in memory? What I mean is that the db writer will never
write out an update_seq that is N higher than the last committed one;
if it is forced to do so, to permit a write, it then fsync's and
resets high_seq to last_committed_seq. This way you can genuinely
ensure that you don't reuse an update_seq. In practice we could allow
a large delta, one that is larger than the number of fsyncs we expect
to manage in the commit interval.
Your idea to just bump the update_seq "significantly" mostly pans out
(I know a system that does precisely this) but it would be a data loss
scenario when it doesn't pan out.
B.
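One possible reading of the watermark scheme above, sketched in Python. This is an interpretation rather than a definitive design: the class and method names are hypothetical, max_delta plays the role of N, and the choice to resume at committed_seq + max_delta after a crash is an assumption about how the high watermark would be used on recovery.

```python
class WatermarkedSeq:
    """Toy allocator that never runs more than max_delta past the commit."""

    def __init__(self, max_delta):
        self.max_delta = max_delta
        self.committed_seq = 0  # stands in for the fsync'ed header value
        self.update_seq = 0     # in-memory only; lost on a crash
        self.fsync_count = 0

    def commit(self):
        # Stands in for writing the header and calling fsync.
        self.committed_seq = self.update_seq
        self.fsync_count += 1

    def next_seq(self):
        # Force a commit rather than exceed the high watermark.
        if self.update_seq - self.committed_seq >= self.max_delta:
            self.commit()
        self.update_seq += 1
        return self.update_seq

    def recover_after_crash(self):
        # Every seq issued before the crash was <= committed_seq + max_delta,
        # so resuming from that bound cannot reuse any of them.
        self.update_seq = self.committed_seq + self.max_delta
        self.commit()
```

The invariant is that update_seq - committed_seq never exceeds max_delta, which is what turns the "significant" bump into a guaranteed bound. A larger max_delta means fewer forced fsyncs but a larger jump in update_seq after a crash.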
On Mon, Apr 12, 2010 at 3:54 AM, Adam Kocoloski
<ad...@gmail.com> wrote:
> Currently a DB update_seq can be reused if there's a power failure before the header is sync'ed to disk. This adds some extra complexity and overhead to the replicator, which must confirm before saving a checkpoint that the source update_seq it is recording will not be reused later. It does this by issuing an ensure_full_commit call to the source DB, which may be a pretty expensive operation if the source has a constant write load.
>
> Should we try to fix that? One way to do so would be to start at a significantly higher update_seq than the committed one whenever the DB is opened after an "unclean" shutdown; that is, one where the DB header is not the last term stored in the file. Although, I suppose that's not an ironclad test for data loss -- it might be the case that none of the lost updates were written to the file. I suppose we could "bump" the update_seq on every startup.
>
> Adam
>
>