Posted to dev@bookkeeper.apache.org by "Sijie Guo (JIRA)" <ji...@apache.org> on 2013/04/10 02:38:15 UTC

[jira] [Comment Edited] (BOOKKEEPER-572) Make the journal a write ahead log

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627125#comment-13627125 ] 

Sijie Guo edited comment on BOOKKEEPER-572 at 4/10/13 12:37 AM:
----------------------------------------------------------------

When I raised the reverse add sequence idea in BOOKKEEPER-447, one problem bothered me for a while: what happens if we add a buggy record to the journal successfully, but then fail to apply it to ledger storage? It looks like the buggy record would stay alive in the journal forever, since journal replay would end up throwing an exception every time it tried to apply this record. That would be bad if it happened, and I think this problem is still not addressed in the current patch. I am not sure whether it is a big problem now, or whether it will be in the future.

This issue is a kind of partial update, which can happen in any update-in-place system, whether a database or a filesystem. In the current bookie journal, we don't have enough information to undo the operation during recovery and bring storage back to a consistent state. That is why we discussed BOOKKEEPER-447 at such length.
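A minimal sketch of the failure mode described above, assuming a journal that is replayed in order against a storage layer that throws when it cannot apply a record. All class and method names here are hypothetical and are not the BookKeeper Journal or LedgerStorage API; a null payload simply stands in for "any record the storage cannot apply".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the BookKeeper API.
final class JournalReplaySketch {

    /** A journal record: ledger id, entry id, payload. */
    static final class Record {
        final long ledgerId;
        final long entryId;
        final byte[] data;

        Record(long ledgerId, long entryId, byte[] data) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.data = data;
        }
    }

    /** Stand-in for ledger storage that rejects records it cannot apply. */
    static final class Storage {
        final List<String> applied = new ArrayList<>();

        void apply(Record r) {
            // Assumption: a null payload models the "buggy record" case.
            if (r.data == null) {
                throw new IllegalStateException(
                        "cannot apply " + r.ledgerId + ":" + r.entryId);
            }
            applied.add(r.ledgerId + ":" + r.entryId);
        }
    }

    /**
     * Replays the journal in order. Returns the index of the first record
     * that failed to apply, or -1 if every record was applied.
     */
    static int replay(List<Record> journal, Storage storage) {
        for (int i = 0; i < journal.size(); i++) {
            try {
                storage.apply(journal.get(i));
            } catch (IllegalStateException e) {
                // Replay cannot move past this record, and the journal has
                // no undo information to skip or reverse it.
                return i;
            }
        }
        return -1;
    }
}
```

In this sketch, every restart replays the same journal and stops at the same index, so the records behind the bad one are never applied.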

Another way to address this issue is COW (copy-on-write), which is what ZFS and Btrfs use to address inconsistency problems in metadata and data. Since updates are not applied in place, it is easy to keep consistency by rolling back to the state at the previous checkpoint. I have started some hack work on implementing a log-structured ledger index, which 1) writes index pages in a more sequential way and 2) avoids updating index pages in place, which might also address the issue in BOOKKEEPER-447 more gracefully.
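A rough sketch of the copy-on-write idea, assuming an index that maps entry ids to entry log offsets. This is an illustration only, not the log-structured ledger index mentioned above; the class name and methods are hypothetical. For brevity it copies the whole map on each update, where a real design would presumably copy only the touched page.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not a BookKeeper class.
final class CowIndexSketch {
    // Current version of the index: entry id -> offset in the entry log.
    // A version is never mutated after it has been published.
    private Map<Long, Long> current = Collections.emptyMap();
    // Last version known to be consistent.
    private Map<Long, Long> lastCheckpoint = current;

    /** Writes into a fresh copy, then swaps the reference. */
    void put(long entryId, long offset) {
        Map<Long, Long> next = new HashMap<>(current);
        next.put(entryId, offset);
        current = Collections.unmodifiableMap(next);
    }

    /** Returns the offset for the entry, or null if it is not indexed. */
    Long get(long entryId) {
        return current.get(entryId);
    }

    /** Marks the current version as the consistent state to return to. */
    void checkpoint() {
        lastCheckpoint = current;
    }

    /** Discards every update made since the last checkpoint. */
    void rollback() {
        current = lastCheckpoint;
    }
}
```

Because the checkpointed version is never modified, rolling back is just a reference swap, and a half-applied update cannot leave the checkpointed state inconsistent.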


                
      was (Author: hustlmsp):
    When I raised the reverse add sequence idea in BOOKKEEPER-447, one problem bothered me for a while: what happens if we add a buggy record to the journal successfully, but then fail to apply it to ledger storage? It looks like the buggy record would stay alive in the journal forever, since journal replay would end up throwing an exception every time it tried to apply this record. That would be bad if it happened, and I think this problem is still not addressed in the current patch. I am not sure whether it is a big problem now, or whether it will be in the future.

This issue is a kind of partial update, which can happen in any update-in-place system, whether a database or a filesystem. A database usually has a redo log, which brings partially updated pages back to a consistent state during recovery. But the journal in the bookie server is not a redo log, so it is difficult to handle this case.

Another way to address this issue is COW (copy-on-write), which is what ZFS and Btrfs use to address inconsistency problems in metadata and data. Since updates are not applied in place, it is easy to keep consistency by rolling back to the state at the previous checkpoint. I have started some hack work on implementing a log-structured ledger index, which 1) writes index pages in a more sequential way and 2) avoids updating index pages in place, which might also address the issue in BOOKKEEPER-447 more gracefully.


                  
> Make the journal a write ahead log
> ----------------------------------
>
>                 Key: BOOKKEEPER-572
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-572
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 4.3.0
>
>         Attachments: 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 0003-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 0003-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, BookieServer-2013-02-22.snapshot
>
>
> A bookie adds to the LedgerStorage before writing to the journal. This is the fundamental problem behind BOOKKEEPER-447 and blocks a nice solution to BOOKKEEPER-530. By writing to the memory state before the journal, we expose ourselves to bugs if the bookie crashes before we write to the journal. The entry may exist in the index but not in the entry log, a situation which cannot be distinguished from an I/O error. The comments on BOOKKEEPER-447 go into more detail.
