Posted to dev@bookkeeper.apache.org by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org> on 2012/03/01 12:42:04 UTC

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219969#comment-13219969 ] 

Ivan Kelly commented on BOOKKEEPER-112:
---------------------------------------

Another corner case came to me while reading through this. What happens if a client has a ledger open and is writing to it? The srcBookie (B1) hasn't actually failed, so the client can continue to write to it. The recovery comes in, copies all the entries (up to entry X), and updates the metadata, but the client keeps writing entries (up to entry Y), unaware of the recovery process. Anything between X and Y would be under-replicated, as B1 is no longer in the ensemble for the fragment. I'm not sure what the best course of action would be in this case; maybe we can force an ensemble change, or force the ledger closed.
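
To make the window concrete, here is a toy in-memory model of the scenario above. It is only a sketch, not BookKeeper code: the class and method names are made up for illustration, it assumes every entry is written to every bookie in the ensemble (no striping), and it assumes the recovery replaces B1 with a hypothetical bookie B3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the race described above. All names are hypothetical. */
class RecoveryRaceSketch {

    /**
     * Simulates a client writing entries 0..x, a recovery that copies B1's
     * entries to B3 and rewrites the ensemble, and the same client writing
     * x+1..y with its stale view. Returns the entries that end up with fewer
     * replicas than the ensemble recorded in the metadata expects.
     */
    public static List<Long> underReplicated(long x, long y) {
        Map<String, Set<Long>> stored = new HashMap<>();
        for (String bookie : Arrays.asList("B1", "B2", "B3")) {
            stored.put(bookie, new TreeSet<>());
        }

        // The client's cached ensemble; it is never told about the recovery.
        List<String> clientEnsemble = Arrays.asList("B1", "B2");
        for (long id = 0; id <= x; id++) {
            for (String bookie : clientEnsemble) {
                stored.get(bookie).add(id);
            }
        }

        // Recovery: copy everything B1 holds (up to X) and swap B1 for B3.
        stored.get("B3").addAll(stored.get("B1"));
        List<String> metadataEnsemble = Arrays.asList("B3", "B2");

        // The client keeps writing to B1 and B2, up to entry Y.
        for (long id = x + 1; id <= y; id++) {
            for (String bookie : clientEnsemble) {
                stored.get(bookie).add(id);
            }
        }

        // Count replicas only on the bookies the metadata now lists.
        List<Long> result = new ArrayList<>();
        for (long id = 0; id <= y; id++) {
            int replicas = 0;
            for (String bookie : metadataEnsemble) {
                if (stored.get(bookie).contains(id)) {
                    replicas++;
                }
            }
            if (replicas < metadataEnsemble.size()) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(underReplicated(5, 8)); // prints [6, 7, 8]
    }
}
```

In this model the entries written after the copy exist on B1 and B2, but only B2 is still listed in the metadata, which is the under-replication I am worried about.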


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in ZooKeeper. The LedgerHandle is not notified of this update, so it will try to write out its own copy of the ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close fails because it has to write the ledger metadata. addEntry fails once it reaches the failed bookie in the schedule: it tries to write, fails, selects a new bookie, and then tries to update the ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
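
The failure mode in the description above can be sketched with a small versioned store. This is an illustrative model only: MetadataStore and BadVersionException here are hypothetical stand-ins, loosely modelled on the conditional-write behaviour of ZooKeeper's setData(path, data, version), and the ensemble strings are made up.

```java
/** Toy model of a version-checked metadata write. All names are hypothetical. */
class VersionedMetadataSketch {

    /** Stand-in for the bad-version error the description mentions. */
    static class BadVersionException extends RuntimeException {
        BadVersionException(int expected, int actual) {
            super("expected version " + expected + " but store is at " + actual);
        }
    }

    /** A single metadata record guarded by a version counter. */
    static class MetadataStore {
        private String data;
        private int version = 0;

        MetadataStore(String initial) {
            this.data = initial;
        }

        synchronized int version() {
            return version;
        }

        synchronized String data() {
            return data;
        }

        /** Writes only if the caller's expected version matches the stored one. */
        synchronized int write(String newData, int expectedVersion) {
            if (expectedVersion != version) {
                throw new BadVersionException(expectedVersion, version);
            }
            data = newData;
            version++;
            return version;
        }
    }

    /** Returns true if the handle's close-time write is rejected after a recovery. */
    public static boolean closeFailsAfterRecovery() {
        MetadataStore store = new MetadataStore("ensemble=[B1,B2]");

        // The handle remembers the version it saw when the ledger was opened.
        int handleVersion = store.version();

        // Recovery rewrites the ensemble; the handle is not notified.
        store.write("ensemble=[B3,B2]", store.version());

        try {
            // close (or an ensemble change from addEntry) uses the stale version.
            store.write("ensemble=[B1,B2],closed", handleVersion);
            return false;
        } catch (BadVersionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(closeFailsAfterRecovery()); // prints true
    }
}
```

Because the handle never refreshes its cached version, every later metadata write from it is rejected, which is why both close and addEntry end up failing.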
