You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@bookkeeper.apache.org by "Flavio Junqueira (Created) (JIRA)" <ji...@apache.org> on 2011/11/15 19:22:51 UTC

[jira] [Created] (BOOKKEEPER-112) Fix exception todo in BookieRecoveryTest

Fix exception todo in BookieRecoveryTest
----------------------------------------

                 Key: BOOKKEEPER-112
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
             Project: Bookkeeper
          Issue Type: Improvement
            Reporter: Flavio Junqueira
            Assignee: Ivan Kelly
             Fix For: 4.0.0


Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated BOOKKEEPER-112:
----------------------------------

    Assignee: Sijie Guo  (was: Ivan Kelly)
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213661#comment-13213661 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-02-22 14:43:04.529989)


Review request for bookkeeper.


Changes
-------

attach a new patch to remove excludedBookies related codes. 


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment:     (was: BK-112.patch)
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Flavio Junqueira (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234591#comment-13234591 ] 

Flavio Junqueira commented on BOOKKEEPER-112:
---------------------------------------------

There are two occasions during the lifetime of a ledger that we use consensus through zookeeper: when we change the ensemble and when we close the ledger. By design, the former is only proposed by the writer, whereas the latter can be proposed by either the writer or another client trying to recover it. Trying to change the design so that we can have multiple clients proposing changes to the ensemble of a ledger would be difficult and prone to errors, so I suggest we keep this part of the design the way it is.

One way to perform the fencing for recovery and still keep the original design as is with respect to ensemble changes is to wait for the writer to mark in the ledger metadata such a change. Say that we externally detect that a bookie C has crashed. If the writer of a given ledger L removes C from its configuration and writes to ZooKeeper, then we can safely recover the ledger fragment of C for L. If the writer of L never makes such a change, then we assume that the writer can still talk to C, and consequently we don't care. 

We can monitor ensemble changes by watching the node in ZooKeeper. Using watches will make it difficult to implement an HBase backend for metadata as we proposed in another jira.

 
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BK-112.patch
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234262#comment-13234262 ] 

Ivan Kelly commented on BOOKKEEPER-112:
---------------------------------------


Flavio and I discussed this a little yesterday evening and after thinking about it for a little bit afterwards, the problem seems clearer to me now.

So, what is under discussion here is what do we do with the final fragment of an open ledger. This actually boils down to the same problem we have for fencing. By recovering the bookie, we are introducing a second writer, violating our 1-writer assumption. Since we now have more than one writer, it is necessary for there to be a consensus among all writers on where the ledger fragment ends. 

There are 3 situations which this open final fragment can occur.
 # the original writer crashed, then bookie crashed before ledger recovery
 # the original writer has the ledger open, but has not written anything since the bookie crashed
 # the bookie being recovered isn't actually down

One solution proposed by Flavio yesterday was that we should wait until no open final fragments exist before updating the ZK metadata. This works for 2. However, for 1 & 3, the recovery will wait forever.

One way im leaning towards now, is to replicate all entries in the fragment, and then ensure that no more entries are added to this specific fragment. This would require a change to how fencing works. Instead of fencing by ledger id, we would have to fence by fragment id. When the original writer tries to write, the write will fail, and then try to replace the bookie to which the write failed to write to (all bookies in this case). This deals with 1, because all entries written before the writer crash will be replicated. It works for 2, because the next write by the writer will see that its current ledger fragment is fenced *and* that the crashed bookie is down, so it will build a new ensemble and start writing a new fragment. It deals with 3, as the current fragment will be rereplicated and any further attempts by the writer will force it to rebuild its ensemble.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184882#comment-13184882 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

Review request for bookkeeper.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 8de20c9 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java c353e46 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 80e46b9 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java e1e8449 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v5
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220815#comment-13220815 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

> Anything between X & Y would be underreplicated, as B1 is no longer in the ensemble for the fragment.

I am doubting why B1 is still being written. If BookieRecoveryTool is run to re-replicate entries of B1, the admin guy would ensure B1 is down then do re-replication. If we want to ensure B1 is not alive, maybe we can do some checking before re-replication, like checking whether B1 in available list or not, trying setup connection and  do some operations.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219984#comment-13219984 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/#review5494
-----------------------------------------------------------



bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
<https://reviews.apache.org/r/3472/#comment11911>

    Why not use the SafeOrderedExecutor as BookKeeper does?



bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
<https://reviews.apache.org/r/3472/#comment11912>

    bitwise &, I guess you meant && here.



bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
<https://reviews.apache.org/r/3472/#comment11913>

    I dont like boolean flags like readForward being passed to the constructor here. I think it would be better to have a class called SingleFragmentCallbackWithForwardRead which inherits from SingleFragmentCallback. You can then overload processResult on that.


- Ivan


On 2012-02-28 11:09:19, Sijie Guo wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3472/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-02-28 11:09:19)
bq.  
bq.  
bq.  Review request for bookkeeper.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
bq.  
bq.  Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
bq.  Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
bq.  
bq.  
bq.  This addresses bug BOOKKEEPER-112.
bq.      https://issues.apache.org/jira/browse/BOOKKEEPER-112
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a 
bq.  
bq.  Diff: https://reviews.apache.org/r/3472/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sijie
bq.  
bq.


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Flavio Junqueira (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13156707#comment-13156707 ] 

Flavio Junqueira commented on BOOKKEEPER-112:
---------------------------------------------

I'm not sure I have a good way to guarantee that it always happens this way, but my feeling is that we shouldn't try to recover a bookie that participated in a ledger ensemble for which the ledger is still open. 

One way is to check that all ledger fragments to recover are from ledgers that have been closed already. We can do it by checking the ledger metadata stored in zookeeper. Bookie recovery proceeds only if all ledger are closed. 
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.0.0
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241187#comment-13241187 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/#review6515
-----------------------------------------------------------



bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
<https://reviews.apache.org/r/3472/#comment14178>

    Perhaps we should close lh here. It's a noop really, as a non recovery lh is a ReadOnlyLedgerHandle, where noop is just a stub. We should close it anyhow, just in case it changes to not being a noop in the future.


- Ivan


On 2012-03-23 15:30:15, Sijie Guo wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3472/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-23 15:30:15)
bq.  
bq.  
bq.  Review request for bookkeeper.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
bq.  
bq.  Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
bq.  Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
bq.  
bq.  
bq.  This addresses bug BOOKKEEPER-112.
bq.      https://issues.apache.org/jira/browse/BOOKKEEPER-112
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c 
bq.  
bq.  Diff: https://reviews.apache.org/r/3472/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sijie
bq.  
bq.


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v7

new patch to close opened non-recovery ledger handle as Ivan's suggestion.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v4

new patch addressed Ivan's comment.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215745#comment-13215745 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-02-24 17:09:54.127303)


Review request for bookkeeper.


Changes
-------

discussed with Ivan offline, it seems that it is OK to recover those opened / in recovery status ledgers. for those in recovery status ledgers, if the last entry on the failed bookie, the recovery add should change ensemble. the last entry would be written on the other bookies. the tricky case is for opened ledgers, since readEntries for openNoRecovery can't read more than lastAddConfirmed, we change ledgerHandle#readEntries to use PendingReadOp to skip boundary checking to try to read lastAddConfirmed + 1 to avoid missing replicating this entry.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated BOOKKEEPER-112:
----------------------------------

    Description: 
Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. 

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

  was:Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes()

     Issue Type: Bug  (was: Improvement)
        Summary: Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail  (was: Fix exception todo in BookieRecoveryTest)
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.0.0
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. 
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241189#comment-13241189 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------



bq.  On 2012-03-29 12:44:18, Ivan Kelly wrote:
bq.  > bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java, line 444
bq.  > <https://reviews.apache.org/r/3472/diff/7/?file=94988#file94988line444>
bq.  >
bq.  >     Perhaps we should close lh here. It's a noop really, as a non recovery lh is a ReadOnlyLedgerHandle, where noop is just a stub. We should close it anyhow, just in case it changes to not being a noop in the future.

otherwise the patch looks good to me. +1


- Ivan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/#review6515
-----------------------------------------------------------


On 2012-03-23 15:30:15, Sijie Guo wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3472/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-03-23 15:30:15)
bq.  
bq.  
bq.  Review request for bookkeeper.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
bq.  
bq.  Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
bq.  Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
bq.  
bq.  
bq.  This addresses bug BOOKKEEPER-112.
bq.      https://issues.apache.org/jira/browse/BOOKKEEPER-112
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c 
bq.  
bq.  Diff: https://reviews.apache.org/r/3472/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sijie
bq.  
bq.


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v3
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219969#comment-13219969 ] 

Ivan Kelly commented on BOOKKEEPER-112:
---------------------------------------

Another corner case came to me while reading through this. What happens if a client has a ledger open and is writing to it. The srcBookie(B1) isn't actually failed, so the client can continue to write to it. The recovery comes in, copies all the entries(up to entry X), and updates the metadata, but the client keep writing entries(up to entry Y), unaware of the recovery process. Anything between X & Y would be underreplicated, as B1 is no longer in the ensemble for the fragment. Im not sure what the best course of action would be in this case, maybe we can force an ensemble change, or force the ledger closed.  


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234418#comment-13234418 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

yeah, fencing by fragment id is a very good solution, although I am not so clear how fencing handles following case.

we have 5 bookies, bk[1-5], suppose bk5 is down, ledger l is opened, with last fragment is bk3, bk4, bk5. the recovery tool fence bk3 & bk4, further attempts from ledger l will force it to rebuild new ensemble, suppose (bk1, bk2, bk3). bk3 is a common bookie between the old ensemble and the new ensemble. how to deal with writing to such bookie? because bookie server has no knowledge about ledger distribution info, it doesn't know the writing is to an old ensemble or to a new ensemble. unless we also send fragment id in the addEntry request.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220818#comment-13220818 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

> There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger.

If we don't bookie recovery on open ledgers, the user may find each time he run BookieRecoveryTool it would not success since there was opened ledgers not being recovered. The user may have no idea when to stop running this tool.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243202#comment-13243202 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

oh, sorry Flavio. I may miss some words on the sync up meeting. 
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Flavio Junqueira (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Junqueira updated BOOKKEEPER-112:
----------------------------------------

    Attachment: bk-112.pdf

The attached document reflects the discussion Sijie, Ivan, and myself had on the design of bookie recovery to solve the issue described in this jira.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Flavio Junqueira (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Junqueira updated BOOKKEEPER-112:
----------------------------------------

    Attachment: bk-112.pdf
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BK-112.patch

Attach a patch.

The idea is when ledger handle encounters BadVersion metadata, it should reread a updated metadata, and compare it with the old one to check whether it can resolve the confliction or not. 

I will put this patch on review board for discussion. 
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v6

attach a new patch fixed according to the document.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243091#comment-13243091 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

committed as r1307743. thanks Ivan, Flavio for discussion. thanks Ivan for reviewing.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated BOOKKEEPER-112:
----------------------------------

    Fix Version/s:     (was: 4.0.0)
                   4.1.0

Moved to 4.1.0, as fixing this will be part of a large review of bookie recovery.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220843#comment-13220843 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-03-02 11:27:34.221927)


Review request for bookkeeper.


Changes
-------

new patch remove readForward flag from constructor. and also remove ScheduledExecutorService, since we don't need to care about long-chain, since the callback will be triggered in bookie client's netty thread not in same thread. 


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236518#comment-13236518 ] 

Sijie Guo commented on BOOKKEEPER-112:
--------------------------------------

I would modify the patch according to the document ASAP.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v2
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236678#comment-13236678 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-03-23 15:30:15.796027)


Review request for bookkeeper.


Changes
-------

modify the patch according to the document.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215762#comment-13215762 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/#review5324
-----------------------------------------------------------



bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
<https://reviews.apache.org/r/3472/#comment11620>

    I don't think going up to lastAddConfirmed + 1 is enough. lastAddConfirmed can trail behind the last written entry by an undefined amount. For example, if we asyncAdd 10 entries, they will each have the same lastAddConfirmed in the packet if the 10th is sent before the 1st is acknowledged.
    
    The main problem is that bookie recovery reads each entry in parallel, while for this last bit, we need to read sequentially. I think after each ledger fragment successfully completes, this should kick off another operation, which steps up from lastAddConfirmed until it gets NoSuchEntry.
    


- Ivan


On 2012-02-24 17:09:54, Sijie Guo wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3472/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-02-24 17:09:54)
bq.  
bq.  
bq.  Review request for bookkeeper.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
bq.  
bq.  Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
bq.  Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
bq.  
bq.  
bq.  This addresses bug BOOKKEEPER-112.
bq.      https://issues.apache.org/jira/browse/BOOKKEEPER-112
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a 
bq.  
bq.  Diff: https://reviews.apache.org/r/3472/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sijie
bq.  
bq.


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218069#comment-13218069 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-02-28 11:09:19.006294)


Review request for bookkeeper.


Changes
-------

adding a readforward operation if lastEnsemble need to be repaired, which addresses Ivan's comment.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java a94a0e5 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 99258ac 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BaseTestCase.java dada67a 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242546#comment-13242546 ] 

Ivan Kelly commented on BOOKKEEPER-112:
---------------------------------------

lgtm +1
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch_v8

attach a new patch to add missing line which releasing permit when submitting callback in PeadingReadOp, similar what BOOKKEEPER-186 did.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197889#comment-13197889 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/#review4745
-----------------------------------------------------------


I had a quick look over this. The exclude bookies stuff needs to be removed since BOOKKEEPER-23 gets rid of the need for it & it exposes internal details through the public api which is bad. The other fix in here looks ok, but I found it hard to pick it out with the excludeBookies stuff in there also.

- Ivan


On 2012-01-29 09:58:30, Sijie Guo wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3472/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-01-29 09:58:30)
bq.  
bq.  
bq.  Review request for bookkeeper.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
bq.  
bq.  Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
bq.  Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
bq.  
bq.  
bq.  This addresses bug BOOKKEEPER-112.
bq.      https://issues.apache.org/jira/browse/BOOKKEEPER-112
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/3472/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sijie
bq.  
bq.


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195727#comment-13195727 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-01-29 09:58:30.513187)


Review request for bookkeeper.


Changes
-------

we need to check the ledger metadata status before proceed recovery action.

for those OPENED ledgers,
1) whose last ensemble contains the failed bookie, we should not proceed recovery action. since we can't promise last entry to be fully replicated. (also there may be other side effects)
2) whose last ensemble doesn't contain the failed bookie, it is safe to proceed recovery action.

for those IN_RECOVERY ledgers, we have to check whether last ensemble contains the failed bookie. if it is, the recovery tool has to help closing this ledger, since the normal bookkeeper client may fail to close it. (a corn case: 3 bookies (bk1, bk2, bk3), quorum size 3, ensemble size 3. no entry is written. bk3 is failed. bk1 and bk2 returns NoEntry, bk3 returns HandleNotAvailable. ledger can't be closed.) 

for 2) case of OPENED ledgers, both PendingAddOp and BookKeeperAdmin needs to rereadMetadata when encountering BADVERSION and try to resolve such confliction to avoid #close it.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243097#comment-13243097 ] 

Hudson commented on BOOKKEEPER-112:
-----------------------------------

Integrated in bookkeeper-trunk #438 (See [https://builds.apache.org/job/bookkeeper-trunk/438/])
    BOOKKEEPER-112: Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail (sijie) (Revision 1307743)

     Result = ABORTED
sijie : 
Files : 
* /zookeeper/bookkeeper/trunk/CHANGES.txt
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java
* /zookeeper/bookkeeper/trunk/bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java

                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198462#comment-13198462 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------



bq.  On 2012-02-01 15:01:34, Ivan Kelly wrote:
bq.  > I had a quick look over this. The exclude bookies stuff needs to be removed since BOOKKEEPER-23 gets rid of the need for it & it exposes internal details through the public api which is bad. The other fix in here looks ok, but I found it hard to pick it out with the excludeBookies stuff in there also.

 if the failed bookie existed in the quorum set which maxAddConfirmed entry belongs to, readLastConfirmed could not succeed, either open and openNoRecovery would fail. (this issue has been reported in BOOKKEEPER-152, but it seems that it is hard to separate into two patches. so I put them in this patch.)

so recovery tool needs failed bookies information as hints to tell readLastConfirmed to skip failed bookies. I think to change excludedBookies related api to protected, not expose to public.

what is your opinion?


- Sijie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/#review4745
-----------------------------------------------------------


On 2012-01-29 09:58:30, Sijie Guo wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3472/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-01-29 09:58:30)
bq.  
bq.  
bq.  Review request for bookkeeper.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
bq.  
bq.  Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
bq.  Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()
bq.  
bq.  
bq.  This addresses bug BOOKKEEPER-112.
bq.      https://issues.apache.org/jira/browse/BOOKKEEPER-112
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeper.java 5bb37c3 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java 547e240 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java 56186ab 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java 4625bbb 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 29070eb 
bq.    bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java 43e999d 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java 8526db5 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 015e4e4 
bq.    bookkeeper-server/src/test/java/org/apache/bookkeeper/test/LedgerOpenTest.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/3472/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Sijie
bq.  
bq.


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Ivan Kelly (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Kelly updated BOOKKEEPER-112:
----------------------------------

    Description: 
Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

  was:
Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. 

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done

    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.0.0
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Sijie Guo (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-112:
---------------------------------

    Attachment: BOOKKEEPER-112.patch
    
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Ivan Kelly
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Flavio Junqueira (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243122#comment-13243122 ] 

Flavio Junqueira commented on BOOKKEEPER-112:
---------------------------------------------

Hi Sijie, I'm not done reviewing it and I'd like to before having it in. Is it ok?
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241208#comment-13241208 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-03-29 13:29:23.834764)


Review request for bookkeeper.


Changes
-------

close opened non-recovery ledger handle as Ivan's suggestion.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242288#comment-13242288 ] 

jiraposter@reviews.apache.org commented on BOOKKEEPER-112:
----------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3472/
-----------------------------------------------------------

(Updated 2012-03-30 12:12:54.459832)


Review request for bookkeeper.


Changes
-------

adding one line to release permit when submit callback in PendingReadOp. similar as BOOKKEEPER-186.


Summary
-------

Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.

Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()


This addresses bug BOOKKEEPER-112.
    https://issues.apache.org/jira/browse/BOOKKEEPER-112


Diffs (updated)
-----

  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/BookKeeperAdmin.java 37623dc 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java f71e53f 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerMetadata.java b403aa1 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerRecoveryOp.java c67a79c 
  bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java 539d6b2 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/BookieRecoveryTest.java b8923e8 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/client/TestFencing.java 7de1c10 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/BookKeeperClusterTestCase.java 0b882c6 
  bookkeeper-server/src/test/java/org/apache/bookkeeper/test/CloseTest.java e28d32c 

Diff: https://reviews.apache.org/r/3472/diff


Testing
-------


Thanks,

Sijie


                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4, BOOKKEEPER-112.patch_v5, BOOKKEEPER-112.patch_v6, BOOKKEEPER-112.patch_v7, BOOKKEEPER-112.patch_v8, bk-112.pdf, bk-112.pdf
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-112) Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail

Posted by "Flavio Junqueira (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220018#comment-13220018 ] 

Flavio Junqueira commented on BOOKKEEPER-112:
---------------------------------------------

There is a distinction between bookie recovery and ledger recovery. I don't think we should do bookie recovery on an open ledger.
                
> Bookie Recovery on an open ledger will cause LedgerHandle#close on that ledger to fail
> --------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-112
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-112
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Flavio Junqueira
>            Assignee: Sijie Guo
>             Fix For: 4.1.0
>
>         Attachments: BK-112.patch, BOOKKEEPER-112.patch, BOOKKEEPER-112.patch_v2, BOOKKEEPER-112.patch_v3, BOOKKEEPER-112.patch_v4
>
>
> Bookie recovery updates the ledger metadata in zookeeper. LedgerHandle will not get notified of this update, so it will try to write out its own ledger metadata, only to fail with KeeperException.BadVersion. This effectively fences all write operations on the LedgerHandle (close and addEntry). close will fail for obvious reasons. addEntry will fail once it gets to the failed bookie in the schedule, tries to write, fails, selects a new bookie and tries to update ledger metadata.
> Update Line 605, testSyncBookieRecoveryToRandomBookiesCheckForDupes(), when done
> Also, uncomment addEntry in TestFencing#testFencingInteractionWithBookieRecovery()

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira