Posted to commits@bookkeeper.apache.org by "hangc0276 (via GitHub)" <gi...@apache.org> on 2023/04/15 04:52:15 UTC

[GitHub] [bookkeeper] hangc0276 commented on pull request #3917: Fix ledger replicated failed blocks bookie decommission process

hangc0276 commented on PR #3917:
URL: https://github.com/apache/bookkeeper/pull/3917#issuecomment-1509508411

   > > How the lastAddConfirm was generated
   > > Due to the ledger being in the OPEN state, the ledger handle will send a readLAC request to get the ledger's lastAddConfirm.
   > > For the above case, if bk1 holds the max entry 14, bk2 holds the max entry 13, and bk3 holds the max entry 14 but is lost, the lastAddConfirm that the LedgerHandle gets will be 13, not 14.
   > 
   > I have a question about this.
   > 
   > ```
   > public void checkLedger(final LedgerHandle lh,
   >                             final GenericCallback<Set<LedgerFragment>> cb,
   >                             long percentageOfLedgerFragmentToBeVerified)
   > ```
   > 
   > There are two places to invoke LedgerChecker#checkLedger.
   > 
   > 1. https://github.com/apache/bookkeeper/blob/35e9da9b55b5d44459d3421e8704be47afc6f914/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/AuditorCheckAllLedgersTask.java#L196-L212
   > 
   > 2. https://github.com/apache/bookkeeper/blob/35e9da9b55b5d44459d3421e8704be47afc6f914/bookkeeper-server/src/main/java/org/apache/bookkeeper/replication/ReplicationWorker.java#L449-L451
   > 
   > Both places use openLedgerNoRecovery to open the LedgerHandle.
   > 
   > Opening the ledger uses lh.asyncReadLastConfirmed() to get the LAC.
   > 
   > https://github.com/apache/bookkeeper/blob/35e9da9b55b5d44459d3421e8704be47afc6f914/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerOpenOp.java#L222-L245
   > 
   > Since we use the V2 protocol, it invokes `asyncReadPiggybackLastConfirmed`.
   > 
   > https://github.com/apache/bookkeeper/blob/35e9da9b55b5d44459d3421e8704be47afc6f914/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L1388-L1395
   > 
   > In asyncReadPiggybackLastConfirmed, it invokes ReadLastConfirmedOp#initiate() to get the LAC.
   > 
   > https://github.com/apache/bookkeeper/blob/35e9da9b55b5d44459d3421e8704be47afc6f914/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L1422-L1428
   > 
   > ReadLastConfirmedOp reads from all bookies in the currentEnsemble. The currentEnsemble is `15=[bk1:3181, bk2:3181, bk4:3181]`, not `0=[bk1:3181, bk2:3181, bk3:3181]`, so it sends the readLac RPC to bk1, bk2, and bk4 and gets responses from those three bookies. In ReadLastConfirmedOp#readEntryComplete, it compares the responses from the different bookies and picks the max LAC response to override maxRecoveredData. It then invokes the callback with maxRecoveredData.
   > 
   > So if the bk1 lac is 14, the ledgerHandle lac should be 14, not 13.
   > 
   > https://github.com/apache/bookkeeper/blob/35e9da9b55b5d44459d3421e8704be47afc6f914/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadLastConfirmedOp.java#L97-L154
   > 
   > Lines 108-112 pick the max LAC to override maxRecoveredData.
   > 
   > Lines 137-148 use maxRecoveredData to invoke the callback.
   
   @horizonzy Yes, you are right. I have updated the description; please take a look, thanks.
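
For readers following the thread, the max-LAC selection described in the quoted comment can be sketched roughly as below. This is a simplified illustration, not the actual ReadLastConfirmedOp code: the class `MaxLacSketch`, the method `pickMaxLac`, the `INVALID_ENTRY_ID` sentinel, and the LAC value used for bk4 are all assumptions made for the example. Only the bk1 and bk2 values (14 and 13) come from the discussion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MaxLacSketch {
    // Assumed sentinel meaning "no LAC seen yet"; a stand-in for the
    // initial state of maxRecoveredData in the real client.
    static final long INVALID_ENTRY_ID = -1L;

    // Hypothetical helper: keep the largest LAC among all bookie responses,
    // mirroring how the quoted comment describes maxRecoveredData being
    // overridden whenever a response carries a larger LAC.
    static long pickMaxLac(Map<String, Long> lacByBookie) {
        long maxLac = INVALID_ENTRY_ID;
        for (long lac : lacByBookie.values()) {
            if (lac > maxLac) {
                maxLac = lac;
            }
        }
        return maxLac;
    }

    public static void main(String[] args) {
        // Responses from the current ensemble [bk1, bk2, bk4].
        Map<String, Long> responses = new LinkedHashMap<>();
        responses.put("bk1:3181", 14L); // from the discussion
        responses.put("bk2:3181", 13L); // from the discussion
        responses.put("bk4:3181", -1L); // assumed value, not stated in the thread

        System.out.println(pickMaxLac(responses)); // prints 14
    }
}
```

Under these assumptions the result is 14, which matches the point made in the quoted comment: as long as bk1 reports 14, the LedgerHandle LAC is 14 and not 13.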


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org