Posted to commits@bookkeeper.apache.org by "hangc0276 (via GitHub)" <gi...@apache.org> on 2023/03/27 04:15:27 UTC

[GitHub] [bookkeeper] hangc0276 opened a new issue, #3887: Bookkeeper decommission may be blocked by ledgers that cannot be replicated

hangc0276 opened a new issue, #3887:
URL: https://github.com/apache/bookkeeper/issues/3887

   ### Motivation
   The current bookie decommission process is:
     - Step 1: Run `bin/bookkeeper shell listunderreplicated` to check whether any ledgers are in the under-replicated state.
     - Step 2: Once all of those ledgers have been replicated, stop the bookie and run `bin/bookkeeper shell decommissionbookie -bookieid <bookieaddress>` to trigger the decommission.
     - Step 3: Wait until all ledgers have been replicated; the bookie decommission is then complete.
   
   However, there is a bug in the decommissioning process.
   
   In Step 1, ledgers are marked as under-replicated by two Auditor checks:
     - Auditor lost-bookie check: triggered in two cases: a) a bookie has been lost and `lostBookieRecoveryDelay` has elapsed; b) periodically, every `auditorPeriodicBookieCheckInterval` (default: 24 hours).
     - Auditor check of all ledgers: triggered every `auditorPeriodicCheckInterval` (default: 7 days). It checks every ledger's fragments as follows:
       - For every fragment, compute the list of entries to read according to `auditorLedgerVerificationPercentage`. The default is `0`, which means only the first and last entries of the fragment are checked.
       - Read each entry in that list from all the bookies in the ensemble. If any read fails, mark the ledger as under-replicated.
   
   
   The output of `bin/bookkeeper shell listunderreplicated` therefore only reflects replicas that were already missing at the time of the last check, which can be up to 24 hours ago for the Auditor lost-bookie check and up to 7 days ago for the Auditor check of all ledgers. Replicas lost between the last check and the current time are not marked. If we set EnsembleSize=3, WriteQuorumSize=2, and AckQuorumSize=1 and decommission one bookie with the current process, some ledgers may become impossible to replicate because their only available replica is located on the decommissioned bookie.
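   
   The scenario above can be sketched with a small model. This is an illustration, not BookKeeper code: the bookie names, the `write_set` and `sole_replica_entries` helpers, and the assumption that entries are striped round-robin over the ensemble are all hypothetical and only serve to show how a single lost replica leaves an entry with its only copy on the bookie being decommissioned.

```python
# Hypothetical model, not BookKeeper code: which entries are at risk
# when a bookie is stopped for decommission.

ENSEMBLE = ["bookie-0", "bookie-1", "bookie-2"]  # EnsembleSize = 3
WRITE_QUORUM = 2                                  # WriteQuorumSize = 2


def write_set(entry_id):
    """Bookies that should hold this entry, assuming round-robin striping."""
    n = len(ENSEMBLE)
    return [ENSEMBLE[(entry_id + i) % n] for i in range(WRITE_QUORUM)]


def sole_replica_entries(stored, bookie):
    """Entries whose only surviving copy is on `bookie`."""
    return sorted(e for e, holders in stored.items() if holders == {bookie})


# Entries 0..5, all fully replicated at first.
stored = {e: set(write_set(e)) for e in range(6)}

# Assumption: the copy of entry 3 on bookie-1 went missing after the last
# Auditor check, so the ledger is not yet listed as under-replicated.
stored[3].discard("bookie-1")

# Entries that can no longer be replicated once bookie-0 is stopped.
at_risk = sole_replica_entries(stored, "bookie-0")
print(at_risk)
```

   In this model `listunderreplicated` would still report nothing for entry 3, yet stopping `bookie-0` removes its last readable copy.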
   
   What's more, by default the Auditor's check of all ledgers only reads the first and last entries of each fragment. If a bookie has journal writes disabled and loses some entries in a fragment while the first and last entries still exist, the checker won't detect the loss.
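   
   A simplified sketch of this blind spot follows. It is not the Auditor's actual sampling code: `entries_to_check` and `audit_fragment` are hypothetical helpers, and the evenly spaced picks for percentages between 0 and 100 are an assumption made to keep the example deterministic. The point is only that with a percentage of `0` the entries in the middle of a fragment are never read.

```python
# Simplified, hypothetical model of per-fragment verification.

def entries_to_check(first, last, percentage=0):
    """Entry ids to read for a fragment covering first..last (inclusive)."""
    length = last - first + 1
    count = int(length * percentage / 100)
    if count >= length:
        return set(range(first, last + 1))   # verify every entry
    picked = {first, last}                   # the two ends are always checked
    if count > 0:
        step = length // count
        picked.update(range(first, last + 1, step))  # assumed even spacing
    return picked


def audit_fragment(first, last, present, percentage=0):
    """Return the checked entries that turn out to be missing."""
    to_check = entries_to_check(first, last, percentage)
    return sorted(e for e in to_check if e not in present)


# A fragment with entries 0..9 where the bookie has lost entries 4 and 5.
present = set(range(10)) - {4, 5}

found_default = audit_fragment(0, 9, present)              # percentage = 0
found_full = audit_fragment(0, 9, present, percentage=100)
print(found_default, found_full)
```

   With the default setting the model reports no missing entries, while a full verification finds both lost entries.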
   
   ### Options
   There are two options for improving the decommission process.
   
   1. Trigger a check of all ledgers before Step 1. This has the following disadvantages:
      - It consumes a lot of resources.
      - By default it only checks the first and last entries of each fragment, so it cannot cover every entry.
   
   2. Switch the bookie to read-only mode instead of shutting it down before running `bin/bookkeeper shell decommissionbookie -bookieid <bookieaddress>` to trigger the decommission. Ledgers located on the bookie being decommissioned can then be replicated successfully as long as one replica is available.
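   
   The difference between stopping the bookie and making it read-only (option 2) can be illustrated with a small model. This is a sketch under stated assumptions, not the replication worker's logic: the ledger layout, bookie names, and the `blocked_entries` helper are hypothetical, and the model assumes only that a read-only bookie keeps serving reads.

```python
# Hypothetical model comparing a stopped bookie with a read-only one.

def blocked_entries(ledger, target, readable_bookies):
    """Entries stored on `target` that no readable bookie can serve."""
    return sorted(
        entry
        for entry, holders in ledger.items()
        if target in holders and not (holders & readable_bookies)
    )


ALL_BOOKIES = {"bookie-0", "bookie-1", "bookie-2"}
TARGET = "bookie-0"  # the bookie being decommissioned

# Entry id -> bookies that currently hold a copy.
# Entry 1 has already lost its second replica, unnoticed by the Auditor.
ledger = {
    0: {"bookie-0", "bookie-1"},
    1: {"bookie-0"},
    2: {"bookie-1", "bookie-2"},
}

# Current process: the bookie is stopped, so it cannot serve reads.
blocked_when_stopped = blocked_entries(ledger, TARGET, ALL_BOOKIES - {TARGET})

# Option 2: the bookie is read-only, so it still serves reads.
blocked_when_read_only = blocked_entries(ledger, TARGET, ALL_BOOKIES)

print(blocked_when_stopped, blocked_when_read_only)
```

   In the stopped case entry 1 blocks the decommission; in the read-only case it can still be copied from the bookie being decommissioned.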


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@bookkeeper.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Bookkeeper decommission may be blocked by ledgers that cannot be replicated [bookkeeper]

Posted by "yunbai002 (via GitHub)" <gi...@apache.org>.
yunbai002 commented on issue #3887:
URL: https://github.com/apache/bookkeeper/issues/3887#issuecomment-1823875951

   @hangc0276  Is there any plan to fix this bug?

