Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2018/01/04 15:37:09 UTC

[GitHub] jvrao commented on a change in pull request #847: BP-23: Ledger Balancer (WIP)

URL: https://github.com/apache/bookkeeper/pull/847#discussion_r159681081
 
 

 ##########
 File path: site/bps/BP-23-ledger-rebalancer.md
 ##########
 @@ -0,0 +1,50 @@
+---
+title: "BP-23: ledger balancer"
+issue: https://github.com/apache/bookkeeper/846
+state: "WIP" 
+release: "x.y.z"
+---
+
+### Motivation
+
+There are typically two use cases for _Apache BookKeeper_: *Messaging/Streaming/Logging*-style use cases and *Storage*-style use cases.
+
+In Messaging/Streaming/Logging-oriented use cases (where old ledgers/segments will most likely be deleted at some point), we don't actually need to rebalance the ledgers stored on bookies.
+
+However,
+in Storage-oriented use cases (where data will most likely never be deleted), BookKeeper data might not always be placed uniformly across bookies. One common reason is the addition of new bookies to an existing cluster. This proposal provides a balancer mechanism (as a utility, and also as part of the AutoRecovery daemon) that analyzes ledger distribution and balances ledgers across bookies.
+
+It replicates ledgers to new bookies (based on resource-aware placement policies) until the cluster is deemed balanced, meaning that the disk utilization of every bookie (the ratio of used space on the node to the capacity of the node) differs from the utilization of the cluster (the ratio of used space on the cluster to the total capacity of the cluster) by no more than a given threshold percentage.
 
 Review comment:
   @sijie I would request that this be a separate doc once we come up with a design.
   If we are *not* changing the code, and this proposal is to create a stand-alone tool, then we can surely
   work with an md file. On the other hand, if we need to change the replication logic, it makes great sense to have
   a separate design doc.
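
For reference while discussing the design, the "balanced" criterion quoted in the diff above can be sketched roughly as follows. This is only an illustration of the threshold check as worded in the proposal text, not BookKeeper code: the `Bookie` class, its fields, and the function names are hypothetical, and treating the threshold as percentage points of utilization is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bookie:
    """Hypothetical stand-in for a bookie's disk usage; not a BookKeeper type."""
    name: str
    used_bytes: int
    capacity_bytes: int

    @property
    def utilization(self) -> float:
        # Ratio of used space on the node to the capacity of the node.
        return self.used_bytes / self.capacity_bytes


def cluster_utilization(bookies: List[Bookie]) -> float:
    # Ratio of used space on the cluster to total capacity of the cluster.
    total_used = sum(b.used_bytes for b in bookies)
    total_capacity = sum(b.capacity_bytes for b in bookies)
    return total_used / total_capacity


def unbalanced_bookies(bookies: List[Bookie], threshold_pct: float) -> List[str]:
    """Names of bookies whose utilization differs from the cluster's
    utilization by more than threshold_pct percentage points (assumed unit)."""
    target = cluster_utilization(bookies)
    return [
        b.name
        for b in bookies
        if abs(b.utilization - target) * 100 > threshold_pct
    ]


def is_balanced(bookies: List[Bookie], threshold_pct: float) -> bool:
    # The cluster is deemed balanced when no bookie exceeds the threshold.
    return not unbalanced_bookies(bookies, threshold_pct)
```

A balancer built on this check would presumably keep moving ledgers until `is_balanced` holds; how ledgers are selected and where they are placed is exactly the part that would need the separate design.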

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services