Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2018/07/16 02:29:01 UTC

[GitHub] sijie closed pull request #847: BP-23: Ledger Balancer (WIP)

sijie closed pull request #847: BP-23: Ledger Balancer (WIP)
URL: https://github.com/apache/bookkeeper/pull/847
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/site/bps/BP-23-ledger-rebalancer.md b/site/bps/BP-23-ledger-rebalancer.md
new file mode 100644
index 0000000000..55acc35c33
--- /dev/null
+++ b/site/bps/BP-23-ledger-rebalancer.md
@@ -0,0 +1,50 @@
+---
+title: "BP-23: ledger balancer"
+issue: https://github.com/apache/bookkeeper/846
+state: "WIP" 
+release: "x.y.z"
+---
+
+### Motivation
+
+There are two typical use cases of _Apache BookKeeper_: *Messaging/Streaming/Logging* style use cases and *Storage* style use cases.
+
+In Messaging/Streaming/Logging oriented use cases (where old ledgers/segments will most likely be deleted at some point), we don't actually need to rebalance the ledgers stored on bookies.
+
+However,
+in Storage oriented use cases (where data will most likely never be deleted), BookKeeper data might not always be placed uniformly across bookies. One common reason is the addition of new bookies to an existing cluster. This proposal introduces a balancer mechanism (as a utility, and also as part of the AutoRecovery daemon) that analyzes ledger distribution and balances ledgers across bookies.
+
+The balancer replicates ledgers to new bookies (based on resource-aware placement policies) until the cluster is deemed balanced, meaning that the disk utilization of every bookie (the ratio of used space on the node to the capacity of the node) differs from the utilization of the cluster (the ratio of used space on the cluster to the total capacity of the cluster) by no more than a given threshold percentage.
+
+The balancer will replicate ledgers away from disk-filled bookies as its first priority.
+
+### Public Interfaces
+
+There are no public API changes.
+
+Potentially we might need a new command in `BookieShell` to run the balancer.
+
+### Proposed Changes
+
+[TBD]
+
+A couple of thoughts:
+
+- it should move ledgers away from filled-up bookies only.
+- it should enforce resource-awareness in the ledger placement policy.
+- it should provide capabilities to throttle bandwidth usage.
+
+
+### Compatibility, Deprecation, and Migration Plan
+
+N/A
+
+### Test Plan
+
+[TBD]
+
+### Rejected Alternatives
+
+Manual balancer using `Recovery` tools.
+
+[TBD]
diff --git a/site/community/bookkeeper_proposals.md b/site/community/bookkeeper_proposals.md
index a918f9a9ab..8ccfef2db5 100644
--- a/site/community/bookkeeper_proposals.md
+++ b/site/community/bookkeeper_proposals.md
@@ -85,7 +85,7 @@ using Google Doc.
 
 This section lists all the _bookkeeper proposals_ made to BookKeeper.
 
-*Next Proposal Number: 23*
+*Next Proposal Number: 24*
 
 ### Inprogress
 
@@ -97,6 +97,7 @@ Proposal | State
 [BP-14 Relax durability](https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-14+Relax+durability) | Accepted
 [BP-16: Thin Client - Remove direct metadata storage access from clients](https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-16%3A+Thin+Client+-+Remove+direct+metadata+storage+access+from+clients) | Draft
 [BP-18: LedgerType, Flags and StorageHints](https://cwiki.apache.org/confluence/display/BOOKKEEPER/BP-18%3A+LedgerType%2C+Flags+and+StorageHints) | Accepted
+[BP-23: Ledger Rebalancer](../../bps/BP-23-ledger-rebalancer) | Draft
 
 
 ### Adopted


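For reference, the balance criterion described in the Motivation section of the BP-23 draft above (every bookie's disk utilization stays within a threshold of the cluster-wide utilization) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `BalanceCheck`, the type `DiskUsage`, and the method names are hypothetical and are not part of the BookKeeper API or of the proposal itself. The threshold is assumed to be an absolute difference in utilization expressed as a fraction (0.10 meaning 10 percentage points).

```java
import java.util.Map;

/** Hypothetical sketch of the BP-23 balance criterion; not BookKeeper API. */
public class BalanceCheck {

    /** Used and total disk space of one bookie, in bytes (hypothetical type). */
    public static final class DiskUsage {
        final long usedBytes;
        final long capacityBytes;

        public DiskUsage(long usedBytes, long capacityBytes) {
            this.usedBytes = usedBytes;
            this.capacityBytes = capacityBytes;
        }

        /** Ratio of used space on the node to the capacity of the node. */
        double utilization() {
            return capacityBytes == 0 ? 0.0 : (double) usedBytes / capacityBytes;
        }
    }

    /** Ratio of used space on the cluster to total capacity of the cluster. */
    public static double clusterUtilization(Map<String, DiskUsage> bookies) {
        long used = 0;
        long capacity = 0;
        for (DiskUsage usage : bookies.values()) {
            used += usage.usedBytes;
            capacity += usage.capacityBytes;
        }
        return capacity == 0 ? 0.0 : (double) used / capacity;
    }

    /**
     * Returns true when no bookie's utilization differs from the cluster
     * utilization by more than the given threshold (assumed to be a fraction).
     */
    public static boolean isBalanced(Map<String, DiskUsage> bookies, double threshold) {
        double cluster = clusterUtilization(bookies);
        for (DiskUsage usage : bookies.values()) {
            if (Math.abs(usage.utilization() - cluster) > threshold) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, a cluster with two bookies at 80% utilization and a newly added empty bookie of the same capacity would be reported as unbalanced for a 0.10 threshold, which is the situation the proposal calls out (new bookies joining an existing cluster).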
 
