Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/07/14 00:47:36 UTC

[GitHub] [hbase] huaxiangsun commented on a change in pull request #2003: HBASE-24633 Remove data locality and StoreFileCostFunction for replic…

huaxiangsun commented on a change in pull request #2003:
URL: https://github.com/apache/hbase/pull/2003#discussion_r454031142



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
##########
@@ -1462,8 +1473,14 @@ protected double getCostFromRl(BalancerRegionLoad rl) {
     }
 
     @Override
-    protected double getCostFromRl(BalancerRegionLoad rl) {
-      return rl.getStorefileSizeMB();
+    protected double getCostFromRl(BalancerRegionLoad rl, boolean isPrimaryRegion) {
+      // Do not count replica region's file size, as replica regions serve very little
+      // read requests, this may be changed if there are enough data from production showing

Review comment:
       As I wrote in the comments, all of these factors really do impact system performance. In the stats from one of our production clusters, < 0.01% of requests go to replica regions, which means most of those regions are cold on the region servers. That is why I want to remove these factors from the balancer. I agree that things could be different for other users, so making this configurable makes more sense. If it is OK with you, I would like to drop this change from this patch and create a separate issue to track it, probably with a test case as well.
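
       For the follow-up issue, here is a rough, self-contained sketch of what "configurable" could look like. It is only an illustration, not the patch: the class name, the configuration key, and the plain Map standing in for the HBase Configuration are all hypothetical, and the default is assumed to keep today's behavior (replica store files are counted).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; this is not StochasticLoadBalancer code.
public class ReplicaStoreFileCostSketch {
  // Hypothetical key, not an existing HBase property.
  public static final String COUNT_REPLICA_KEY =
      "example.balancer.storefile.cost.count.replicas";

  private final boolean countReplicaStoreFiles;

  // A plain Map stands in for the real configuration object (assumption).
  public ReplicaStoreFileCostSketch(Map<String, String> conf) {
    // Assumed default "true": replica regions are counted unless the operator opts out.
    this.countReplicaStoreFiles =
        Boolean.parseBoolean(conf.getOrDefault(COUNT_REPLICA_KEY, "true"));
  }

  // Mirrors the shape of getCostFromRl(rl, isPrimaryRegion) in the diff above,
  // but takes the store file size directly so the sketch has no HBase dependency.
  public double getCost(double storefileSizeMB, boolean isPrimaryRegion) {
    if (!isPrimaryRegion && !countReplicaStoreFiles) {
      // Operator opted out: a replica region's store files add nothing to the cost.
      return 0.0;
    }
    return storefileSizeMB;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(COUNT_REPLICA_KEY, "false");
    ReplicaStoreFileCostSketch cost = new ReplicaStoreFileCostSketch(conf);
    System.out.println("primary cost: " + cost.getCost(128.0, true));
    System.out.println("replica cost: " + cost.getCost(128.0, false));
  }
}
```

       With the flag left unset, primary and replica regions are costed the same, so clusters that do serve meaningful replica reads would see no change; only clusters that set it to false would drop replica store file size from the balancer cost.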




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org