You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2021/03/29 19:47:26 UTC

[GitHub] [hbase] saintstack commented on a change in pull request #3067: HBASE-25625 HBASE-25625 StochasticBalancer CostFunctions needs a bett…

saintstack commented on a change in pull request #3067:
URL: https://github.com/apache/hbase/pull/3067#discussion_r603565022



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
##########
@@ -522,6 +536,51 @@ public boolean serverHasTooFewRegions(int server) {
       return numRegions < minLoad;
     }
 
+    /**
+     * Return the min skew of distribution
+     */
+    public static double getMinSkew(double total, double numServers) {
+      double mean = total / numServers;
+      // It's possible that there aren't enough regions to go around
+      double min;
+      if (numServers > total) {
+        min = ((numServers - total) * mean + (1 - mean) * total) ;
+      } else {
+        // Some will have 1 more than everything else.
+        int numHigh = (int) (total - (Math.floor(mean) * numServers));
+        int numLow = (int) (numServers - numHigh);
+        min = numHigh * (Math.ceil(mean) - mean) + numLow * (mean - Math.floor(mean));
+      }
+      return min;
+    }
+
+    /**
+     * Return the max deviation of distribution
+     * Compute max as if all region servers had 0 and one had the sum of all costs.  This must be
+     * a zero sum cost for this to make sense.
+     */
+    public static double getMaxSkew(double total, double numServers) {
+      double mean = total / numServers;
+      return (total - mean) + (numServers - 1) * mean;
+    }
+
+    /**
+     * Scale the value between 0 and 1.
+     *
+     * @param min   Min value
+     * @param max   The Max value
+     * @param value The value to be scaled.
+     * @return The scaled value.
+     */
+    public static double scale(double min, double max, double value) {
+      if (max <= min || value <= min) {
+        return 0;
+      }
+      if ((max - min) == 0) return 0;

Review comment:
       Needs parens to align w/ our styling.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
##########
@@ -834,22 +908,20 @@ void regionMoved(int region, int oldServer, int newServer) {
       int tableIndex = regionIndexToTableIndex[region];
       if (oldServer >= 0) {
         numRegionsPerServerPerTable[oldServer][tableIndex]--;
+        // update regionSkewPerTable for the move from old server
+        regionSkewByTable +=
+            Math.abs(numRegionsPerServerPerTable[oldServer][tableIndex]
+              - meanRegionsPerTable[tableIndex])
+              - Math.abs(numRegionsPerServerPerTable[oldServer][tableIndex] + 1
+              - meanRegionsPerTable[tableIndex]);
       }
       numRegionsPerServerPerTable[newServer][tableIndex]++;
-
-      //check whether this caused maxRegionsPerTable in the new Server to be updated
-      if (numRegionsPerServerPerTable[newServer][tableIndex] > numMaxRegionsPerTable[tableIndex]) {
-        numMaxRegionsPerTable[tableIndex] = numRegionsPerServerPerTable[newServer][tableIndex];
-      } else if (oldServer >= 0 && (numRegionsPerServerPerTable[oldServer][tableIndex] + 1)
-          == numMaxRegionsPerTable[tableIndex]) {
-        //recompute maxRegionsPerTable since the previous value was coming from the old server
-        numMaxRegionsPerTable[tableIndex] = 0;
-        for (int[] aNumRegionsPerServerPerTable : numRegionsPerServerPerTable) {
-          if (aNumRegionsPerServerPerTable[tableIndex] > numMaxRegionsPerTable[tableIndex]) {
-            numMaxRegionsPerTable[tableIndex] = aNumRegionsPerServerPerTable[tableIndex];
-          }
-        }
-      }
+      // update regionSkewPerTable for the move to new server
+      regionSkewByTable +=
+        Math.abs(numRegionsPerServerPerTable[newServer][tableIndex]
+          - meanRegionsPerTable[tableIndex])
+          - Math.abs(numRegionsPerServerPerTable[newServer][tableIndex] - 1
+          - meanRegionsPerTable[tableIndex]);

Review comment:
       Is this duplicated code? Can it be made a function?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
##########
@@ -816,65 +821,30 @@ protected void regionMoved(int region, int oldServer, int newServer) {
      *
      * @param stats the costs
      * @return a scaled set of costs.
-     */
-    protected double costFromArray(double[] stats) {
+     */ protected double costFromArray(double[] stats) {
       double totalCost = 0;
       double total = getSum(stats);
 
       double count = stats.length;
-      double mean = total/count;
-
-      // Compute max as if all region servers had 0 and one had the sum of all costs.  This must be
-      // a zero sum cost for this to make sense.
-      double max = ((count - 1) * mean) + (total - mean);
-
-      // It's possible that there aren't enough regions to go around
-      double min;
-      if (count > total) {
-        min = ((count - total) * mean) + ((1 - mean) * total);
-      } else {
-        // Some will have 1 more than everything else.
-        int numHigh = (int) (total - (Math.floor(mean) * count));
-        int numLow = (int) (count - numHigh);
-
-        min = (numHigh * (Math.ceil(mean) - mean)) + (numLow * (mean - Math.floor(mean)));
+      double mean = total / count;
 
-      }
-      min = Math.max(0, min);
-      for (int i=0; i<stats.length; i++) {
+      for (int i = 0; i < stats.length; i++) {
         double n = stats[i];
-        double diff = Math.abs(mean - n);
+        double diff = Math.abs(mean - n) ;

Review comment:
       Space here is not intended...
   

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
##########
@@ -522,6 +536,51 @@ public boolean serverHasTooFewRegions(int server) {
       return numRegions < minLoad;
     }
 
+    /**
+     * Return the min skew of distribution
+     */
+    public static double getMinSkew(double total, double numServers) {

Review comment:
       These methods are public because used outside this class? If not, keep them private to the class I'd say.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
##########
@@ -816,65 +821,30 @@ protected void regionMoved(int region, int oldServer, int newServer) {
      *
      * @param stats the costs
      * @return a scaled set of costs.
-     */
-    protected double costFromArray(double[] stats) {
+     */ protected double costFromArray(double[] stats) {

Review comment:
       Something happened here? Did you mean this change?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org