You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2019/12/31 07:09:16 UTC

[GitHub] [hbase] mnpoonia opened a new pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

mnpoonia opened a new pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978
 
 
   …t regions

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-569888377
 
 
   @wchevreuil FYI, this is PR for master.
   branch-1 PR: https://github.com/apache/hbase/pull/931

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363075429
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  /**
+   * @param hri regioninfo
+   * @return size of region in MB and if region is not found than -1
+   */
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn =
+        masterServices.getAssignmentManager().getRegionStates().getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad =
+        masterServices.getServerManager().getLoad(sn).getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug("{} was not found in RegionsLoad", hri.getRegionNameAsString());
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+          .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
 
 Review comment:
   Yes. This is intellij vs eclipse formatting fighting on my machine. 😄 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362259077
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
 
 Review comment:
   s/Avg/Average/

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362161876
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
+  }
+
+  @Override
+  public void planSkipped(RegionInfo hri, NormalizationPlan.PlanType type) {
+    skippedCount[type.ordinal()]++;
+  }
+
+  @Override
+  public long getSkippedCount(NormalizationPlan.PlanType type) {
+    return skippedCount[type.ordinal()];
+  }
+
+  @Override
+  public List<NormalizationPlan> computePlanForTable(TableName table) throws HBaseIOException {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    if (!shouldNormalize(table)) {
+      return null;
+    }
+    // atleast one of the two regions should be older than MIN_REGION_DURATION days
+    List<NormalizationPlan> normalizationPlans = getMergeNormalizationPlan(table);
+    for (NormalizationPlan plan : normalizationPlans) {
+      if (plan instanceof MergeNormalizationPlan) {
+        RegionInfo hri = ((MergeNormalizationPlan) plan).getFirstRegion();
+        RegionInfo hri2 = ((MergeNormalizationPlan) plan).getSecondRegion();
+        if (isOldEnoughToMerge(hri) || isOldEnoughToMerge(hri2)) {
+          plans.add(plan);
+        }
+      }
+    }
+    if (plans.isEmpty()) {
+      LOG.debug("No normalization needed, regions look good for table: {}", table);
+      return null;
+    }
+    return plans;
+  }
+
+  private boolean isOldEnoughToMerge(RegionInfo hri) {
+    Timestamp currentTime = new Timestamp(System.currentTimeMillis());
+    Timestamp hriTime = new Timestamp(hri.getRegionId());
+    boolean isOld =
+      new Timestamp(hriTime.getTime() + TimeUnit.DAYS.toMillis(getMinimumDurationBeforeMerge()))
+        .before(currentTime);
+    return isOld;
+  }
+
+  private boolean shouldNormalize(TableName table) {
+    boolean normalize = false;
+    if (table == null || table.isSystemTable()) {
+      LOG.debug("Normalization of system table {} isn't allowed", table);
 
 Review comment:
   table here would not be null here(only caller: HMaster), and we can remove null check from this if condition. Even if null was possible, log message would not be appropriate: `Normalization of system table null isn't allowed`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362306568
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
+          regionSize);
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+
+  /**
+   * Computes the split plans that should be executed for this table to converge
+   * average region size towards target average or target region count
+   *
+   * @param table
 
 Review comment:
   for each @param, we need to write brief description, else QA build checkstyle would fail

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363658766
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
 
 Review comment:
   I think it is not possible unless we move the initialization of these two in constructor.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362306164
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -132,18 +107,6 @@ public int compare(NormalizationPlan plan1, NormalizationPlan plan2) {
       return null;
     }
     boolean splitEnabled = true, mergeEnabled = true;
-    try {
-      splitEnabled = masterRpcServices.isSplitOrMergeEnabled(null,
-        RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT)).getEnabled();
-    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
-      LOG.debug("Unable to determine whether split is enabled", e);
-    }
-    try {
-      mergeEnabled = masterRpcServices.isSplitOrMergeEnabled(null,
-        RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE)).getEnabled();
-    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
-      LOG.debug("Unable to determine whether merge is enabled", e);
-    }
 
 Review comment:
   We have removed above block but may be missed using them.
   ```
       boolean splitEnabled = isSplitEnabled();
       boolean mergeEnabled = isMergeEnabled();
   ```
   Without this, by default splitEnabled and mergeEnabled are always true

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363660079
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -18,73 +18,45 @@
  */
 package org.apache.hadoop.hbase.master.normalizer;
 
-import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.Comparator;
 import java.util.List;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HBaseIOException;
-import org.apache.hadoop.hbase.RegionMetrics;
-import org.apache.hadoop.hbase.ServerName;
-import org.apache.hadoop.hbase.Size;
 import org.apache.hadoop.hbase.TableName;
-import org.apache.hadoop.hbase.client.MasterSwitchType;
 import org.apache.hadoop.hbase.client.RegionInfo;
-import org.apache.hadoop.hbase.client.TableDescriptor;
-import org.apache.hadoop.hbase.master.MasterRpcServices;
-import org.apache.hadoop.hbase.master.MasterServices;
 import org.apache.hadoop.hbase.master.normalizer.NormalizationPlan.PlanType;
 import org.apache.yetus.audience.InterfaceAudience;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
-
 /**
- * Simple implementation of region normalizer.
- *
- * Logic in use:
- *
- *  <ol>
- *  <li> Get all regions of a given table
- *  <li> Get avg size S of each region (by total size of store files reported in RegionMetrics)
- *  <li> Seek every single region one by one. If a region R0 is bigger than S * 2, it is
- *  kindly requested to split. Thereon evaluate the next region R1
- *  <li> Otherwise, if R0 + R1 is smaller than S, R0 and R1 are kindly requested to merge.
- *  Thereon evaluate the next region R2
- *  <li> Otherwise, R1 is evaluated
+ * Simple implementation of region normalizer. Logic in use:
+ * <ol>
+ * <li>Get all regions of a given table
+ * <li>Get avg size S of each region (by total size of store files reported in RegionMetrics)
+ * <li>Seek every single region one by one. If a region R0 is bigger than S * 2, it is kindly
+ * requested to split. Thereon evaluate the next region R1
+ * <li>Otherwise, if R0 + R1 is smaller than S, R0 and R1 are kindly requested to merge. Thereon
+ * evaluate the next region R2
+ * <li>Otherwise, R1 is evaluated
  * </ol>
  * <p>
- * Region sizes are coarse and approximate on the order of megabytes. Additionally,
- * "empty" regions (less than 1MB, with the previous note) are not merged away. This
- * is by design to prevent normalization from undoing the pre-splitting of a table.
+ * Region sizes are coarse and approximate on the order of megabytes. Additionally, "empty" regions
+ * (less than 1MB, with the previous note) are not merged away. This is by design to prevent
+ * normalization from undoing the pre-splitting of a table.
 
 Review comment:
   I think i explained individual normalizer but didn't highlight how they are different. Let me add that. Thanks

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r364901954
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
 
 Review comment:
   In initializeZKBasedSystemTrackers(), these setters are used after object is created using Factory pattern with `RegionNormalizerFactory.getRegionNormalizer(conf)`. Hence, IMO these setters should be fine instead of constructors.
   However, for simplicity, single method might be sufficient to set `masterServices` and `masterRpcServices`. Again, it's a choice, setting individual instance var is what setters do anyways.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362161876
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
+  }
+
+  @Override
+  public void planSkipped(RegionInfo hri, NormalizationPlan.PlanType type) {
+    skippedCount[type.ordinal()]++;
+  }
+
+  @Override
+  public long getSkippedCount(NormalizationPlan.PlanType type) {
+    return skippedCount[type.ordinal()];
+  }
+
+  @Override
+  public List<NormalizationPlan> computePlanForTable(TableName table) throws HBaseIOException {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    if (!shouldNormalize(table)) {
+      return null;
+    }
+    // atleast one of the two regions should be older than MIN_REGION_DURATION days
+    List<NormalizationPlan> normalizationPlans = getMergeNormalizationPlan(table);
+    for (NormalizationPlan plan : normalizationPlans) {
+      if (plan instanceof MergeNormalizationPlan) {
+        RegionInfo hri = ((MergeNormalizationPlan) plan).getFirstRegion();
+        RegionInfo hri2 = ((MergeNormalizationPlan) plan).getSecondRegion();
+        if (isOldEnoughToMerge(hri) || isOldEnoughToMerge(hri2)) {
+          plans.add(plan);
+        }
+      }
+    }
+    if (plans.isEmpty()) {
+      LOG.debug("No normalization needed, regions look good for table: {}", table);
+      return null;
+    }
+    return plans;
+  }
+
+  private boolean isOldEnoughToMerge(RegionInfo hri) {
+    Timestamp currentTime = new Timestamp(System.currentTimeMillis());
+    Timestamp hriTime = new Timestamp(hri.getRegionId());
+    boolean isOld =
+      new Timestamp(hriTime.getTime() + TimeUnit.DAYS.toMillis(getMinimumDurationBeforeMerge()))
+        .before(currentTime);
+    return isOld;
+  }
+
+  private boolean shouldNormalize(TableName table) {
+    boolean normalize = false;
+    if (table == null || table.isSystemTable()) {
+      LOG.debug("Normalization of system table {} isn't allowed", table);
 
 Review comment:
   table here would not be null here(only caller: HMaster), and we can remove null check from this if condition. Even if null was possible, log message would not be appropriate: `Normalization of system table null isn't allowed`
   `table == null` can be removed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362259797
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
 
 Review comment:
   Any speculation on how this algorithm will perform out in the wild? What do we expect its behavior to be generally?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363026058
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
 
 Review comment:
   @mnpoonia you might want to push this in your next commit

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362260305
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
 
 Review comment:
   Does this config need the name of the implementation in it? hbase.normalizer.merge.min.region.count ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363604582
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
 
 Review comment:
   Do we have to have these two methods?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362257513
 
 

 ##########
 File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
 ##########
 @@ -158,6 +158,12 @@
   public static final String HBASE_MASTER_NORMALIZER_CLASS =
     "hbase.master.normalizer.class";
 
+  /** Config for min age of region before being considerded for merge in mormalizer */
+  public static final int DEFAULT_MIN_DAYS_BEFORE_MERGE = 3;
+
+  public static final String HBASE_MASTER_DAYS_BEFORE_MERGE =
+      "hbase.master.normalize.daysBeforeMerge";
+
 
 Review comment:
   These defines are used in one place only. They don't need to be in this global file of defines. Only stuff that is used in multple places should go here. FYI.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363026140
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -131,108 +102,38 @@ public int compare(NormalizationPlan plan1, NormalizationPlan plan2) {
       LOG.debug("Normalization of system table " + table + " isn't allowed");
       return null;
     }
-    boolean splitEnabled = true, mergeEnabled = true;
-    try {
-      splitEnabled = masterRpcServices.isSplitOrMergeEnabled(null,
-        RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT)).getEnabled();
-    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
-      LOG.debug("Unable to determine whether split is enabled", e);
-    }
-    try {
-      mergeEnabled = masterRpcServices.isSplitOrMergeEnabled(null,
-        RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE)).getEnabled();
-    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
-      LOG.debug("Unable to determine whether merge is enabled", e);
-    }
+    boolean splitEnabled = isSplitEnabled();
+    boolean mergeEnabled = isMergeEnabled();
     if (!mergeEnabled && !splitEnabled) {
       LOG.debug("Both split and merge are disabled for table: " + table);
       return null;
     }
     List<NormalizationPlan> plans = new ArrayList<>();
-    List<RegionInfo> tableRegions = masterServices.getAssignmentManager().getRegionStates().
-      getRegionsOfTable(table);
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
 
-    //TODO: should we make min number of regions a config param?
     if (tableRegions == null || tableRegions.size() < minRegionCount) {
       int nrRegions = tableRegions == null ? 0 : tableRegions.size();
       LOG.debug("Table " + table + " has " + nrRegions + " regions, required min number"
-        + " of regions for normalizer to run is " + minRegionCount + ", not running normalizer");
+          + " of regions for normalizer to run is " + minRegionCount + ", not running normalizer");
       return null;
     }
 
-    LOG.debug("Computing normalization plan for table: " + table +
-      ", number of regions: " + tableRegions.size());
-
-    long totalSizeMb = 0;
-    int acutalRegionCnt = 0;
+    LOG.debug("Computing normalization plan for table: " + table + ", number of regions: "
+        + tableRegions.size());
 
 Review comment:
   {} can be used

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362385208
 
 

 ##########
 File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
 ##########
 @@ -158,6 +158,12 @@
   public static final String HBASE_MASTER_NORMALIZER_CLASS =
     "hbase.master.normalizer.class";
 
+  /** Config for min age of region before being considerded for merge in mormalizer */
+  public static final int DEFAULT_MIN_DAYS_BEFORE_MERGE = 3;
+
+  public static final String HBASE_MASTER_DAYS_BEFORE_MERGE =
+      "hbase.master.normalize.daysBeforeMerge";
+
 
 Review comment:
   I think we can move both these constants to MergeNormalizer. If we move to hbase-default, we don't need this constant `DEFAULT_MIN_DAYS_BEFORE_MERGE` because `hbase.master.normalize.daysBeforeMerge` will always have value, either from hbase-default or hbase-site

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362258825
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
 
 Review comment:
   Needs javadoc to say what the return is? That its size in megabytes?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r364879260
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -18,73 +18,45 @@
  */
 package org.apache.hadoop.hbase.master.normalizer;
 
-import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.Comparator;
 import java.util.List;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HBaseIOException;
-import org.apache.hadoop.hbase.RegionMetrics;
-import org.apache.hadoop.hbase.ServerName;
-import org.apache.hadoop.hbase.Size;
 import org.apache.hadoop.hbase.TableName;
-import org.apache.hadoop.hbase.client.MasterSwitchType;
 import org.apache.hadoop.hbase.client.RegionInfo;
-import org.apache.hadoop.hbase.client.TableDescriptor;
-import org.apache.hadoop.hbase.master.MasterRpcServices;
-import org.apache.hadoop.hbase.master.MasterServices;
 import org.apache.hadoop.hbase.master.normalizer.NormalizationPlan.PlanType;
 import org.apache.yetus.audience.InterfaceAudience;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
-
 /**
- * Simple implementation of region normalizer.
- *
- * Logic in use:
- *
- *  <ol>
- *  <li> Get all regions of a given table
- *  <li> Get avg size S of each region (by total size of store files reported in RegionMetrics)
- *  <li> Seek every single region one by one. If a region R0 is bigger than S * 2, it is
- *  kindly requested to split. Thereon evaluate the next region R1
- *  <li> Otherwise, if R0 + R1 is smaller than S, R0 and R1 are kindly requested to merge.
- *  Thereon evaluate the next region R2
- *  <li> Otherwise, R1 is evaluated
+ * Simple implementation of region normalizer. Logic in use:
+ * <ol>
+ * <li>Get all regions of a given table
+ * <li>Get avg size S of each region (by total size of store files reported in RegionMetrics)
+ * <li>Seek every single region one by one. If a region R0 is bigger than S * 2, it is kindly
+ * requested to split. Thereon evaluate the next region R1
+ * <li>Otherwise, if R0 + R1 is smaller than S, R0 and R1 are kindly requested to merge. Thereon
+ * evaluate the next region R2
+ * <li>Otherwise, R1 is evaluated
  * </ol>
  * <p>
- * Region sizes are coarse and approximate on the order of megabytes. Additionally,
- * "empty" regions (less than 1MB, with the previous note) are not merged away. This
- * is by design to prevent normalization from undoing the pre-splitting of a table.
+ * Region sizes are coarse and approximate on the order of megabytes. Additionally, "empty" regions
+ * (less than 1MB, with the previous note) are not merged away. This is by design to prevent
+ * normalization from undoing the pre-splitting of a table.
 
 Review comment:
   Added the comments explaining how it is different

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani edited a comment on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani edited a comment on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-569888377
 
 
   @wchevreuil FYI, this is PR for master.
   branch-1 PR: https://github.com/apache/hbase/pull/931 (with initial discussions)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362306472
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
 
 Review comment:
   Also, value `3` here can be replaced with `MIN_REGION_COUNT`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363604328
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
 
 Review comment:
   Ditto

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363195302
 
 

 ##########
 File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
 ##########
 @@ -158,6 +158,12 @@
   public static final String HBASE_MASTER_NORMALIZER_CLASS =
     "hbase.master.normalizer.class";
 
 Review comment:
   Yes. That's right.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-571032343
 
 
   @saintstack sir, any suggestions from your side?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362161876
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
+  }
+
+  @Override
+  public void planSkipped(RegionInfo hri, NormalizationPlan.PlanType type) {
+    skippedCount[type.ordinal()]++;
+  }
+
+  @Override
+  public long getSkippedCount(NormalizationPlan.PlanType type) {
+    return skippedCount[type.ordinal()];
+  }
+
+  @Override
+  public List<NormalizationPlan> computePlanForTable(TableName table) throws HBaseIOException {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    if (!shouldNormalize(table)) {
+      return null;
+    }
+    // atleast one of the two regions should be older than MIN_REGION_DURATION days
+    List<NormalizationPlan> normalizationPlans = getMergeNormalizationPlan(table);
+    for (NormalizationPlan plan : normalizationPlans) {
+      if (plan instanceof MergeNormalizationPlan) {
+        RegionInfo hri = ((MergeNormalizationPlan) plan).getFirstRegion();
+        RegionInfo hri2 = ((MergeNormalizationPlan) plan).getSecondRegion();
+        if (isOldEnoughToMerge(hri) || isOldEnoughToMerge(hri2)) {
+          plans.add(plan);
+        }
+      }
+    }
+    if (plans.isEmpty()) {
+      LOG.debug("No normalization needed, regions look good for table: {}", table);
+      return null;
+    }
+    return plans;
+  }
+
+  private boolean isOldEnoughToMerge(RegionInfo hri) {
+    Timestamp currentTime = new Timestamp(System.currentTimeMillis());
+    Timestamp hriTime = new Timestamp(hri.getRegionId());
+    boolean isOld =
+      new Timestamp(hriTime.getTime() + TimeUnit.DAYS.toMillis(getMinimumDurationBeforeMerge()))
+        .before(currentTime);
+    return isOld;
+  }
+
+  private boolean shouldNormalize(TableName table) {
+    boolean normalize = false;
+    if (table == null || table.isSystemTable()) {
+      LOG.debug("Normalization of system table {} isn't allowed", table);
 
 Review comment:
   table here would not be null(HMaster always calls it with non-null value), and we can remove null check from this if condition. Even if null was possible, log message would not be appropriate: `Normalization of system table null isn't allowed`
   
   `table == null` can be removed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362259283
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
 
 Review comment:
   Put all into one line instead of two.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362380543
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
+          regionSize);
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+
+  /**
+   * Computes the split plans that should be executed for this table to converge
+   * average region size towards target average or target region count
+   *
+   * @param table
 
 Review comment:
   Yes. learned it the hard way. 😄 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362609834
 
 

 ##########
 File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
 ##########
 @@ -158,6 +158,12 @@
   public static final String HBASE_MASTER_NORMALIZER_CLASS =
     "hbase.master.normalizer.class";
 
+  /** Config for min age of region before being considerded for merge in mormalizer */
+  public static final int DEFAULT_MIN_DAYS_BEFORE_MERGE = 3;
+
+  public static final String HBASE_MASTER_DAYS_BEFORE_MERGE =
+      "hbase.master.normalize.daysBeforeMerge";
+
 
 Review comment:
   If particular to an implemenation, yeah, move to the implementation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362380577
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
 
 Review comment:
   Coming up with new patch.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362380234
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -241,16 +147,4 @@ public int compare(NormalizationPlan plan1, NormalizationPlan plan2) {
     Collections.sort(plans, planComparator);
     return plans;
   }
-
-  private long getRegionSize(RegionInfo hri) {
-    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
-      getRegionServerOfRegion(hri);
-    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
-      getRegionMetrics().get(hri.getRegionName());
-    if (regionLoad == null) {
-      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
-      return -1;
-    }
-    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
-  }
 
 Review comment:
   Thanks for the review sir. Currently working on test with valid test cases of merge

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362161452
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
 
 Review comment:
   we can get rid of this if condition with `while( candidateIdx < tableRegions.size()-1 )`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570199247
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 43s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 37s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   7m 19s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 59s |  master passed  |
   | +0 :ok: |  refguide  |   6m 25s |  branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  4s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  master passed  |
   | +0 :ok: |  spotbugs  |   4m 57s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 43s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   2m 29s |  root in the patch failed.  |
   | -1 :x: |  compile  |   0m 16s |  hbase-server in the patch failed.  |
   | -1 :x: |  javac  |   0m 16s |  hbase-server in the patch failed.  |
   | -1 :x: |  checkstyle  |   1m 24s |  hbase-server: The patch generated 7 new + 0 unchanged - 0 fixed = 7 total (was 0)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +0 :ok: |  refguide  |   6m 21s |  patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | -1 :x: |  shadedjars  |   3m 32s |  patch has 10 errors when building our shaded downstream artifacts.  |
   | -1 :x: |  hadoopcheck  |   1m 43s |  The patch causes 10 errors with Hadoop v2.8.5.  |
   | -1 :x: |  hadoopcheck  |   3m 30s |  The patch causes 10 errors with Hadoop v2.9.2.  |
   | -1 :x: |  hadoopcheck  |   5m 19s |  The patch causes 10 errors with Hadoop v3.1.2.  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  the patch passed  |
   | -1 :x: |  findbugs  |   0m 15s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m  5s |  hbase-common in the patch passed.  |
   | -1 :x: |  unit  |   0m 17s |  hbase-server in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 19s |  The patch does not generate ASF License warnings.  |
   |  |   |  59m  4s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml |
   | uname | Linux e0127176a5d8 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / 2a4bd0574b |
   | Default Java | 1.8.0_181 |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/branch-site/book.html |
   | mvninstall | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-mvninstall-root.txt |
   | compile | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-compile-hbase-server.txt |
   | javac | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-compile-hbase-server.txt |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/diff-checkstyle-hbase-server.txt |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-site/book.html |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-shadedjars.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-javac-2.8.5.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-javac-2.9.2.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-javac-3.1.2.txt |
   | findbugs | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-findbugs-hbase-server.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/artifact/out/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/testReport/ |
   | Max. process+thread count | 254 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/2/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362609462
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
+          regionSize);
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+
+  /**
+   * Computes the split plans that should be executed for this table to converge
+   * average region size towards target average or target region count
+   *
+   * @param table
+   * @return list of split normalization plans
+   */
+  protected List<NormalizationPlan> getSplitNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      // if the region is > 2 times larger than average, we split it, split
+      // is more high priority normalization action than merge.
+      if (regionSize > 2 * avgRegionSize) {
+        LOG.info("Table {}, large region {} has size {}, more than twice avg size, splitting",
+          table, hri.getRegionNameAsString(), regionSize);
+        plans.add(new SplitNormalizationPlan(hri, null));
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+}
 
 Review comment:
   Smile. So you and I disagree with @wchevreuil . Smile! Ok then, I'm glad you followed Wellington's suggestion.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362260843
 
 

 ##########
 File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
 ##########
 @@ -158,6 +158,12 @@
   public static final String HBASE_MASTER_NORMALIZER_CLASS =
     "hbase.master.normalizer.class";
 
 Review comment:
   If I want the new class as my normalizer, I'd set this?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r364928690
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,143 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Considering the split policy takes care of splitting region we also want a way to merge when
+ * regions are too small. It is little different than what
+ * {@link org.apache.hadoop.hbase.master.normalizer.SimpleRegionNormalizer} does. Instead of doing
 
 Review comment:
   Good

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363604303
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
 
 Review comment:
   This can't be set in constructor?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r364928156
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
 
 Review comment:
   ok @mnpoonia .  Good enough for now I'd say. Not a problem of your making.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570785307
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  5s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m 51s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 56s |  master passed  |
   | +0 :ok: |  refguide  |   6m 25s |  branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  5s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  master passed  |
   | +0 :ok: |  spotbugs  |   5m  2s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 48s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   5m 30s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 21s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 26s |  hbase-server: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +0 :ok: |  refguide  |   6m 19s |  patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  2s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  hadoopcheck  |  17m 32s |  Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   5m 58s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m  9s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 156m 50s |  hbase-server in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  The patch does not generate ASF License warnings.  |
   |  |   | 241m 51s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit refguide xml spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 1a6e3fc807d0 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / ab9766599d |
   | Default Java | 1.8.0_181 |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/5/artifact/out/branch-site/book.html |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/5/artifact/out/diff-checkstyle-hbase-server.txt |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/5/artifact/out/patch-site/book.html |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/5/testReport/ |
   | Max. process+thread count | 5356 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/5/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363026100
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  /**
+   * @param hri regioninfo
+   * @return size of region in MB and if region is not found than -1
+   */
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn =
+        masterServices.getAssignmentManager().getRegionStates().getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad =
+        masterServices.getServerManager().getLoad(sn).getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug("{} was not found in RegionsLoad", hri.getRegionNameAsString());
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+          .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
 
 Review comment:
   we have already imported this

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362258437
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
 
 Review comment:
   Would be sweet if we didn't have to have an Interface and then an abstract class and then implementations. Why the Interface if all is internal and we are doing an abstract class anyways?
   
   Should this class be called AbstractRegionNormalizer insead of BaseNormalizer?a

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362260797
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
 
 Review comment:
   How does this class differ from SimpleRegionNormalizer's operation? Should say so here in class comment?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570905387
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  5s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m 47s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 59s |  master passed  |
   | +0 :ok: |  refguide  |   6m 23s |  branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  1s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  master passed  |
   | +0 :ok: |  spotbugs  |   4m 42s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 29s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   5m 25s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 20s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 54s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +0 :ok: |  refguide  |   6m 21s |  patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  0s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  hadoopcheck  |  17m 18s |  Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   5m 51s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m  6s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 159m 27s |  hbase-server in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  The patch does not generate ASF License warnings.  |
   |  |   | 243m 16s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit refguide xml spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 5986404dc310 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / ab9766599d |
   | Default Java | 1.8.0_181 |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/6/artifact/out/branch-site/book.html |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/6/artifact/out/patch-site/book.html |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/6/testReport/ |
   | Max. process+thread count | 5418 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/6/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362259025
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
 
 Review comment:
   Class needs audience annotation? Private?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362306280
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
+  }
+
+  @Override
+  public void planSkipped(RegionInfo hri, NormalizationPlan.PlanType type) {
+    skippedCount[type.ordinal()]++;
+  }
+
+  @Override
+  public long getSkippedCount(NormalizationPlan.PlanType type) {
+    return skippedCount[type.ordinal()];
+  }
+
+  @Override
+  public List<NormalizationPlan> computePlanForTable(TableName table) throws HBaseIOException {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    if (!shouldNormalize(table)) {
+      return null;
+    }
+    // atleast one of the two regions should be older than MIN_REGION_DURATION days
+    List<NormalizationPlan> normalizationPlans = getMergeNormalizationPlan(table);
+    for (NormalizationPlan plan : normalizationPlans) {
+      if (plan instanceof MergeNormalizationPlan) {
+        RegionInfo hri = ((MergeNormalizationPlan) plan).getFirstRegion();
+        RegionInfo hri2 = ((MergeNormalizationPlan) plan).getSecondRegion();
+        if (isOldEnoughToMerge(hri) || isOldEnoughToMerge(hri2)) {
+          plans.add(plan);
+        }
+      }
+    }
+    if (plans.isEmpty()) {
+      LOG.debug("No normalization needed, regions look good for table: {}", table);
+      return null;
+    }
+    return plans;
+  }
+
+  private boolean isOldEnoughToMerge(RegionInfo hri) {
+    Timestamp currentTime = new Timestamp(System.currentTimeMillis());
+    Timestamp hriTime = new Timestamp(hri.getRegionId());
+    boolean isOld =
+      new Timestamp(hriTime.getTime() + TimeUnit.DAYS.toMillis(getMinimumDurationBeforeMerge()))
+        .before(currentTime);
+    return isOld;
+  }
+
+  private boolean shouldNormalize(TableName table) {
+    boolean normalize = false;
+    if (table == null || table.isSystemTable()) {
+      LOG.debug("Normalization of system table {} isn't allowed", table);
+    } else if (!isMergeEnabled()) {
+      LOG.debug("Merge disabled for table: {}", table);
+    } else {
+      List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+      if (tableRegions == null || tableRegions.size() < MIN_REGION_COUNT) {
+        int nrRegions = tableRegions == null ? 0 : tableRegions.size();
+        LOG.debug(
+          "Table {} has {} regions, required min number of regions for normalizer to run is {} , "
+            + "not running normalizer",
+          table, nrRegions, MIN_REGION_COUNT);
+      } else {
+        normalize = true;
+      }
+    }
+    return normalize;
+  }
+
+  private int getMinimumDurationBeforeMerge() {
+    int minDuration = masterServices.getConfiguration().getInt(
+      HConstants.HBASE_MASTER_DAYS_BEFORE_MERGE, HConstants.DEFAULT_MIN_DAYS_BEFORE_MERGE);
+    return minDuration;
 
 Review comment:
   we can directly return:
   ```
       return masterServices.getConfiguration().getInt(
         HConstants.HBASE_MASTER_DAYS_BEFORE_MERGE, HConstants.DEFAULT_MIN_DAYS_BEFORE_MERGE);
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-569885411
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 11s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | -0 :warning: |  test4tests  |   0m  0s |  The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 42s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m 43s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 57s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m  7s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  master passed  |
   | +0 :ok: |  spotbugs  |   4m 42s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 28s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   2m 59s |  root in the patch failed.  |
   | +1 :green_heart: |  compile  |   1m 20s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 20s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 27s |  hbase-server: The patch generated 7 new + 0 unchanged - 0 fixed = 7 total (was 0)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | -1 :x: |  shadedjars  |   3m 58s |  patch has 11 errors when building our shaded downstream artifacts.  |
   | -1 :x: |  hadoopcheck  |   2m 17s |  The patch causes 11 errors with Hadoop v2.8.5.  |
   | -1 :x: |  hadoopcheck  |   4m 34s |  The patch causes 11 errors with Hadoop v2.9.2.  |
   | -1 :x: |  hadoopcheck  |   6m 54s |  The patch causes 11 errors with Hadoop v3.1.2.  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   5m 43s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m  5s |  hbase-common in the patch passed.  |
   | -1 :x: |  unit  |   0m 59s |  hbase-server in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 20s |  The patch does not generate ASF License warnings.  |
   |  |   |  52m 12s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 19887a464cec 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / 06eff551c3 |
   | Default Java | 1.8.0_181 |
   | mvninstall | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/patch-mvninstall-root.txt |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/diff-checkstyle-hbase-server.txt |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/patch-shadedjars.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/patch-javac-2.8.5.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/patch-javac-2.9.2.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/patch-javac-3.1.2.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/artifact/out/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/testReport/ |
   | Max. process+thread count | 254 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/1/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362257639
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -241,16 +147,4 @@ public int compare(NormalizationPlan plan1, NormalizationPlan plan2) {
     Collections.sort(plans, planComparator);
     return plans;
   }
-
-  private long getRegionSize(RegionInfo hri) {
-    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
-      getRegionServerOfRegion(hri);
-    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
-      getRegionMetrics().get(hri.getRegionName());
-    if (regionLoad == null) {
-      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
-      return -1;
-    }
-    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
-  }
 
 Review comment:
   Any chance of a test to prove the new normalizer is doing as advertised? Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362161526
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
 
 Review comment:
   Instead of regionId, may be good to print region name with `hri.getRegionNameAsString()`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363016055
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
+  }
+
+  @Override
+  public void planSkipped(RegionInfo hri, NormalizationPlan.PlanType type) {
+    skippedCount[type.ordinal()]++;
+  }
+
+  @Override
+  public long getSkippedCount(NormalizationPlan.PlanType type) {
+    return skippedCount[type.ordinal()];
+  }
+
+  @Override
+  public List<NormalizationPlan> computePlanForTable(TableName table) throws HBaseIOException {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    if (!shouldNormalize(table)) {
+      return null;
+    }
+    // atleast one of the two regions should be older than MIN_REGION_DURATION days
+    List<NormalizationPlan> normalizationPlans = getMergeNormalizationPlan(table);
+    for (NormalizationPlan plan : normalizationPlans) {
+      if (plan instanceof MergeNormalizationPlan) {
+        RegionInfo hri = ((MergeNormalizationPlan) plan).getFirstRegion();
+        RegionInfo hri2 = ((MergeNormalizationPlan) plan).getSecondRegion();
+        if (isOldEnoughToMerge(hri) || isOldEnoughToMerge(hri2)) {
+          plans.add(plan);
+        }
+      }
+    }
+    if (plans.isEmpty()) {
+      LOG.debug("No normalization needed, regions look good for table: {}", table);
+      return null;
+    }
+    return plans;
+  }
+
+  private boolean isOldEnoughToMerge(RegionInfo hri) {
+    Timestamp currentTime = new Timestamp(System.currentTimeMillis());
+    Timestamp hriTime = new Timestamp(hri.getRegionId());
+    boolean isOld =
+      new Timestamp(hriTime.getTime() + TimeUnit.DAYS.toMillis(getMinimumDurationBeforeMerge()))
+        .before(currentTime);
+    return isOld;
+  }
+
+  private boolean shouldNormalize(TableName table) {
+    boolean normalize = false;
+    if (table == null || table.isSystemTable()) {
+      LOG.debug("Normalization of system table {} isn't allowed", table);
 
 Review comment:
   Agreed. But we don't know why someone put it there in first place. SO better to keep it. We can remove it later.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570756563
 
 
   > How does this relate to #931 ? Different branches? Thanks.
   Yes . That is for branch-1. And they are bit different so created two PR

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-569878489
 
 
   @virajjasani - FYI

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362380501
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
+          regionSize);
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+
+  /**
+   * Computes the split plans that should be executed for this table to converge
+   * average region size towards target average or target region count
+   *
+   * @param table
+   * @return list of split normalization plans
+   */
+  protected List<NormalizationPlan> getSplitNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      // if the region is > 2 times larger than average, we split it, split
+      // is more high priority normalization action than merge.
+      if (regionSize > 2 * avgRegionSize) {
+        LOG.info("Table {}, large region {} has size {}, more than twice avg size, splitting",
+          table, hri.getRegionNameAsString(), regionSize);
+        plans.add(new SplitNormalizationPlan(hri, null));
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+}
 
 Review comment:
   My first thought was same while working on branch-1 patch. See https://github.com/apache/hbase/pull/931#issuecomment-566537250

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363656264
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
 
 Review comment:
   So we can surely move them to constructor. The reason i didn't do so is because they were present in interface already. I am not sure why they were kept as different methods in initial design.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570389008
 
 
   How does this relate to #931  ? Different branches? Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362306568
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
+          regionSize);
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+
+  /**
+   * Computes the split plans that should be executed for this table to converge
+   * average region size towards target average or target region count
+   *
+   * @param table
 
 Review comment:
   for each `@param`, we need to write brief description, else QA build checkstyle would fail

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570769901
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   3m 58s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   6m  6s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   2m  8s |  master passed  |
   | +0 :ok: |  refguide  |   7m  2s |  branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m 16s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  master passed  |
   | +0 :ok: |  spotbugs  |   5m 25s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   6m 24s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   5m 42s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 24s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 30s |  hbase-server: The patch generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +0 :ok: |  refguide  |   6m 26s |  patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m 34s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  hadoopcheck  |  18m 11s |  Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   6m 13s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m  7s |  hbase-common in the patch passed.  |
   | -1 :x: |  unit  | 161m 29s |  hbase-server in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  The patch does not generate ASF License warnings.  |
   |  |   | 253m 21s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hbase.master.assignment.TestRegionMoveAndAbandon |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit refguide xml spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux e50455bf97f3 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / ab9766599d |
   | Default Java | 1.8.0_181 |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/artifact/out/branch-site/book.html |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/artifact/out/diff-checkstyle-hbase-server.txt |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/artifact/out/patch-site/book.html |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/artifact/out/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/testReport/ |
   | Max. process+thread count | 4893 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/4/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362306401
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
 
 Review comment:
   `MIN_REGION_COUNT` is used in core code but not `minRegionCount`. We need to use `minRegionCount` instead.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362381889
 
 

 ##########
 File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
 ##########
 @@ -158,6 +158,12 @@
   public static final String HBASE_MASTER_NORMALIZER_CLASS =
     "hbase.master.normalizer.class";
 
+  /** Config for min age of region before being considerded for merge in mormalizer */
+  public static final int DEFAULT_MIN_DAYS_BEFORE_MERGE = 3;
+
+  public static final String HBASE_MASTER_DAYS_BEFORE_MERGE =
+      "hbase.master.normalize.daysBeforeMerge";
+
 
 Review comment:
   Agreed. So maybe i can move it to hbase-defaults.xml and read it from there.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-570245085
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  7s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m 48s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 56s |  master passed  |
   | +0 :ok: |  refguide  |   6m 20s |  branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  3s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  master passed  |
   | +0 :ok: |  spotbugs  |   4m 59s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   5m 45s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   2m 58s |  root in the patch failed.  |
   | +1 :green_heart: |  compile  |   1m 18s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 18s |  the patch passed  |
   | -1 :x: |  checkstyle  |   1m 28s |  hbase-server: The patch generated 5 new + 0 unchanged - 0 fixed = 5 total (was 0)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  The patch has no ill-formed XML file.  |
   | +0 :ok: |  refguide  |   6m 18s |  patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | -1 :x: |  shadedjars  |   3m 58s |  patch has 11 errors when building our shaded downstream artifacts.  |
   | -1 :x: |  hadoopcheck  |   2m 14s |  The patch causes 11 errors with Hadoop v2.8.5.  |
   | -1 :x: |  hadoopcheck  |   4m 32s |  The patch causes 11 errors with Hadoop v2.9.2.  |
   | -1 :x: |  hadoopcheck  |   6m 50s |  The patch causes 11 errors with Hadoop v3.1.2.  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   6m  1s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m  6s |  hbase-common in the patch passed.  |
   | -1 :x: |  unit  |   1m  0s |  hbase-server in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 20s |  The patch does not generate ASF License warnings.  |
   |  |   |  65m 13s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml |
   | uname | Linux 271fdd5013d7 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / 2a4bd0574b |
   | Default Java | 1.8.0_181 |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/branch-site/book.html |
   | mvninstall | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-mvninstall-root.txt |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/diff-checkstyle-hbase-server.txt |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-site/book.html |
   | shadedjars | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-shadedjars.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-javac-2.8.5.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-javac-2.9.2.txt |
   | hadoopcheck | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-javac-3.1.2.txt |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/artifact/out/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/testReport/ |
   | Max. process+thread count | 253 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/3/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-572677839
 
 
   The failure looks unrelated. It is for some other test. Infact in last push i only changed comments.
   @virajjasani - FYI

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362260193
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
+      tableRegions.size());
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      if (candidateIdx == tableRegions.size() - 1) {
+        break;
+      }
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      RegionInfo hri2 = tableRegions.get(candidateIdx + 1);
+      long regionSize2 = getRegionSize(hri2);
+      if (regionSize >= 0 && regionSize2 >= 0 && regionSize + regionSize2 < avgRegionSize) {
+        // atleast one of the two regions should be older than MIN_REGION_DURATION days
+        plans.add(new MergeNormalizationPlan(hri, hri2));
+        candidateIdx++;
+      } else {
+        LOG.debug("Skipping region {} of table {} with size {}", hri.getRegionId(), table,
+          regionSize);
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+
+  /**
+   * Computes the split plans that should be executed for this table to converge
+   * average region size towards target average or target region count
+   *
+   * @param table
+   * @return list of split normalization plans
+   */
+  protected List<NormalizationPlan> getSplitNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+
+    int candidateIdx = 0;
+    while (candidateIdx < tableRegions.size()) {
+      RegionInfo hri = tableRegions.get(candidateIdx);
+      long regionSize = getRegionSize(hri);
+      // if the region is > 2 times larger than average, we split it, split
+      // is more high priority normalization action than merge.
+      if (regionSize > 2 * avgRegionSize) {
+        LOG.info("Table {}, large region {} has size {}, more than twice avg size, splitting",
+          table, hri.getRegionNameAsString(), regionSize);
+        plans.add(new SplitNormalizationPlan(hri, null));
+      }
+      candidateIdx++;
+    }
+    return plans;
+  }
+}
 
 Review comment:
   Could this class be a utility class that implementations make use of rather than an abstract each must implement?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362258732
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
 
 Review comment:
   Please use parameterized logging as in LOG.debug("{} not found in RegionsLoad", hri.getRegionNameAsString()); (See elsewhere in code base for examples).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362259404
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
+    }
+
+    double avgRegionSize;
+    if (targetRegionSize > 0) {
+      avgRegionSize = targetRegionSize;
+    } else if (targetRegionCount > 0) {
+      avgRegionSize = totalSizeMb / (double) targetRegionCount;
+    } else {
+      avgRegionSize = acutalRegionCnt == 0 ? 0 : totalSizeMb / (double) acutalRegionCnt;
+    }
+
+    LOG.debug("Table " + table + ", total aggregated regions size: " + totalSizeMb);
+    LOG.debug("Table " + table + ", average region size: " + avgRegionSize);
+    return avgRegionSize;
+  }
+
+  /**
+   * Computes the merge plans that should be executed for this table to converge
+   * average region towards target average or target region count
+   *
+   * @param table
+   * @return list of merge normalization plans
+   */
+  protected List<NormalizationPlan> getMergeNormalizationPlan(TableName table) {
+    List<NormalizationPlan> plans = new ArrayList<>();
+    List<RegionInfo> tableRegions =
+        masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table);
+    double avgRegionSize = getAvgRegionSize(tableRegions);
+    LOG.debug("Table {}, average region size: {}", table, avgRegionSize);
+    LOG.debug("Computing normalization plan for table: {}, number of regions: {}", table,
 
 Review comment:
   One line instead of two.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362260372
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
+ * </p>
+ */
+
+@InterfaceAudience.Private
+public class MergeNormalizer extends BaseNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(MergeNormalizer.class);
+  private static final int MIN_REGION_COUNT = 3;
+
+  private int minRegionCount;
+  private static long[] skippedCount = new long[NormalizationPlan.PlanType.values().length];
+
+  public MergeNormalizer() {
+    minRegionCount = HBaseConfiguration.create().getInt("hbase.normalizer.min.region.count", 3);
 
 Review comment:
   Ouch on the creation of an HBaseConfiguration just to read one config. Can you not pass in a Configuration instance?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-569881056
 
 
   We can include TestMergeNormalizer

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362305195
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/BaseNormalizer.java
 ##########
 @@ -0,0 +1,214 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class BaseNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(BaseNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
+
+  /**
+   * Set the master service.
+   * @param masterServices inject instance of MasterServices
+   */
+  @Override
+  public void setMasterServices(MasterServices masterServices) {
+    this.masterServices = masterServices;
+  }
+
+  @Override
+  public void setMasterRpcServices(MasterRpcServices masterRpcServices) {
+    this.masterRpcServices = masterRpcServices;
+  }
+
+  protected long getRegionSize(RegionInfo hri) {
+    ServerName sn = masterServices.getAssignmentManager().getRegionStates().
+      getRegionServerOfRegion(hri);
+    RegionMetrics regionLoad = masterServices.getServerManager().getLoad(sn).
+      getRegionMetrics().get(hri.getRegionName());
+    if (regionLoad == null) {
+      LOG.debug(hri.getRegionNameAsString() + " was not found in RegionsLoad");
+      return -1;
+    }
+    return (long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE);
+  }
+
+  protected boolean isMergeEnabled() {
+    boolean mergeEnabled = true;
+    try {
+      mergeEnabled = masterRpcServices
+        .isSplitOrMergeEnabled(null,
+          RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE))
+        .getEnabled();
+    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
+      LOG.warn("Unable to determine whether merge is enabled", e);
+    }
+    return mergeEnabled;
+  }
+
+  protected boolean isSplitEnabled() {
+    boolean splitEnabled = true;
+    try {
+      splitEnabled = masterRpcServices
+          .isSplitOrMergeEnabled(null,
+            RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT))
+          .getEnabled();
+    } catch (ServiceException se) {
+      LOG.warn("Unable to determine whether split is enabled", se);
+    }
+    return splitEnabled;
+  }
+
+  /**
+   *
+   * @param tableRegions
+   * @return average region size depending on
+   * {org.apache.hadoop.hbase.client.TableDescriptor#getNormalizerTargetRegionCount()}
+   *
+   * Also make sure we are sending regions of the same table
+   */
+  protected double getAvgRegionSize(List<RegionInfo> tableRegions) {
+    long totalSizeMb = 0;
+    int acutalRegionCnt = 0;
+    for (RegionInfo hri : tableRegions) {
+      long regionSize = getRegionSize(hri);
+      // don't consider regions that are in bytes for averaging the size.
+      if (regionSize > 0) {
+        acutalRegionCnt++;
+        totalSizeMb += regionSize;
+      }
+    }
+    TableName table = tableRegions.get(0).getTable();
+    int targetRegionCount = -1;
+    long targetRegionSize = -1;
+    try {
+      TableDescriptor tableDescriptor = masterServices.getTableDescriptors().get(table);
+      if(tableDescriptor != null) {
+        targetRegionCount =
+          tableDescriptor.getNormalizerTargetRegionCount();
+        targetRegionSize =
+          tableDescriptor.getNormalizerTargetRegionSize();
+        LOG.debug("Table {}:  target region count is {}, target region size is {}", table,
+          targetRegionCount, targetRegionSize);
+      }
+    } catch (IOException e) {
+      LOG.warn(
+        "cannot get the target number and target size of table {}, they will be default value -1.",
+        table);
 
 Review comment:
   Let's also print stacktrace by putting `e` at last arg. This was missing earlier too

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362259945
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/MergeNormalizer.java
 ##########
 @@ -0,0 +1,131 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.hadoop.hbase.HConstants;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of MergeNormalizer Logic in use:
+ * <ol>
+ * <li>get all regions of a given table
+ * <li>get avg size S of each region (by total size of store files reported in RegionLoad)
+ * <li>two regions R1 and its neighbour R2 are merged, if R1 + R2 &lt; S, and all such regions are
+ * returned to be merged
+ * <li>Otherwise, no action is performed
+ * </ol>
+ * <p>
+ * Region sizes are coarse and approximate on the order of megabytes. Also, empty regions (less than
+ * 1MB) are also merged if the age of region is &gt MIN_DURATION_FOR_MERGE (default 2)
 
 Review comment:
   2 is 2 days? The default should have the unit of duration in it.
   
   This looks good.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363604821
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/AbstractRegionNormalizer.java
 ##########
 @@ -0,0 +1,213 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.master.normalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.hadoop.hbase.RegionMetrics;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.Size;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.MasterSwitchType;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.master.MasterRpcServices;
+import org.apache.hadoop.hbase.master.MasterServices;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hbase.thirdparty.com.google.protobuf.ServiceException;
+
+import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
+
+@InterfaceAudience.Private
+public abstract class AbstractRegionNormalizer implements RegionNormalizer {
+  private static final Logger LOG = LoggerFactory.getLogger(AbstractRegionNormalizer.class);
+  protected MasterServices masterServices;
+  protected MasterRpcServices masterRpcServices;
 
 Review comment:
   Can these be final?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack merged pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack merged pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on issue #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#issuecomment-571745043
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  9s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 1 new or modified test files.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m 52s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 28s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   2m  1s |  master passed  |
   | +0 :ok: |  refguide  |   6m 26s |  branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m  6s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  master passed  |
   | +0 :ok: |  spotbugs  |   5m 39s |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   6m 27s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   5m 43s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   1m 55s |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  The patch has no ill-formed XML file.  |
   | +0 :ok: |  refguide  |   6m 18s |  patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.  |
   | +1 :green_heart: |  shadedjars  |   5m 40s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  hadoopcheck  |  21m 35s |  Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | +1 :green_heart: |  javadoc  |   1m 13s |  the patch passed  |
   | +1 :green_heart: |  findbugs  |   6m 44s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   3m 29s |  hbase-common in the patch passed.  |
   | -1 :x: |  unit  | 169m 22s |  hbase-server in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  The patch does not generate ASF License warnings.  |
   |  |   | 264m  3s |   |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hbase.master.TestMasterShutdown |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.5 Server=19.03.5 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/978 |
   | Optional Tests | dupname asflicense javac javadoc unit refguide xml spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 7e8d6fb1b3d6 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-978/out/precommit/personality/provided.sh |
   | git revision | master / 5a7c9443e9 |
   | Default Java | 1.8.0_181 |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/7/artifact/out/branch-site/book.html |
   | refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/7/artifact/out/patch-site/book.html |
   | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/7/artifact/out/patch-unit-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/7/testReport/ |
   | Max. process+thread count | 5467 (vs. ulimit of 10000) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-978/7/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r363605168
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -18,73 +18,45 @@
  */
 package org.apache.hadoop.hbase.master.normalizer;
 
-import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collections;
 import java.util.Comparator;
 import java.util.List;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HBaseIOException;
-import org.apache.hadoop.hbase.RegionMetrics;
-import org.apache.hadoop.hbase.ServerName;
-import org.apache.hadoop.hbase.Size;
 import org.apache.hadoop.hbase.TableName;
-import org.apache.hadoop.hbase.client.MasterSwitchType;
 import org.apache.hadoop.hbase.client.RegionInfo;
-import org.apache.hadoop.hbase.client.TableDescriptor;
-import org.apache.hadoop.hbase.master.MasterRpcServices;
-import org.apache.hadoop.hbase.master.MasterServices;
 import org.apache.hadoop.hbase.master.normalizer.NormalizationPlan.PlanType;
 import org.apache.yetus.audience.InterfaceAudience;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.hadoop.hbase.shaded.protobuf.RequestConverter;
-
 /**
- * Simple implementation of region normalizer.
- *
- * Logic in use:
- *
- *  <ol>
- *  <li> Get all regions of a given table
- *  <li> Get avg size S of each region (by total size of store files reported in RegionMetrics)
- *  <li> Seek every single region one by one. If a region R0 is bigger than S * 2, it is
- *  kindly requested to split. Thereon evaluate the next region R1
- *  <li> Otherwise, if R0 + R1 is smaller than S, R0 and R1 are kindly requested to merge.
- *  Thereon evaluate the next region R2
- *  <li> Otherwise, R1 is evaluated
+ * Simple implementation of region normalizer. Logic in use:
+ * <ol>
+ * <li>Get all regions of a given table
+ * <li>Get avg size S of each region (by total size of store files reported in RegionMetrics)
+ * <li>Seek every single region one by one. If a region R0 is bigger than S * 2, it is kindly
+ * requested to split. Thereon evaluate the next region R1
+ * <li>Otherwise, if R0 + R1 is smaller than S, R0 and R1 are kindly requested to merge. Thereon
+ * evaluate the next region R2
+ * <li>Otherwise, R1 is evaluated
  * </ol>
  * <p>
- * Region sizes are coarse and approximate on the order of megabytes. Additionally,
- * "empty" regions (less than 1MB, with the previous note) are not merged away. This
- * is by design to prevent normalization from undoing the pre-splitting of a table.
+ * Region sizes are coarse and approximate on the order of megabytes. Additionally, "empty" regions
+ * (less than 1MB, with the previous note) are not merged away. This is by design to prevent
+ * normalization from undoing the pre-splitting of a table.
 
 Review comment:
   Do we say anywhere the difference between this and the new normalizer?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [hbase] mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…

Posted by GitBox <gi...@apache.org>.
mnpoonia commented on a change in pull request #978: HBASE-22285 A normalizer which merges small size regions with adjacen…
URL: https://github.com/apache/hbase/pull/978#discussion_r362380303
 
 

 ##########
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
 ##########
 @@ -132,18 +107,6 @@ public int compare(NormalizationPlan plan1, NormalizationPlan plan2) {
       return null;
     }
     boolean splitEnabled = true, mergeEnabled = true;
-    try {
-      splitEnabled = masterRpcServices.isSplitOrMergeEnabled(null,
-        RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.SPLIT)).getEnabled();
-    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
-      LOG.debug("Unable to determine whether split is enabled", e);
-    }
-    try {
-      mergeEnabled = masterRpcServices.isSplitOrMergeEnabled(null,
-        RequestConverter.buildIsSplitOrMergeEnabledRequest(MasterSwitchType.MERGE)).getEnabled();
-    } catch (org.apache.hbase.thirdparty.com.google.protobuf.ServiceException e) {
-      LOG.debug("Unable to determine whether merge is enabled", e);
-    }
 
 Review comment:
   Yes missed it. Thanks for pointing.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services