Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2021/11/26 10:17:57 UTC

[GitHub] [hbase] virajjasani commented on a change in pull request #3875: HBASE-26459 HMaster should move non-meta region only if meta is ONLINE

virajjasani commented on a change in pull request #3875:
URL: https://github.com/apache/hbase/pull/3875#discussion_r757379561



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1845,12 +1851,22 @@ public void move(final byte[] encodedRegionName,
       // closed
       serverManager.sendRegionWarmup(rp.getDestination(), hri);
 
+      // Here wait until all the meta regions are not in transition.
+      if (!hri.isMetaRegion() && assignmentManager.getRegionStates().isMetaRegionInTransition()) {
+        Thread.sleep(timeoutWaitMetaRegionAssignment);
+        if (assignmentManager.getRegionStates().isMetaRegionInTransition()) {
+          LOG.info("Moving " + rp + " failed. " +
+            "While there is still meta regions in transition after " +
+            timeoutWaitMetaRegionAssignment + "ms waiting.");

Review comment:
       I think an ERROR-level log would be better here. Also, how about this error message?
   
   `LOG.error("This is fail-fast of the region move because hbase:meta region is in transition. Failed region move info: " + rp);`
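
   To make the suggestion concrete, here is a minimal, self-contained sketch of the fail-fast guard with the message logged at error level. It is not the HMaster code: `MetaMoveGuard`, `mayMove`, and the `regionPlan` string are hypothetical names, the `BooleanSupplier` stands in for `assignmentManager.getRegionStates().isMetaRegionInTransition()`, and `java.util.logging` at `SEVERE` stands in for `LOG.error`.

```java
import java.util.function.BooleanSupplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the guard in HMaster.move(); not HBase code. */
final class MetaMoveGuard {
  private static final Logger LOG = Logger.getLogger(MetaMoveGuard.class.getName());

  /**
   * Returns true if the move may proceed, false if it should fail fast.
   * metaInTransition is an assumed stand-in for isMetaRegionInTransition().
   */
  static boolean mayMove(boolean isMetaRegion, BooleanSupplier metaInTransition,
      long waitMs, String regionPlan) {
    // Meta moves are never blocked, and nothing to do if meta is already stable.
    if (isMetaRegion || !metaInTransition.getAsBoolean()) {
      return true;
    }
    try {
      // Give meta one bounded chance to leave the in-transition state.
      Thread.sleep(waitMs);
    } catch (InterruptedException e) {
      // Sketch-level choice: restore the flag and treat an interrupt as a failed move.
      Thread.currentThread().interrupt();
      return false;
    }
    if (metaInTransition.getAsBoolean()) {
      LOG.log(Level.SEVERE, "This is fail-fast of the region move because hbase:meta region is "
        + "in transition. Failed region move info: " + regionPlan);
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    // Meta stays in transition for the whole wait, so the move is rejected.
    System.out.println(mayMove(false, () -> true, 10, "hri=abc, source=rs0, destination=rs1"));
  }
}
```

   The point of the sketch is only that the rejection path is an error, not an informational event, and that the log line carries the full region plan.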

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
##########
@@ -188,5 +192,48 @@ public void testMoveThrowsPleaseHoldException() throws IOException {
       TEST_UTIL.deleteTable(tableName);
     }
   }
+
+  @Test (timeout = 300000)
+  public void testMoveRegionWhenMetaRegionInTransition()
+    throws IOException, InterruptedException, KeeperException {
+    TableName tableName = TableName.valueOf("testMoveRegionWhenMetaRegionInTransition");
+    HMaster master = TEST_UTIL.getMiniHBaseCluster().getMaster();
+    HTableDescriptor htd = new HTableDescriptor(tableName);
+    HColumnDescriptor hcd = new HColumnDescriptor("value");
+    RegionStates regionStates = master.getAssignmentManager().getRegionStates();
+    htd.addFamily(hcd);
+
+    admin.createTable(htd, null);
+    try {
+      HRegionInfo hri = admin.getTableRegions(tableName).get(0);
+
+      HRegionInfo metaRegion = admin.getTableRegions(TableName.META_TABLE_NAME).get(0);
+
+      ServerName rs0 = TEST_UTIL.getHBaseCluster().getRegionServer(0).getServerName();
+      ServerName rs1 = TEST_UTIL.getHBaseCluster().getRegionServer(1).getServerName();
+
+      admin.move(hri.getEncodedNameAsBytes(), rs0.getServerName().getBytes());
+      while (regionStates.isRegionInTransition(hri)) {
+        // Make sure the region is not in transition
+        Thread.sleep(1000);
+      }
+      // Meta region should be in transition
+      master.assignmentManager.unassign(metaRegion);
+      //    master.assignmentManager.regionOffline(metaRegion);
+      // Then move the region to a new region server.
+      admin.move(hri.getEncodedNameAsBytes(), rs1.getServerName().getBytes());
+
+      // The region should be still on rs0.
+      TEST_UTIL.assertRegionOnServer(hri, rs0, 5000);

Review comment:
       Just before this line, add a wait of `HBASE_MASTER_WAITING_META_ASSIGNMENT_TIMEOUT_DEFAULT` ms.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
##########
@@ -188,5 +192,48 @@ public void testMoveThrowsPleaseHoldException() throws IOException {
       TEST_UTIL.deleteTable(tableName);
     }
   }
+
+  @Test (timeout = 300000)
+  public void testMoveRegionWhenMetaRegionInTransition()
+    throws IOException, InterruptedException, KeeperException {
+    TableName tableName = TableName.valueOf("testMoveRegionWhenMetaRegionInTransition");
+    HMaster master = TEST_UTIL.getMiniHBaseCluster().getMaster();
+    HTableDescriptor htd = new HTableDescriptor(tableName);
+    HColumnDescriptor hcd = new HColumnDescriptor("value");
+    RegionStates regionStates = master.getAssignmentManager().getRegionStates();
+    htd.addFamily(hcd);
+
+    admin.createTable(htd, null);
+    try {
+      HRegionInfo hri = admin.getTableRegions(tableName).get(0);
+
+      HRegionInfo metaRegion = admin.getTableRegions(TableName.META_TABLE_NAME).get(0);
+
+      ServerName rs0 = TEST_UTIL.getHBaseCluster().getRegionServer(0).getServerName();
+      ServerName rs1 = TEST_UTIL.getHBaseCluster().getRegionServer(1).getServerName();
+
+      admin.move(hri.getEncodedNameAsBytes(), rs0.getServerName().getBytes());
+      while (regionStates.isRegionInTransition(hri)) {
+        // Make sure the region is not in transition
+        Thread.sleep(1000);
+      }
+      // Meta region should be in transition
+      master.assignmentManager.unassign(metaRegion);
+      //    master.assignmentManager.regionOffline(metaRegion);
+      // Then move the region to a new region server.
+      admin.move(hri.getEncodedNameAsBytes(), rs1.getServerName().getBytes());
+
+      // The region should be still on rs0.
+      TEST_UTIL.assertRegionOnServer(hri, rs0, 5000);
+
+      // Wait until the meta region is reassigned.
+      admin.assign(metaRegion.getEncodedNameAsBytes());
+      while (regionStates.isMetaRegionInTransition()) {
+        Thread.sleep(1000);
+      }

Review comment:
       After this, we can repeat the move: try to move the `hri` region again and assert that it is moved successfully.
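
   As a rough illustration of the sequence I have in mind, here is a toy in-memory model, not the HBase API. `ToyMaster`, `setMetaInTransition`, `move`, and `serverOf` are hypothetical names, and the model leaves out the timeout wait entirely. It only shows the order of assertions: the move is rejected while meta is in transition, and the same move succeeds once meta is back.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory model of the master; hypothetical, not the HBase API. */
final class ToyMaster {
  private final Map<String, String> regionToServer = new HashMap<>();
  private boolean metaInTransition;

  void setMetaInTransition(boolean value) {
    metaInTransition = value;
  }

  /** Moves a region unless meta is in transition; returns whether the move happened. */
  boolean move(String region, String destServer) {
    if (metaInTransition) {
      return false; // fail fast: the region stays where it is
    }
    regionToServer.put(region, destServer);
    return true;
  }

  String serverOf(String region) {
    return regionToServer.get(region);
  }

  private static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    ToyMaster master = new ToyMaster();
    master.move("hri", "rs0");

    // Meta goes into transition: the move to rs1 must be rejected.
    master.setMetaInTransition(true);
    check(!master.move("hri", "rs1"), "move should be rejected");
    check("rs0".equals(master.serverOf("hri")), "region should still be on rs0");

    // Meta is reassigned: repeating the same move must now succeed.
    master.setMetaInTransition(false);
    check(master.move("hri", "rs1"), "move should succeed");
    check("rs1".equals(master.serverOf("hri")), "region should now be on rs1");
  }
}
```

   In the real test the second half would be another `admin.move(...)` to `rs1` followed by an assertion that the region is on `rs1`.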

##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
##########
@@ -136,6 +136,12 @@
   /** Default value for the max percent of regions in transition */
   public static final double DEFAULT_HBASE_MASTER_BALANCER_MAX_RIT_PERCENT = 1.0;
 
+  /** Time in milliseconds to wait meta region assignment, when moving non-meta regions. */
+  public static final String HBASE_MASTER_WAITING_META_ASSIGNMENT_TIMEOUT =
+    "hbase.master.waiting.meta.assignment.timeout";
+
+  public static final long HBASE_MASTER_WAITING_META_ASSIGNMENT_TIMEOUT_DEFAULT = 5000;

Review comment:
       How about setting this to 10000 (10 s)?



