You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/07/30 07:27:18 UTC

[GitHub] [hbase] virajjasani opened a new pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

virajjasani opened a new pull request #2172:
URL: https://github.com/apache/hbase/pull/2172


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667498969


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 57s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 59s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 16s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  master passed  |
   | -0 :warning: |  patch  |   7m  7s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 35s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 54s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 54s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 24s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 150m 19s |  hbase-server in the patch passed.  |
   |  |   | 175m  7s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux b4daf8bcc314 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7c4d66a277 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/testReport/ |
   | Max. process+thread count | 4063 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463661594



##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover1.java
##########
@@ -80,23 +83,18 @@ public static void tearDownAfterClass() throws Exception {
     TEST_UTIL.shutdownMiniCluster();
   }
 
-  @Before
-  public void setUp() throws Exception {
-    // Create a pre-split table just to populate some regions
-    TableName tableName = TableName.valueOf("testRegionMover");
-    Admin admin = TEST_UTIL.getAdmin();
-    if (admin.tableExists(tableName)) {
-      TEST_UTIL.deleteTable(tableName);
-    }
+  private void initTableRegions(TableName tableName) throws IOException {

Review comment:
       the `name.getMethodName()` call is available in `Before` annotated methods, so you should still do this in a shared setup method instead of call this in each test.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  @Rule
+  public TestName name = new TestName();
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  private void initTableRegions(TableName tableName) throws IOException {
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(tableName)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    TEST_UTIL.getAdmin().createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    final TableName tableName = TableName.valueOf(name.getMethodName());
+    initTableRegions(tableName);
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(tableName);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(tableName);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(tableName))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading {}", regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded, now Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+    TEST_UTIL.getAdmin().disableTable(tableName);

Review comment:
       should be in an `After` annotated method.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,211 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.TestName;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  @Rule
+  public TestName name = new TestName();
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  private void initTableRegions(TableName tableName) throws IOException {

Review comment:
       the `name.getMethodName()` call is available in `Before` annotated methods, so you should still do this in a shared setup method instead of call this in each test.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover1.java
##########
@@ -274,13 +292,17 @@ public void testRegionServerPort() {
     if (originalPort != null) {
       conf.set(HConstants.REGIONSERVER_PORT, originalPort);
     }
+    TEST_UTIL.getAdmin().disableTable(tableName);

Review comment:
       these should be in an `After` annotated method




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666399023


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 26s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  4s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 25s |  master passed  |
   | +1 :green_heart: |  compile  |   1m  7s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 14s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 44s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 21s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 11s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 17s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 131m 35s |  hbase-server in the patch passed.  |
   |  |   | 159m 15s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux ba7c9c468890 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 1b9269de4d |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/testReport/ |
   | Max. process+thread count | 4218 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667445602


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 25s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   1m 57s |  master passed  |
   | -0 :warning: |  patch  |   2m  6s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 23s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  3s |  hbase-server: The patch generated 5 new + 12 unchanged - 2 fixed = 17 total (was 14)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m  8s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m  7s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 14s |  The patch does not generate ASF License warnings.  |
   |  |   |  32m  0s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 2d279defa66d 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7c4d66a277 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667169370


   > Precommit run is not getting scheduled, waiting logs are present for long time
   
   it's backlog waiting for executors to open up on the new ci host:
   
   https://ci-hadoop.apache.org/label/Hadoop/load-statistics
   
   looks like the queue size started to climb yesterday around midnight. peaked about 5 hours ago at ~300. is now sitting around 280.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463455807



##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);

Review comment:
       okay ignore this now that fixing it is tracked in #2180




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666295141


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |  39m 18s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 18s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   7m 50s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 56s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m  2s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 17s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 30s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 50s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 206m 45s |  hbase-server in the patch passed.  |
   |  |   | 279m 31s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 371415cc9ffa 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f07f30ae24 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/testReport/ |
   | Max. process+thread count | 3552 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] tedyu commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
tedyu commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r462555521



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(100);

Review comment:
       Maybe the interval can be a bit longer since maxWait is in seconds.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(100);
+      }
+    }
+    if (sameServer) {
+      LOG.error("Region: {} stuck on {} ,newServer={}", region.getRegionNameAsString(),
+        this.sourceServer, this.targetServer);
+    } else {
+      isSuccessfulScan(region);
+      LOG.info("Moved Region " + region.getRegionNameAsString() + " cost:" + String
+        .format("%.3f", (float) (EnvironmentEdgeManager.currentTime() - startTime) / 1000));
+      moved = true;
+      movedRegions.add(region);
+    }
+    return moved;
+  }
+
+  /**
+   * Tries to scan a row from passed region
+   */
+  private void isSuccessfulScan(RegionInfo region) throws IOException {
+    Scan scan = new Scan().withStartRow(region.getStartKey()).setRaw(true).setOneRowLimit()
+      .setMaxResultSize(1L).setCaching(1).setFilter(new FirstKeyOnlyFilter())
+      .setCacheBlocks(false);
+    try (Table table = conn.getTable(region.getTable());
+      ResultScanner scanner = table.getScanner(scan)) {
+      scanner.next();

Review comment:
       Should the scanner be closed after the call ?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -587,8 +483,12 @@ private void waitMoveTasksToFinish(ExecutorService moveRegionsPool,
         LOG.error("Interrupted while waiting for Thread to Complete " + e.getMessage(), e);
         throw e;
       } catch (ExecutionException e) {
-        LOG.error("Got Exception From Thread While moving region " + e.getMessage(), e);
-        throw e;
+        if (e.getCause() instanceof UnknownRegionException) {
+          LOG.debug("Ignore unknown region, it might have been split/merged.");

Review comment:
       Should this be logged at more prominent level ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-665858828






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey edited a comment on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey edited a comment on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667169370


   > Precommit run is not getting scheduled, waiting logs are present for long time
   
   it's backlog waiting for executors to open up on the new ci host:
   
   https://ci-hadoop.apache.org/label/Hadoop/load-statistics
   
   looks like the queue size started to climb yesterday around midnight CDT. peaked about 5 hours ago at ~300. is now sitting around 280.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463138413



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithoutAck.java
##########
@@ -0,0 +1,70 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions without Acknowledging.Usefule in case of RS shutdown as we might want to shut the
+ * RS down anyways and not abort on a stuck region. Improves movement performance
+ */
+@InterfaceAudience.Private
+class MoveWithoutAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithoutAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Admin admin;
+
+  MoveWithoutAck(Admin admin, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) {
+    this.admin = admin;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+  }
+
+  @Override
+  public Boolean call() {
+    try {
+      LOG.info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer,
+        targetServer);
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      LOG.info("Moved {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);

Review comment:
       s/Moved/Requested Move
   
   since we are expressly not checking if it worked.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -587,8 +483,12 @@ private void waitMoveTasksToFinish(ExecutorService moveRegionsPool,
         LOG.error("Interrupted while waiting for Thread to Complete " + e.getMessage(), e);
         throw e;
       } catch (ExecutionException e) {
-        LOG.error("Got Exception From Thread While moving region " + e.getMessage(), e);
-        throw e;
+        if (e.getCause() instanceof UnknownRegionException) {
+          LOG.info("Ignore unknown region, it might have been split/merged.");

Review comment:
       Include the name of the region in the log message.
   
   Also include the exception details at debug. otherwise if we get URE for some reason other than what we're expecting here it will be really hard for someone to figure out what is going on.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover.java
##########
@@ -457,4 +463,37 @@ public void testExcludeAndDecomServers() throws Exception {
       Collections.emptyList());
   }
 
+  @Test
+  public void testWithMergedRegions() throws Exception {

Review comment:
       test with split regions as well?
   
   test that we properly fail in ack mode if the problem is something other than URE?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463167959



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -587,8 +483,12 @@ private void waitMoveTasksToFinish(ExecutorService moveRegionsPool,
         LOG.error("Interrupted while waiting for Thread to Complete " + e.getMessage(), e);
         throw e;
       } catch (ExecutionException e) {
-        LOG.error("Got Exception From Thread While moving region " + e.getMessage(), e);
-        throw e;
+        if (e.getCause() instanceof UnknownRegionException) {
+          LOG.info("Ignore unknown region, it might have been split/merged.");

Review comment:
       Sure, let me revert it back to debug and Exception message is the only way to actually print region id here. Future contains only `Boolean` value otherwise.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666528112


   > is there a particular reason we're conflating formatting changes here?
   > 
   > are the qa failures related?
   
   QA failures are not relevant, tried locally and both builds have one common test failure: `TestReplicationEndpointWithMultipleAsyncWAL`, works quite well locally with the patch.
   
   I just thought of refactoring `MoveWithAck` and `MoveWithoutAck` as high level classes since RegionMover is quite huge.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani closed pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
virajjasani closed pull request #2172:
URL: https://github.com/apache/hbase/pull/2172


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463455659



##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      HRegion hRegion = hRegions.get(1);
+      int startKey = 0;
+      int endKey = Integer.MAX_VALUE;
+      if (hRegion.getRegionInfo().getStartKey().length > 0) {
+        startKey = Bytes.toInt(hRegion.getRegionInfo().getStartKey());
+      }
+      if (hRegion.getRegionInfo().getEndKey().length > 0) {
+        endKey = Bytes.toInt(hRegion.getRegionInfo().getEndKey());
+      }
+      int midKey = startKey + (endKey - startKey) / 2;
+      admin.splitRegionAsync(hRegion.getRegionInfo().getRegionName(), Bytes.toBytes(midKey))
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 1, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testFailedRegionMove() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);

Review comment:
       okay ignore this now that fixing it is tracked in #2180 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463338779



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);

Review comment:
       nit: should be at debug and include the name of the region.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(1000);
+      }
+    }
+    if (sameServer) {
+      LOG.error("Region: {} stuck on {} ,newServer={}", region.getRegionNameAsString(),
+        this.sourceServer, this.targetServer);
+    } else {
+      isSuccessfulScan(region);
+      LOG.info("Moved Region " + region.getRegionNameAsString() + " cost:" + String
+        .format("%.3f", (float) (EnvironmentEdgeManager.currentTime() - startTime) / 1000));

Review comment:
       nit: specify the cost is in seconds

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);

Review comment:
       nit: should be param substitution

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -587,8 +485,14 @@ private void waitMoveTasksToFinish(ExecutorService moveRegionsPool,
         LOG.error("Interrupted while waiting for Thread to Complete " + e.getMessage(), e);
         throw e;
       } catch (ExecutionException e) {
-        LOG.error("Got Exception From Thread While moving region " + e.getMessage(), e);
-        throw e;
+        boolean ignoreFailure = ignoreRegionMoveFailure(e);
+        if (ignoreFailure) {
+          LOG.debug("Ignore region move failure, it might have been split/merged, "

Review comment:
       please just pass the exception to the logging system so it will include a stack trace rather than try to build the message into our log.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover.java
##########
@@ -83,12 +85,11 @@ public static void tearDownAfterClass() throws Exception {
   @Before
   public void setUp() throws Exception {
     // Create a pre-split table just to populate some regions
-    TableName tableName = TableName.valueOf("testRegionMover");
     Admin admin = TEST_UTIL.getAdmin();
-    if (admin.tableExists(tableName)) {
-      TEST_UTIL.deleteTable(tableName);
+    if (admin.tableExists(TABLE_NAME)) {

Review comment:
       this is race-y and won't work properly if tests are run in parallel. we should use a junit rule to get the name of the test method and use that for the table name.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      HRegion hRegion = hRegions.get(1);
+      int startKey = 0;
+      int endKey = Integer.MAX_VALUE;
+      if (hRegion.getRegionInfo().getStartKey().length > 0) {
+        startKey = Bytes.toInt(hRegion.getRegionInfo().getStartKey());
+      }
+      if (hRegion.getRegionInfo().getEndKey().length > 0) {
+        endKey = Bytes.toInt(hRegion.getRegionInfo().getEndKey());
+      }
+      int midKey = startKey + (endKey - startKey) / 2;
+      admin.splitRegionAsync(hRegion.getRegionInfo().getRegionName(), Bytes.toBytes(midKey))
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 1, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testFailedRegionMove() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 1000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");

Review comment:
       no newlines in log messages please

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);

Review comment:
       the javadoc for HBaseTestingUtility claims that getConnection isn't threadsafe. if that is true then we need to do a bunch of work to safely do this line.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());

Review comment:
       nit: should use parameter substitution

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(1000);
+      }
+    }
+    if (sameServer) {
+      LOG.error("Region: {} stuck on {} ,newServer={}", region.getRegionNameAsString(),

Review comment:
       nit: how long did this take?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -522,13 +420,13 @@ private void unloadRegions(ServerName server, List<ServerName> regionServers,
       while (counter < regionsToMove.size()) {

Review comment:
       nit: this should be a for each loop on the `regionsToMove` list

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(1000);
+      }
+    }
+    if (sameServer) {
+      LOG.error("Region: {} stuck on {} ,newServer={}", region.getRegionNameAsString(),
+        this.sourceServer, this.targetServer);
+    } else {
+      isSuccessfulScan(region);
+      LOG.info("Moved Region " + region.getRegionNameAsString() + " cost:" + String
+        .format("%.3f", (float) (EnvironmentEdgeManager.currentTime() - startTime) / 1000));
+      moved = true;
+      movedRegions.add(region);
+    }
+    return moved;
+  }
+
+  /**
+   * Tries to scan a row from passed region
+   */
+  private void isSuccessfulScan(RegionInfo region) throws IOException {
+    Scan scan = new Scan().withStartRow(region.getStartKey()).setRaw(true).setOneRowLimit()
+      .setMaxResultSize(1L).setCaching(1).setFilter(new FirstKeyOnlyFilter())
+      .setCacheBlocks(false);
+    try (Table table = conn.getTable(region.getTable());
+      ResultScanner scanner = table.getScanner(scan)) {
+      scanner.next();
+    } catch (IOException e) {
+      LOG.error("Could not scan region:" + region.getEncodedName(), e);

Review comment:
       nit: should use parameter substitution

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {

Review comment:
       this is race-y and won't work properly if tests are run in parallel. we should use a junit rule to get the name of the test method and use that for the table name.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);

Review comment:
       the javadoc for `HBaseTestingUtility` claims that `getConnection` isn't threadsafe. if that is true then we need to do a bunch of work to safely do this line.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      HRegion hRegion = hRegions.get(1);
+      int startKey = 0;
+      int endKey = Integer.MAX_VALUE;
+      if (hRegion.getRegionInfo().getStartKey().length > 0) {
+        startKey = Bytes.toInt(hRegion.getRegionInfo().getStartKey());
+      }
+      if (hRegion.getRegionInfo().getEndKey().length > 0) {
+        endKey = Bytes.toInt(hRegion.getRegionInfo().getEndKey());
+      }
+      int midKey = startKey + (endKey - startKey) / 2;
+      admin.splitRegionAsync(hRegion.getRegionInfo().getRegionName(), Bytes.toBytes(midKey))
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 1, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testFailedRegionMove() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);

Review comment:
       the javadoc for HBaseTestingUtility claims that getConnection isn't threadsafe. if that is true then we need to do a bunch of work to safely do this line.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      HRegion hRegion = hRegions.get(1);
+      int startKey = 0;
+      int endKey = Integer.MAX_VALUE;
+      if (hRegion.getRegionInfo().getStartKey().length > 0) {
+        startKey = Bytes.toInt(hRegion.getRegionInfo().getStartKey());
+      }
+      if (hRegion.getRegionInfo().getEndKey().length > 0) {
+        endKey = Bytes.toInt(hRegion.getRegionInfo().getEndKey());
+      }
+      int midKey = startKey + (endKey - startKey) / 2;
+      admin.splitRegionAsync(hRegion.getRegionInfo().getRegionName(), Bytes.toBytes(midKey))
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 1, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testFailedRegionMove() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 1000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());

Review comment:
       nit: should use parameter substitution

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");

Review comment:
       no newlines in log messages please

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());

Review comment:
       nit: should use parameter substitution

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");

Review comment:
       no newlines in log messages please.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -597,6 +501,22 @@ private void waitMoveTasksToFinish(ExecutorService moveRegionsPool,
     }
   }
 
+  private boolean ignoreRegionMoveFailure(ExecutionException e) {
+    boolean ignoreFailure;

Review comment:
       nit: initialize to false and drop the final else clause.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667462543


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 11s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 42s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 55s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 30s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  master passed  |
   | -0 :warning: |  patch  |   6m 20s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 29s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 54s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 54s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 38s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 147m 13s |  hbase-server in the patch passed.  |
   |  |   | 172m  3s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux b062846db5cc 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7c4d66a277 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/testReport/ |
   | Max. process+thread count | 4576 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667483367


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 41s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  1s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 22s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 19s |  master passed  |
   | -0 :warning: |  patch  |   2m 27s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 54s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m 19s |  hbase-server: The patch generated 3 new + 14 unchanged - 0 fixed = 17 total (was 14)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  13m 17s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 16s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 14s |  The patch does not generate ASF License warnings.  |
   |  |   |  37m 44s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux daff207dec36 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7c4d66a277 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666409082


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   4m  5s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 17s |  master passed  |
   | +1 :green_heart: |  compile  |   1m  1s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 21s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 14s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 27s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 154m 28s |  hbase-server in the patch failed.  |
   |  |   | 185m 40s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux e3a647660059 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 1b9269de4d |
   | Default Java | 1.8.0_232 |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/testReport/ |
   | Max. process+thread count | 4182 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666329801


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   4m  8s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 11s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 24s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 55s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m 16s |  hbase-server: The patch generated 2 new + 3 unchanged - 0 fixed = 5 total (was 3)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  13m 39s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 56s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 15s |  The patch does not generate ASF License warnings.  |
   |  |   |  43m 57s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux b86b983722f9 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 1b9269de4d |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463671614



##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover1.java
##########
@@ -80,23 +83,18 @@ public static void tearDownAfterClass() throws Exception {
     TEST_UTIL.shutdownMiniCluster();
   }
 
-  @Before
-  public void setUp() throws Exception {
-    // Create a pre-split table just to populate some regions
-    TableName tableName = TableName.valueOf("testRegionMover");
-    Admin admin = TEST_UTIL.getAdmin();
-    if (admin.tableExists(tableName)) {
-      TEST_UTIL.deleteTable(tableName);
-    }
+  private void initTableRegions(TableName tableName) throws IOException {

Review comment:
       Oh, I wasn't aware of this. Sounds really great. I was also not happy to keep repeating the same code for each method.
   Thanks, will update the PR shortly.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666521173


   is there a particular reason we're conflating formatting changes here?
   
   are the qa failures related?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666528956


   > I just thought of refactoring MoveWithAck and MoveWithoutAck as high level classes since RegionMover is quite huge.
   
   that's a good idea; please make sure there is a note about that in the commit message body.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r463400925



##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestRegionMover2.java
##########
@@ -0,0 +1,202 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtility;
+import org.apache.hadoop.hbase.MiniHBaseCluster;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.client.TableDescriptor;
+import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
+import org.apache.hadoop.hbase.regionserver.HRegion;
+import org.apache.hadoop.hbase.regionserver.HRegionServer;
+import org.apache.hadoop.hbase.testclassification.LargeTests;
+import org.apache.hadoop.hbase.testclassification.MiscTests;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+/**
+ * Tests for Region Mover Load/Unload functionality with and without ack mode and also to test
+ * exclude functionality useful for rack decommissioning
+ */
+@Category({ MiscTests.class, LargeTests.class})
+public class TestRegionMover2 {
+
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestRegionMover2.class);
+
+  private static final Logger LOG = LoggerFactory.getLogger(TestRegionMover2.class);
+
+  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();
+
+  private static final TableName TABLE_NAME = TableName.valueOf("testRegionMover2");
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    TEST_UTIL.startMiniCluster(3);
+    TEST_UTIL.getAdmin().balancerSwitch(false, true);
+  }
+
+  @AfterClass
+  public static void tearDownAfterClass() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  @Before
+  public void setUp() throws Exception {
+    // Create a pre-split table just to populate some regions
+    Admin admin = TEST_UTIL.getAdmin();
+    if (admin.tableExists(TABLE_NAME)) {
+      TEST_UTIL.deleteTable(TABLE_NAME);
+    }
+    TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(TABLE_NAME)
+      .setColumnFamily(ColumnFamilyDescriptorBuilder.of("fam1")).build();
+    int startKey = 0;
+    int endKey = 80000;
+    admin.createTable(tableDesc, Bytes.toBytes(startKey), Bytes.toBytes(endKey), 9);
+  }
+
+  @Test
+  public void testWithMergedRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 0; i < 10000; i++) {
+      puts.add(new Put(Bytes.toBytes("rowkey_" + i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      admin.mergeRegionsAsync(new byte[][] { hRegions.get(0).getRegionInfo().getRegionName(),
+        hRegions.get(1).getRegionInfo().getRegionName() }, true)
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 2, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testWithSplitRegions() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);
+    List<Put> puts = new ArrayList<>();
+    for (int i = 10; i < 50000; i++) {
+      puts.add(new Put(Bytes.toBytes(i))
+        .addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("q1"), Bytes.toBytes("val_" + i)));
+    }
+    table.put(puts);
+    admin.flush(TABLE_NAME);
+    admin.compact(TABLE_NAME);
+    HRegionServer regionServer = cluster.getRegionServer(0);
+    String rsName = regionServer.getServerName().getAddress().toString();
+    int numRegions = regionServer.getNumberOfOnlineRegions();
+    List<HRegion> hRegions = regionServer.getRegions().stream()
+      .filter(hRegion -> hRegion.getRegionInfo().getTable().equals(TABLE_NAME))
+      .collect(Collectors.toList());
+
+    RegionMover.RegionMoverBuilder rmBuilder =
+      new RegionMover.RegionMoverBuilder(rsName, TEST_UTIL.getConfiguration()).ack(true)
+        .maxthreads(8);
+    try (RegionMover rm = rmBuilder.build()) {
+      LOG.debug("Unloading " + regionServer.getServerName());
+      rm.unload();
+      Assert.assertEquals(0, regionServer.getNumberOfOnlineRegions());
+      LOG.debug("Successfully Unloaded\nNow Loading");
+      HRegion hRegion = hRegions.get(1);
+      int startKey = 0;
+      int endKey = Integer.MAX_VALUE;
+      if (hRegion.getRegionInfo().getStartKey().length > 0) {
+        startKey = Bytes.toInt(hRegion.getRegionInfo().getStartKey());
+      }
+      if (hRegion.getRegionInfo().getEndKey().length > 0) {
+        endKey = Bytes.toInt(hRegion.getRegionInfo().getEndKey());
+      }
+      int midKey = startKey + (endKey - startKey) / 2;
+      admin.splitRegionAsync(hRegion.getRegionInfo().getRegionName(), Bytes.toBytes(midKey))
+        .get(5, TimeUnit.SECONDS);
+      Assert.assertTrue(rm.load());
+      Assert.assertEquals(numRegions - 1, regionServer.getNumberOfOnlineRegions());
+    }
+  }
+
+  @Test
+  public void testFailedRegionMove() throws Exception {
+    MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
+    Admin admin = TEST_UTIL.getAdmin();
+    Table table = TEST_UTIL.getConnection().getTable(TABLE_NAME);

Review comment:
       okay this bothers me. hang on I think I have a patch to fix it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666771437


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m  1s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  2s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m  7s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 43s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m 10s |  hbase-server: The patch generated 5 new + 12 unchanged - 2 fixed = 17 total (was 14)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  12m 14s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 16s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 13s |  The patch does not generate ASF License warnings.  |
   |  |   |  35m 39s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux ac8a1261968b 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 840a55761b |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 84 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666862335


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 53s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 22s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 55s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 30s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 22s |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 55s |  the patch passed  |
   | +1 :green_heart: |  javac  |   0m 55s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 29s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 141m  6s |  hbase-server in the patch failed.  |
   |  |   | 164m 53s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux afea1db963d9 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 840a55761b |
   | Default Java | 1.8.0_232 |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/testReport/ |
   | Max. process+thread count | 5021 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666860632


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 35s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 19s |  master passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 43s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 43s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  2s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  2s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 46s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 132m 55s |  hbase-server in the patch failed.  |
   |  |   | 158m 54s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 2d7505d4ea50 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 840a55761b |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/testReport/ |
   | Max. process+thread count | 4449 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#discussion_r462758840



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionMover.java
##########
@@ -587,8 +483,12 @@ private void waitMoveTasksToFinish(ExecutorService moveRegionsPool,
         LOG.error("Interrupted while waiting for Thread to Complete " + e.getMessage(), e);
         throw e;
       } catch (ExecutionException e) {
-        LOG.error("Got Exception From Thread While moving region " + e.getMessage(), e);
-        throw e;
+        if (e.getCause() instanceof UnknownRegionException) {
+          LOG.debug("Ignore unknown region, it might have been split/merged.");

Review comment:
       sure, let me use `info`

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(100);

Review comment:
       Agree, this was old code anyways, we can update this to 1 sec at least.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/util/MoveWithAck.java
##########
@@ -0,0 +1,153 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.util;
+
+import org.apache.hadoop.hbase.HRegionLocation;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.client.Admin;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.RegionInfo;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.yetus.audience.InterfaceAudience;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+
+/**
+ * Move Regions and make sure that they are up on the target server.If a region movement fails we
+ * exit as failure
+ */
+@InterfaceAudience.Private
+class MoveWithAck implements Callable<Boolean> {
+
+  private static final Logger LOG = LoggerFactory.getLogger(MoveWithAck.class);
+
+  private final RegionInfo region;
+  private final ServerName targetServer;
+  private final List<RegionInfo> movedRegions;
+  private final ServerName sourceServer;
+  private final Connection conn;
+  private final Admin admin;
+
+  MoveWithAck(Connection conn, RegionInfo regionInfo, ServerName sourceServer,
+    ServerName targetServer, List<RegionInfo> movedRegions) throws IOException {
+    this.conn = conn;
+    this.region = regionInfo;
+    this.targetServer = targetServer;
+    this.movedRegions = movedRegions;
+    this.sourceServer = sourceServer;
+    this.admin = conn.getAdmin();
+  }
+
+  @Override
+  public Boolean call() throws IOException, InterruptedException {
+    boolean moved = false;
+    int count = 0;
+    int retries = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_RETRIES_MAX_KEY, RegionMover.DEFAULT_MOVE_RETRIES_MAX);
+    int maxWaitInSeconds = admin.getConfiguration()
+      .getInt(RegionMover.MOVE_WAIT_MAX_KEY, RegionMover.DEFAULT_MOVE_WAIT_MAX);
+    long startTime = EnvironmentEdgeManager.currentTime();
+    boolean sameServer = true;
+    // Assert we can scan the region in its current location
+    isSuccessfulScan(region);
+    LOG
+      .info("Moving region: {} from {} to {}", region.getEncodedName(), sourceServer, targetServer);
+    while (count < retries && sameServer) {
+      if (count > 0) {
+        LOG.info("Retry " + count + " of maximum " + retries);
+      }
+      count = count + 1;
+      admin.move(region.getEncodedNameAsBytes(), targetServer);
+      long maxWait = startTime + (maxWaitInSeconds * 1000);
+      while (EnvironmentEdgeManager.currentTime() < maxWait) {
+        sameServer = isSameServer(region, sourceServer);
+        if (!sameServer) {
+          break;
+        }
+        Thread.sleep(100);
+      }
+    }
+    if (sameServer) {
+      LOG.error("Region: {} stuck on {} ,newServer={}", region.getRegionNameAsString(),
+        this.sourceServer, this.targetServer);
+    } else {
+      isSuccessfulScan(region);
+      LOG.info("Moved Region " + region.getRegionNameAsString() + " cost:" + String
+        .format("%.3f", (float) (EnvironmentEdgeManager.currentTime() - startTime) / 1000));
+      moved = true;
+      movedRegions.add(region);
+    }
+    return moved;
+  }
+
+  /**
+   * Tries to scan a row from passed region
+   */
+  private void isSuccessfulScan(RegionInfo region) throws IOException {
+    Scan scan = new Scan().withStartRow(region.getStartKey()).setRaw(true).setOneRowLimit()
+      .setMaxResultSize(1L).setCaching(1).setFilter(new FirstKeyOnlyFilter())
+      .setCacheBlocks(false);
+    try (Table table = conn.getTable(region.getTable());
+      ResultScanner scanner = table.getScanner(scan)) {
+      scanner.next();

Review comment:
       Yes, it is getting closed because we have wrapped it within try-with-resources above:
   ```
       try (Table table = conn.getTable(region.getTable());
         ResultScanner scanner = table.getScanner(scan)) {
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
virajjasani commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667079180


   Precommit run is not getting scheduled, waiting logs are present for long time:
   ```
   Still waiting to schedule task
   Waiting for next available executor on ‘Hadoop’
   ```
   
   I tried manual trigger but the run is still not scheduled.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-666310797


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  | 132m 42s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m  7s |  master passed  |
   | +1 :green_heart: |  compile  |   0m 57s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 47s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 13s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  3s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 45s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 41s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  | 157m 31s |  hbase-server in the patch failed.  |
   |  |   | 317m 37s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 048288861e61 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f07f30ae24 |
   | Default Java | 1.8.0_232 |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/testReport/ |
   | Max. process+thread count | 4290 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667463449


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 38s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m  5s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   8m  0s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 52s |  hbase-server in master failed.  |
   | -0 :warning: |  patch  |   9m  2s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m  2s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   8m  2s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 47s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 147m 33s |  hbase-server in the patch passed.  |
   |  |   | 182m  1s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux d89b95a5e414 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7c4d66a277 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/testReport/ |
   | Max. process+thread count | 4190 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/5/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2172: HBASE-24795 : RegionMover to deal with unknown region while (un)loading

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2172:
URL: https://github.com/apache/hbase/pull/2172#issuecomment-667499879


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 42s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  4s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m  0s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   7m  0s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 47s |  hbase-server in master failed.  |
   | -0 :warning: |  patch  |   7m 58s |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   ||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   4m 50s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 21s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 23s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 46s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  | 148m  7s |  hbase-server in the patch passed.  |
   |  |   | 180m 59s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2172 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux b054d8c5bb71 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 7c4d66a277 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/testReport/ |
   | Max. process+thread count | 4485 (vs. ulimit of 12500) |
   | modules | C: hbase-server U: hbase-server |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2172/6/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org