Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/03/27 19:25:13 UTC
[GitHub] [hbase] saintstack commented on a change in pull request #1373: HBASE-24052 Add debug+fix to TestMasterShutdown

URL: https://github.com/apache/hbase/pull/1373#discussion_r399491856
 
 

 ##########
 File path: hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
 ##########
 @@ -156,48 +151,19 @@ public void testMasterShutdownBeforeStartingAnyRegionServer() throws Exception {
       hbaseCluster = new LocalHBaseCluster(htu.getConfiguration(), options.getNumMasters(),
         options.getNumRegionServers(), options.getMasterClass(), options.getRsClass());
       final MasterThread masterThread = hbaseCluster.getMasters().get(0);
-
-      final CompletableFuture<Void> shutdownFuture = CompletableFuture.runAsync(() -> {
-        // Switching to master registry exacerbated a race in the master bootstrap that can result
-        // in a lost shutdown command (HBASE-8422, HBASE-23836). The race is essentially because
-        // the server manager in HMaster is not initialized by the time shutdown() RPC (below) is
-        // made to the master. The suspected reason as to why it was uncommon before HBASE-18095
-        // is because the connection creation with ZK registry is so slow that by then the server
-        // manager is usually init'ed in time for the RPC to be made. For now, adding an explicit
-        // wait() in the test, waiting for the server manager to become available.
-        final long timeout = TimeUnit.MINUTES.toMillis(10);
-        assertNotEquals("timeout waiting for server manager to become available.",
-          -1, Waiter.waitFor(htu.getConfiguration(), timeout,
-            () -> masterThread.getMaster().getServerManager() != null));
-
-        // Master has come up far enough that we can terminate it without creating a zombie.
-        final long result = Waiter.waitFor(htu.getConfiguration(), timeout, 500, () -> {
-          final Configuration conf = createResponsiveZkConfig(htu.getConfiguration());
-          LOG.debug("Attempting to establish connection.");
-          final CompletableFuture<AsyncConnection> connFuture =
-            ConnectionFactory.createAsyncConnection(conf);
-          try (final AsyncConnection conn = connFuture.join()) {
-            LOG.debug("Sending shutdown RPC.");
-            try {
-              conn.getAdmin().shutdown().join();
-              LOG.debug("Shutdown RPC sent.");
-              return true;
-            } catch (CompletionException e) {
-              LOG.debug("Failure sending shutdown RPC.");
-            }
-          } catch (IOException|CompletionException e) {
-            LOG.debug("Failed to establish connection.");
-          } catch (Throwable e) {
-            LOG.info("Something unexpected happened.", e);
-          }
-          return false;
-        });
-        assertNotEquals("Failed to issue shutdown RPC after " + Duration.ofMillis(timeout),
-          -1, result);
-      });
-
       masterThread.start();
-      shutdownFuture.join();
+      // Switching to master registry exacerbated a race in the master bootstrap that can result
+      // in a lost shutdown command (HBASE-8422, HBASE-23836). The race is essentially because
+      // the server manager in HMaster is not initialized by the time shutdown() RPC (below) is
+      // made to the master. The suspected reason as to why it was uncommon before HBASE-18095
+      // is because the connection creation with ZK registry is so slow that by then the server
+      // manager is usually init'ed in time for the RPC to be made. For now, adding an explicit
+      // wait() in the test, waiting for the server manager to become available.
+      final long timeout = TimeUnit.MINUTES.toMillis(10);
+      assertNotEquals("Timeout waiting for server manager to become available.",
+        -1, Waiter.waitFor(htu.getConfiguration(), timeout,
+          () -> masterThread.getMaster().getServerManager() != null));
+      htu.getConnection().getAdmin().shutdown();
 
 Review comment:
   Thanks for taking a look, @bharathv.
   
   Yeah, something is up. What happens inline in the RPC request looks minimal (it just sets flags), but something is not right. I filed HBASE-24070 to look into this, as you suggest above.
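
For readers following the diff above, here is a minimal, self-contained sketch of the wait-then-act pattern the patched test relies on: poll until a component is initialized, treat -1 as the timeout signal, and only then issue the command. It is not HBase code. `WaitForSketch`, its `waitFor` helper, and the `serverManager` stand-in are hypothetical names invented for illustration; the helper only mimics the "-1 means timed out" contract that the test asserts against.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names come from HBase.
final class WaitForSketch {

  // Polls the condition every intervalMs until it holds or timeoutMs elapses.
  // Returns the elapsed milliseconds on success, or -1 on timeout or interrupt.
  static long waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    final long start = System.nanoTime();
    final long deadline = start + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (true) {
      if (condition.getAsBoolean()) {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      }
      if (System.nanoTime() - deadline >= 0) {
        return -1;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and report failure to the caller.
        Thread.currentThread().interrupt();
        return -1;
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for a component that is null until a bootstrap thread sets it.
    final AtomicReference<Object> serverManager = new AtomicReference<>();

    final Thread bootstrap = new Thread(() -> {
      try {
        Thread.sleep(200); // simulated initialization delay
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      serverManager.set(new Object());
    });
    bootstrap.start();

    // Wait for initialization before acting, so the command is not lost.
    final long waited = waitFor(5_000, 50, () -> serverManager.get() != null);
    if (waited == -1) {
      throw new IllegalStateException("Timeout waiting for server manager.");
    }
    System.out.println("Server manager available after ~" + waited
      + " ms; a shutdown request could be sent now.");
  }
}
```

The sketch only shows why waiting on the initialization condition closes the window in which an early request could be dropped. It says nothing about the remaining oddity tracked in HBASE-24070.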
