Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/08/10 23:21:17 UTC

[GitHub] [hbase] joshelser commented on a change in pull request #2113: HBASE-24286: HMaster won't become healthy after cloning or crea…

joshelser commented on a change in pull request #2113:
URL: https://github.com/apache/hbase/pull/2113#discussion_r468235332



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
##########
@@ -1441,6 +1441,35 @@ public void processOfflineRegions() {
     }
   }
 
+  /**
+   * Create assign procedure for non-offline regions of enabled table that are assigned
+   * to `unknown` servers after hbase:meta is online.
+   *
+   * This is a special case when WAL directory, SCP WALs and ZK data are cleared,
+   * cluster restarts with hbase:meta table and other tables with storefiles.
+   */
+  public void processRegionsOnUnknownServers() {
+    List<RegionInfo> regionsOnUnknownServers = regionStates.getRegionStates().stream()
+      .filter(s -> !s.isOffline())
+      .filter(s -> isTableEnabled(s.getRegion().getTable()))
+      .filter(s -> !regionStates.isRegionInTransition(s.getRegion()))
+      .filter(s -> {
+        ServerName serverName = regionStates.getRegionServerOfRegion(s.getRegion());
+        if (serverName == null) {
+          return false;
+        }
+        return master.getServerManager().isServerKnownAndOnline(serverName)
+          .equals(ServerManager.ServerLiveState.UNKNOWN);

Review comment:
       Collapse these filters down into one predicate method so we don't end up running four separate filter stages over a list of (potentially) a lot of regions.
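
A minimal sketch of what the collapsed form could look like. The `Region` class and the `knownServers` set are hypothetical stand-ins for `RegionStateNode` and `ServerManager`, not the HBase API; only the shape of the single predicate is the point. Note that `java.util.stream` pipelines are lazy and pass each element through all chained `filter` stages in a single traversal, so the gain here is mainly readability and one named, testable condition.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class UnknownServerFilterSketch {

  // Hypothetical stand-in for a region state entry; not the HBase class.
  static final class Region {
    final String name;
    final boolean offline;
    final boolean tableEnabled;
    final boolean inTransition;
    final String server; // null when no server is recorded for the region

    Region(String name, boolean offline, boolean tableEnabled,
           boolean inTransition, String server) {
      this.name = name;
      this.offline = offline;
      this.tableEnabled = tableEnabled;
      this.inTransition = inTransition;
      this.server = server;
    }
  }

  // One predicate that combines the four checks from the diff.
  // knownServers is an assumed substitute for the ServerManager lookup.
  static boolean isOnUnknownServer(Region r, Set<String> knownServers) {
    return !r.offline
        && r.tableEnabled
        && !r.inTransition
        && r.server != null
        && !knownServers.contains(r.server);
  }

  // Returns the names of regions that would need an assign procedure.
  static List<String> regionsOnUnknownServers(List<Region> regions,
                                              Set<String> knownServers) {
    return regions.stream()
        .filter(r -> isOnUnknownServer(r, knownServers))
        .map(r -> r.name)
        .collect(Collectors.toList());
  }
}
```

The `&&` chain short-circuits in the same order as the original filters, so cheap checks still run before the server lookup.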

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java
##########
@@ -71,7 +71,11 @@ private static void writeFsLayout(Path rootDir, Configuration conf) throws IOExc
     LOG.info("BOOTSTRAP: creating hbase:meta region");
     FileSystem fs = rootDir.getFileSystem(conf);
     Path tableDir = CommonFSUtils.getTableDir(rootDir, TableName.META_TABLE_NAME);
-    if (fs.exists(tableDir) && !fs.delete(tableDir, true)) {
+    boolean removeMeta = conf.getBoolean(HConstants.REMOVE_META_ON_RESTART,

Review comment:
       > we should not delete the META dir.
   
   Sorry for harping on an implementation detail: let's sideline meta rather than delete it, please :).
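
To illustrate what "sideline" means here, a rough sketch using `java.nio.file` on a local directory: the directory is renamed to a timestamped sibling instead of being deleted, so its contents stay recoverable. The class name, method name, and the `.sidelined.` suffix are made up for illustration; in the actual patch the equivalent would presumably go through Hadoop's `FileSystem.rename`, which is an assumption on my part and not something this PR shows.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SidelineSketch {

  // Renames dir to "<name>.sidelined.<timestamp>" next to the original.
  // Returns the new path, or null if dir does not exist.
  // Throws FileAlreadyExistsException if the target name is already taken.
  static Path sideline(Path dir, long timestampMillis) throws IOException {
    if (!Files.exists(dir)) {
      return null;
    }
    Path target = dir.resolveSibling(
        dir.getFileName() + ".sidelined." + timestampMillis);
    // A same-directory move is a rename, so the contents are preserved.
    return Files.move(dir, target);
  }
}
```

Keeping the old directory around means a wrong guess about the "recreate" case costs a rename back, not the meta data.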
   
   > Can we do this part alone in a sub task and provide a patch pls? This is a very key part.
   
   This seems like a very reasonable starting point. As Anoop points out, if we can guarantee that this case only triggers when we are certain we're in the "cloud recreate" situation, that will bring a lot of confidence.
   
   > I will try to send another JIRA and PR out in a few days and refer to the conversation we discussed here.
   
   Lazy Josh: did you get a new Jira created already for this?



