You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/06/22 17:27:06 UTC

[GitHub] [hbase] saintstack commented on a change in pull request #1924: HBASE-24552 Replica region needs to check if primary region directory…

saintstack commented on a change in pull request #1924:
URL: https://github.com/apache/hbase/pull/1924#discussion_r443714241



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
##########
@@ -338,6 +345,35 @@ protected Flow executeFromState(MasterProcedureEnv env, RegionStateTransitionSta
     try {
       switch (state) {
         case REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE:
+
+          // Need to do some sanity check for replica region, if the region does not exist at file
+          // system, do not try to assign the replica region, log error and return.
+          // Do not rely on master's in-memory state, primary region got its own life, it can be
+          // closed, offline for various reasons.

Review comment:
       This is a mistake then? Master is the authority. If it can't make a call on read replicas, then read replicas are not done right?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java
##########
@@ -338,6 +345,35 @@ protected Flow executeFromState(MasterProcedureEnv env, RegionStateTransitionSta
     try {
       switch (state) {
         case REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE:
+
+          // Need to do some sanity check for replica region, if the region does not exist at file
+          // system, do not try to assign the replica region, log error and return.
+          // Do not rely on master's in-memory state, primary region got its own life, it can be
+          // closed, offline for various reasons.
+          if (!RegionReplicaUtil.isDefaultReplica(regionNode.getRegionInfo())) {
+            MasterFileSystem mfs = env.getMasterServices().getMasterFileSystem();
+            RegionInfo replicaRI = regionNode.getRegionInfo();
+            Path tableDir = CommonFSUtils.getTableDir(mfs.getRootDir(), regionNode.getTable());
+            Path regionDir = FSUtils.getRegionDirFromTableDir(tableDir, replicaRI);
+            FileSystem fs = mfs.getFileSystem();
+            // Check if primary region directory exists
+            if (!fs.exists(regionDir)) {
+              LOG.error(
+                "Cannot assign replica region {} because its primary region {} does not exist"
+                  + " at Filesystem", replicaRI,
+                ServerRegionReplicaUtil.getRegionInfoForDefaultReplica(replicaRI));
+              return Flow.NO_MORE_STATE;
+            }
+            // check if .regionInfo exists in primary region
+            Path regionInfoFile = new Path(regionDir, HRegionFileSystem.REGION_INFO_FILE);
+            if (!fs.exists(regionInfoFile)) {
+              LOG.error(
+                "Cannot assign replica region {} because region info file does not exist in its"
+                  + " primary region {}", replicaRI,
+                ServerRegionReplicaUtil.getRegionInfoForDefaultReplica(replicaRI));
+              return Flow.NO_MORE_STATE;
+            }
+          }

Review comment:
       Ouch.
   
   We need follow-on to fix read replicas so Master is authority?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org