
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2022/07/20 03:20:16 UTC

[GitHub] [hbase] Apache9 commented on a diff in pull request #4633: HBASE-27223 Avoid data inconsistent between primary and secondary rep…

Apache9 commented on code in PR #4633:
URL: https://github.com/apache/hbase/pull/4633#discussion_r925129616


##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java:
##########
@@ -142,7 +142,7 @@
     "hbase.regionserver.wal.slowsync.roll.interval.ms";
   protected static final int DEFAULT_SLOW_SYNC_ROLL_INTERVAL_MS = 60 * 1000; // in ms, 1 minute
 
-  protected static final String WAL_SYNC_TIMEOUT_MS = "hbase.regionserver.wal.sync.timeout";
+  public static final String WAL_SYNC_TIMEOUT_MS = "hbase.regionserver.wal.sync.timeout";

Review Comment:
   This is a missing part in our design. Normally, if we get a timeout exception here, the only correct response is to abort the region server, because WAL sync is designed to either succeed or die; there is no 'failure' outcome. It has usually not been a big deal because we set a very large default value here, 5 minutes, and the WAL system will usually abort the region server itself if it cannot finish the sync within 5 minutes...
   
   So I think we should throw a special IOException to the upper layer, and when the upper layer gets this exception, it aborts the region server.
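
To make the suggestion concrete, here is a minimal sketch of the "special IOException, then abort" pattern. It is not the actual HBase code: `SyncTimeoutIOException`, `Abortable`, `blockOnSync` and `write` are hypothetical names used only for illustration, and a plain `CompletableFuture` stands in for the WAL sync future.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class WalSyncTimeoutSketch {

  /** Hypothetical marker exception: the sync outcome is unknown, which is not a plain failure. */
  static class SyncTimeoutIOException extends IOException {
    SyncTimeoutIOException(String msg, Throwable cause) {
      super(msg, cause);
    }
  }

  /** Hypothetical stand-in for the region server's abort hook. */
  interface Abortable {
    void abort(String why, Throwable cause);
  }

  /** WAL layer: wait for the sync and translate a timeout into the marker exception. */
  static void blockOnSync(CompletableFuture<Long> syncFuture, long timeoutMs) throws IOException {
    try {
      syncFuture.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // The sync may still complete later, so the caller cannot treat this as "not written".
      throw new SyncTimeoutIOException("WAL sync did not finish within " + timeoutMs + " ms", e);
    } catch (ExecutionException e) {
      throw new IOException(e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      InterruptedIOException iioe = new InterruptedIOException("interrupted while waiting for sync");
      iioe.initCause(e);
      throw iioe;
    }
  }

  /** Upper layer: abort the server on the marker exception, then rethrow it. */
  static void write(CompletableFuture<Long> syncFuture, long timeoutMs, Abortable server)
      throws IOException {
    try {
      blockOnSync(syncFuture, timeoutMs);
    } catch (SyncTimeoutIOException e) {
      server.abort("WAL sync timed out", e);
      throw e;
    }
  }
}
```

The point of a dedicated exception type is that the upper layer can tell "outcome unknown" apart from an ordinary IOException and abort only in the former case.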


