Posted to issues@hbase.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2019/02/01 14:14:00 UTC

[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

    [ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758339#comment-16758339 ] 

Hudson commented on HBASE-21325:
--------------------------------

Results for branch branch-2.0
	[build #1306 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1306/]: (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1306//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1306//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1306//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Force to terminate regionserver when abort hang in somewhere
> ------------------------------------------------------------
>
>                 Key: HBASE-21325
>                 URL: https://issues.apache.org/jira/browse/HBASE-21325
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>            Reporter: Duo Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>             Fix For: 3.0.0, 1.5.0, 2.2.0
>
>         Attachments: HBASE-21325.master.001.patch, HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transition the remote cluster to DA while the local cluster is still in A, the region server hangs on shutdown. Because the fsOk flag only tests the local cluster (which is reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken (the remote WAL directory is gone), the close will never succeed. This leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound on the wait time in the waitOnAllRegionsToClose method.
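
The upper bound proposed in the description above can be illustrated with a generic deadline-based polling loop. This is only a minimal sketch and not the code from the attached patches: the class BoundedWait, the method waitUntil, and the parameters timeoutMs and pollMs are hypothetical names chosen for illustration, and the condition stands in for "all regions are closed".

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: waits for a condition with an upper bound.
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Polls {@code done} until it returns true or {@code timeoutMs} elapses.
     * Returns true if the condition was met, false on timeout or interrupt.
     */
    static boolean waitUntil(BooleanSupplier done, long timeoutMs, long pollMs) {
        // Use the monotonic clock so wall-clock adjustments cannot extend the wait.
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!done.getAsBoolean()) {
            long remainingNs = deadline - System.nanoTime();
            if (remainingNs <= 0) {
                return false; // upper bound reached; the caller decides what to do next
            }
            try {
                // Never sleep past the deadline, and sleep at least 1 ms per round.
                Thread.sleep(Math.min(pollMs, Math.max(1L, remainingNs / 1_000_000L)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
        return true;
    }
}
```

A caller could invoke BoundedWait.waitUntil(() -> regionsClosed(), timeoutMs, pollMs), where regionsClosed() is again a placeholder, and treat a false result as the signal to stop waiting and terminate forcibly instead of blocking forever.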



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)