Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2008/10/16 07:36:44 UTC

[jira] Commented: (HBASE-932) Regionserver restart

    [ https://issues.apache.org/jira/browse/HBASE-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640073#action_12640073 ] 

Andrew Purtell commented on HBASE-932:
--------------------------------------

Our service monitoring and recovery framework detects regionserver shutdowns and restarts the regionservers. This seems to work pretty well when the fatal fault was due to, e.g., a transient DFS problem, perhaps related to load. I suggest there should be a fixed restart limit and some backoff if a restart is not successful.
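
To make the suggestion concrete, here is a rough sketch of a fixed restart limit combined with exponential backoff. It is not HBase code and not what our framework actually runs; the class name RestartPolicy, its parameters, and the Callable used to stand in for "start the regionserver" are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical sketch: retry a start action at most maxRestarts times,
 * doubling the wait before each attempt up to a cap.
 */
class RestartPolicy {
    private final int maxRestarts;      // fixed restart limit
    private final long initialDelayMs;  // wait before the first attempt
    private final long maxDelayMs;      // cap on the backoff delay

    RestartPolicy(int maxRestarts, long initialDelayMs, long maxDelayMs) {
        this.maxRestarts = maxRestarts;
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    /** Delay before the given 0-based attempt: initialDelayMs * 2^attempt, capped at maxDelayMs. */
    long delayForAttempt(int attempt) {
        long delay = initialDelayMs;
        for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /**
     * Runs startAction until it reports success or the restart limit is hit.
     * Returns true on success, false if every attempt failed or the thread
     * was interrupted while backing off.
     */
    boolean superviseRestart(Callable<Boolean> startAction) {
        for (int attempt = 0; attempt < maxRestarts; attempt++) {
            try {
                Thread.sleep(delayForAttempt(attempt));
                if (startAction.call()) {
                    return true;
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop retrying.
                Thread.currentThread().interrupt();
                return false;
            } catch (Exception e) {
                // Treat any other failure as an unsuccessful restart and back off again.
            }
        }
        return false; // limit reached; leave the process down for an operator to inspect
    }
}
```

With something like new RestartPolicy(5, 1000, 60000), a regionserver that keeps failing would be retried after 1s, 2s, 4s, 8s and 16s and then left down, so a persistent HDFS problem does not turn into a tight restart loop.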

> Regionserver restart
> --------------------
>
>                 Key: HBASE-932
>                 URL: https://issues.apache.org/jira/browse/HBASE-932
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>
> If we drop a flush or we fail to close a write-ahead log, we currently shut down the regionserver (we usually fail because of HDFS).  Rather than shutting themselves down, how about regionservers restart?  A restart, at least in the HBASE-930 case, might fix the issue by shaking the DFSClient so it comes to its senses again.  Even if HDFS is bad, it'll come around eventually.  The HRS restarting itself plus the HBASE-926 fix will make for fast recovery.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.