
Posted to issues@hbase.apache.org by "junhua yang (Commented) (JIRA)" <ji...@apache.org> on 2012/02/08 10:46:59 UTC

[jira] [Commented] (HBASE-5075) regionserver crashed and failover

    [ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203425#comment-13203425 ] 

junhua yang commented on HBASE-5075:
------------------------------------

Hi,
I think it is very important to shorten the recovery time.
Waiting for a regionserver to recover can currently take a very long time, which is not acceptable for an online service.
Clients throw many errors in the meantime, and that affects customers.

So could you share your solution? And @stack, what do you think about the current HBase failover solution?

Do you have any plans to improve it?

                
> regionserver crashed and failover
> ---------------------------------
>
>                 Key: HBASE-5075
>                 URL: https://issues.apache.org/jira/browse/HBASE-5075
>             Project: HBase
>          Issue Type: Improvement
>          Components: monitoring, regionserver, replication, zookeeper
>    Affects Versions: 0.92.1
>            Reporter: 代志远
>
> When a regionserver crashes, it takes too long to notify the HMaster. Once the HMaster knows the regionserver is down, it then takes a long time to obtain the HLog's lease.
> HBase is an online DB, so availability is very important.
> I have an idea to improve availability: a monitor node checks the regionserver's pid. If the pid no longer exists, I consider the RS down, delete its znode, and force-close the HLog file.
> So the check period could be about 100ms.
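
The pid-monitoring idea in the description above can be sketched roughly as follows. This is only an illustration, not HBase code: `pid_alive`, `watch_regionserver`, and the `on_dead` callback are hypothetical names, and the callback stands in for the "delete the znode and force-close the HLog file" step, which is not implemented here. It assumes a POSIX host where the monitor runs on the same machine as the regionserver.

```python
import os
import time
from typing import Callable, Optional


def pid_alive(pid: int) -> bool:
    """Return True if a process with this pid currently exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def watch_regionserver(
    pid: int,
    on_dead: Callable[[], None],
    interval_s: float = 0.1,  # the 100ms period suggested in the description
    max_checks: Optional[int] = None,  # None means poll forever
    alive: Callable[[int], bool] = pid_alive,
) -> bool:
    """Poll the pid and run on_dead() once when it disappears.

    on_dead is a hypothetical hook for the cleanup described in the issue
    (deleting the znode and force-closing the HLog file).
    Returns True if the process was seen dead, False if max_checks ran out.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if not alive(pid):
            on_dead()
            return True
        checks += 1
        time.sleep(interval_s)
    return False
```

A pid check of this kind only detects a process that has exited. It would not notice a regionserver that is hung but still running, and it could be misled if the pid is reused by another process, so it is best read as a complement to the existing ZooKeeper session timeout rather than a replacement for it.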

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira