Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2011/07/22 18:35:58 UTC
[jira] [Commented] (HBASE-3801) Backup Master blocked when the HMaster Node Fail.
[ https://issues.apache.org/jira/browse/HBASE-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069607#comment-13069607 ]
Ted Yu commented on HBASE-3801:
-------------------------------
Normally a patch should carry the JIRA number in its filename.
The patch changes the semantics of how ActiveMasterManager handles watcher.masterAddressZNode.
Consequently, the following assertion from TestActiveMasterManager would fail:
{code}
assertFalse(activeMasterManager.clusterHasActiveMaster.get());
{code}
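To make the invariant behind that assertion concrete, here is a minimal, self-contained sketch of the pattern. It is not the HBase implementation: ActiveMasterSketch, nodeCreated(), nodeDeleted() and blockUntilNoActiveMaster() are hypothetical names chosen for illustration, and only the field name clusterHasActiveMaster is taken from the test. It assumes that deleting the master address znode should clear the flag and wake any waiting backup master.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for ActiveMasterManager; this is NOT the HBase class.
class ActiveMasterSketch {
    // Same name as the field asserted on in TestActiveMasterManager.
    final AtomicBoolean clusterHasActiveMaster = new AtomicBoolean(false);

    // Assumed callback: the master address znode was created.
    void nodeCreated() {
        clusterHasActiveMaster.set(true);
    }

    // Assumed callback: the master address znode was deleted.
    void nodeDeleted() {
        synchronized (clusterHasActiveMaster) {
            clusterHasActiveMaster.set(false);
            // Wake any backup master waiting in blockUntilNoActiveMaster().
            clusterHasActiveMaster.notifyAll();
        }
    }

    // A backup master blocks here while an active master is registered.
    void blockUntilNoActiveMaster() throws InterruptedException {
        synchronized (clusterHasActiveMaster) {
            // The flag is re-checked under the lock, so a wake-up cannot be lost.
            while (clusterHasActiveMaster.get()) {
                clusterHasActiveMaster.wait();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ActiveMasterSketch manager = new ActiveMasterSketch();
        manager.nodeCreated(); // an active master is registered

        Thread backup = new Thread(() -> {
            try {
                manager.blockUntilNoActiveMaster();
                System.out.println("backup master proceeding");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        backup.start();

        manager.nodeDeleted(); // simulated failure of the active master
        backup.join();
        System.out.println("clusterHasActiveMaster = "
            + manager.clusterHasActiveMaster.get());
    }
}
```

In this sketch the flag is false once the znode is gone, which is the state the assertion expects. A patch that changes when the flag is set or cleared needs to either preserve that behavior or update the test along with it.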
Please produce a complete patch, run it through the following unit tests, and document your experience testing failover on a real cluster:
{code}
src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
src/test/java/org/apache/hadoop/hbase/master/TestMaster.java
src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java
src/test/java/org/apache/hadoop/hbase/master/TestMasterStatusServlet.java
src/test/java/org/apache/hadoop/hbase/master/TestMasterTransitions.java
src/test/java/org/apache/hadoop/hbase/master/TestMasterRestartAfterDisablingTable.java
src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java
{code}
> Backup Master blocked when the HMaster Node Fail.
> -------------------------------------------------
>
> Key: HBASE-3801
> URL: https://issues.apache.org/jira/browse/HBASE-3801
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.2, 0.90.3
> Environment: 1 HMaster
> 1 HMaster -backup
> 6 HRegionServer
> Reporter: Aaron Guo
> Attachments: patch.txt
>
>
> When the HMaster crashes, the backup HMaster blocks waiting for the ZK notification.
> The backup HMaster's thread stack is:
> "master-hp1:60000" prio=10 tid=0x00000000484c6800 nid=0x4b56 waiting on condition [0x0000000040209000]
> java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at org.apache.hadoop.hbase.master.HMaster.stallIfBackupMaster(HMaster.java:251)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:279)
> Locked ownable synchronizers:
> - None