Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2013/07/10 02:37:49 UTC

[jira] [Created] (HBASE-8910) TestReplicationDisableInactivePeer fails if the master we shutdown comes back to life

Jean-Daniel Cryans created HBASE-8910:
-----------------------------------------

             Summary: TestReplicationDisableInactivePeer fails if the master we shutdown comes back to life
                 Key: HBASE-8910
                 URL: https://issues.apache.org/jira/browse/HBASE-8910
             Project: HBase
          Issue Type: Bug
            Reporter: Jean-Daniel Cryans
            Assignee: Jean-Daniel Cryans
             Fix For: 0.98.0, 0.95.2, 0.94.10


Here's a case where TestReplicationDisableInactivePeer fails while re-starting the second master:

http://54.241.6.143/job/HBase-0.95-Hadoop-2/574/org.apache.hbase$hbase-server/testReport/junit/org.apache.hadoop.hbase.replication/TestReplicationDisableInactivePeer/testDisableInactivePeer/

The reason is that when we first shut down the master, it comes back to life thinking it just lost its session:

{noformat}
2013-07-07 04:27:03,989 FATAL [pool-1-thread-1-EventThread] master.HMaster(2062): Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
2013-07-07 04:27:03,989 INFO  [pool-1-thread-1-EventThread] master.HMaster(2155): Primary Master trying to recover from ZooKeeper session expiry.
{noformat}

After that, it tries to assign .META., which fails since the region servers are down.

One way I think we can prevent this is to skip session recovery if the master is already stopping.
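
A minimal sketch of that idea, not the actual HMaster code: the class, field, and method names below (MasterSketch, stopped, onSessionExpired) are hypothetical and only illustrate checking a "stopping" flag before attempting recovery.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a master process; this is NOT the real HMaster class.
class MasterSketch {
    // Set once a shutdown has been requested.
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    // Counts how many times session recovery was attempted.
    private int recoveryAttempts = 0;

    /** Called when the test (or an operator) asks the master to stop. */
    void stop() {
        stopped.set(true);
    }

    /**
     * Called when the ZooKeeper session is reported as expired.
     * Returns true if recovery was attempted, false if it was skipped.
     */
    boolean onSessionExpired() {
        if (stopped.get()) {
            // A deliberate shutdown can look like a lost session;
            // do not come back to life in that case.
            return false;
        }
        // Placeholder for the real recovery logic.
        recoveryAttempts++;
        return true;
    }

    int getRecoveryAttempts() {
        return recoveryAttempts;
    }
}
```

With a guard like this, a master that was asked to stop would stay down instead of trying to reassign .META. against region servers that are no longer there.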
