Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/06/16 11:33:47 UTC

[jira] [Updated] (HBASE-3995) HBASE-3946 broke TestMasterFailover

     [ https://issues.apache.org/jira/browse/HBASE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3995:
-------------------------

    Attachment: am.txt

Patch that passes the list of dead servers down to the place where we process what is up in ZooKeeper at the time a new master joins a cluster. The dead servers list can be used to figure out whether a region in transition (RIT) came from a dead server. If it did, we know there is no point in waiting on a CLOSING to complete; or, if the table is disabled, the OPEN handling should go back and try to close the region that just OPENED on a server that just died.
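
The decision described above might be sketched roughly as follows. This is not the code in am.txt: the class, enum, and method names (DeadServerRitSketch, RitState, Action, decide) are hypothetical and do not exist in HBase, and server names are plain strings for illustration. It only shows how a set of dead servers could drive the choice, assuming each RIT records the server it came from.

```java
import java.util.Collections;
import java.util.Set;

public class DeadServerRitSketch {
    // Hypothetical, simplified view of the zk states mentioned in this issue.
    enum RitState { CLOSING, OPENED, OTHER }

    // Hypothetical outcomes; the real patch may act quite differently.
    enum Action { WAIT_FOR_CLOSE, SKIP_WAIT_FOR_CLOSE, RETRY_CLOSE, DEFAULT_HANDLING }

    // Decide what a newly joined master does with one region in transition.
    static Action decide(RitState state, String server,
                         boolean tableDisabled, Set<String> deadServers) {
        boolean fromDeadServer = deadServers.contains(server);
        switch (state) {
            case CLOSING:
                // A dead server will never finish the close, so do not wait on it.
                return fromDeadServer ? Action.SKIP_WAIT_FOR_CLOSE : Action.WAIT_FOR_CLOSE;
            case OPENED:
                // Disabled table, region OPENED on a server that just died:
                // go back and try to close the region.
                if (fromDeadServer && tableDisabled) {
                    return Action.RETRY_CLOSE;
                }
                return Action.DEFAULT_HANDLING;
            default:
                return Action.DEFAULT_HANDLING;
        }
    }

    public static void main(String[] args) {
        Set<String> dead = Collections.singleton("rs1");
        System.out.println(decide(RitState.CLOSING, "rs1", true, dead));
        System.out.println(decide(RitState.CLOSING, "rs2", true, dead));
        System.out.println(decide(RitState.OPENED, "rs1", true, dead));
    }
}
```

The point of passing the dead servers down is that the same zk state can then be treated differently depending on whether its originating server is still alive.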

> HBASE-3946 broke TestMasterFailover
> -----------------------------------
>
>                 Key: HBASE-3995
>                 URL: https://issues.apache.org/jira/browse/HBASE-3995
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.92.0
>
>         Attachments: am.txt
>
>
> TestMasterFailover is all about a new master coming up on an existing cluster.  Before HBASE-3946, a new master joining a cluster would, when processing any dead servers, assign all regions found on the dead server, even if they were split parents.  We don't want that.
> But TestMasterFailover mocks up some pretty interesting conditions.  The one we were failing on was that, while the master was offline, we'd manually add a region to zk that was in CLOSING state.  We'd then go and disable the table up in zk (while the master was offline).  Finally, we'd kill the server that was supposed to be hosting the region from the disabled table in CLOSING state.  Then we'd have the master join the cluster.  It had to figure it out.
> Before HBASE-3946, we'd just force offline every region that had been on the dead server.  This would cause all of them to be assigned; only, on assign, regions from disabled tables are skipped, so it all "worked" (except that it would online the parent of a split, should there be one).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira