Posted to common-issues@hadoop.apache.org by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org> on 2012/03/13 00:24:39 UTC

[jira] [Commented] (HADOOP-8163) Improve ActiveStandbyElector to provide hooks for fencing old active

    [ https://issues.apache.org/jira/browse/HADOOP-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228043#comment-13228043 ] 

Todd Lipcon commented on HADOOP-8163:
-------------------------------------

The design here is pretty simple:

*In ZK*:
- add an additional znode (the "info" znode) next to the "lock" znode. The "info" znode is a PERSISTENT node holding the same data as the "lock" znode.

*Upon successfully acquiring the lock znode:*
- check if there exists an "info" znode
-- if so, the previous active did not exit cleanly. Call an application-provided fencing hook, providing the data from the "info" znode
-- if the fencing hook succeeds, delete the "info" znode
- create an "info" znode with one's own app data
- proceed to call the {{becomeActive}} API on the app
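
The steps above can be sketched roughly as follows. This is only an illustration against an in-memory map, not the real ZooKeeper client or the actual ActiveStandbyElector code: the names {{ElectorSketch}}, {{FakeZk}} and {{onLockAcquired}} are hypothetical, and the fencing hook is modelled as a plain {{Predicate}}. The behaviour when fencing fails is not specified above; the sketch assumes the node simply does not become active and leaves the old "info" znode in place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

class ElectorSketch {
    /** Hypothetical in-memory stand-in for the ZK parent directory (not the ZooKeeper API). */
    static final class FakeZk {
        final Map<String, String> nodes = new HashMap<>();
    }

    static final String INFO = "info"; // would be a PERSISTENT znode in real ZK

    /**
     * Runs the steps that follow a successful acquisition of the lock znode.
     * Returns true if the caller may go on to call becomeActive on the app.
     */
    static boolean onLockAcquired(FakeZk zk, String myData, Predicate<String> fencingHook) {
        String oldInfo = zk.nodes.get(INFO);
        if (oldInfo != null) {
            // A leftover "info" znode means the previous active did not exit cleanly.
            if (!fencingHook.test(oldInfo)) {
                // Assumption: on fencing failure, keep the old info and do not become active.
                return false;
            }
            zk.nodes.remove(INFO);
        }
        // Record our own app data before becoming active, so a successor can fence us.
        zk.nodes.put(INFO, myData);
        return true;
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        // First election: no "info" znode exists, so the hook is never invoked.
        boolean first = onLockAcquired(zk, "nn1", old -> true);
        // nn1 "crashes" (its info znode stays behind); nn2 wins and must fence nn1.
        boolean second = onLockAcquired(zk, "nn2", old -> {
            System.out.println("fencing " + old);
            return true;
        });
        System.out.println(first + " " + second + " info=" + zk.nodes.get(INFO));
    }
}
```

The point of the ordering is that the "info" znode is written before {{becomeActive}} is called, so there is no window in which a node is active without a record that a successor can use to fence it.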

*Upon crashing:*
- the ephemeral node disappears
- by the order of events above, if the application has become active, then it will have created an "info" znode so whoever recovers knows to fence it

*Upon graceful exit:*
- first transition out of "active" mode (e.g. shut down the NN)
- then delete the "info" node
- then close the session (deleting the ephemeral node)
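
To illustrate why the crash and graceful-exit paths leave different state behind, here is a small sketch of the two. It again uses an in-memory map rather than a real ZK session, and {{ExitSketch}}, {{gracefulExit}}, {{crash}} and {{successorMustFence}} are hypothetical names chosen for this example only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExitSketch {
    // Hypothetical stand-in for the znodes owned by one active node.
    final Map<String, String> znodes = new HashMap<>();
    // Records the order in which the exit steps ran.
    final List<String> log = new ArrayList<>();

    ExitSketch(String appData) {
        // State of a node that is currently active: it holds the (ephemeral)
        // lock znode and has written the (persistent) info znode.
        znodes.put("lock", appData);
        znodes.put("info", appData);
    }

    /** Graceful exit: leave active mode, then delete info, then close the session. */
    void gracefulExit(Runnable transitionOutOfActive) {
        transitionOutOfActive.run();
        log.add("left-active");
        znodes.remove("info");
        log.add("deleted-info");
        closeSession();
    }

    /** Crash: the session dies, so only the ephemeral lock znode disappears. */
    void crash() {
        closeSession();
    }

    private void closeSession() {
        znodes.remove("lock");
        log.add("session-closed");
    }

    /** A successor has to fence exactly when an info znode was left behind. */
    boolean successorMustFence() {
        return znodes.containsKey("info");
    }

    public static void main(String[] args) {
        ExitSketch clean = new ExitSketch("nn1");
        clean.gracefulExit(() -> System.out.println("shutting down NN"));
        ExitSketch crashed = new ExitSketch("nn1");
        crashed.crash();
        System.out.println(clean.successorMustFence() + " " + crashed.successorMustFence());
    }
}
```

Deleting the "info" znode only after leaving active mode means that a failure partway through a graceful exit still looks like a crash to the next active, which errs on the side of fencing.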


                
> Improve ActiveStandbyElector to provide hooks for fencing old active
> --------------------------------------------------------------------
>
>                 Key: HADOOP-8163
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8163
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 0.24.0, 0.23.3
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>
> When a new node becomes active in an HA setup, it may sometimes have to take fencing actions against the node that was formerly active. This JIRA extends the ActiveStandbyElector to add an extra non-ephemeral node to the ZK directory, which acts as a second copy of the active node's information. Then, if the active loses its ZK session, the next active to be elected can easily locate the unfenced node and take the appropriate actions.
