Posted to common-issues@hadoop.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2013/01/17 17:16:18 UTC

[jira] [Updated] (HADOOP-9220) Unnecessary transition to standby in ActiveStandbyElector

     [ https://issues.apache.org/jira/browse/HADOOP-9220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-9220:
------------------------------

    Attachment: HADOOP-9220.patch

This behaviour occurs because multiple watchers can be registered for a given ZK client in ActiveStandbyElector. (The monitorLockNodeAsync() method creates a new watcher object for the existing ZK client.)

This can cause multiple invocations of joinElectionInternal() for a single watch event, each of which will make a call to create the lock znode. The first call will cause a transition to active, while subsequent ones will cause a transition to standby (in the isNodeExists clause of the processResult() method). In a manual failover scenario the node will still transition to active again, since the other node has ceded from the election for 10s, but it's still an unnecessary transition that could be eliminated.
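To make the mechanism concrete, here is a rough, self-contained sketch. It is not the ActiveStandbyElector code and not necessarily what the attached patch does. FakeZkClient, ElectorSketch and every method name in it are hypothetical stand-ins. The only ZooKeeper behaviour it assumes is that each distinct watcher object registered for a node is notified once per event, and that registering the same object twice does not duplicate the notification.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DuplicateWatcherSketch {

    /** Hypothetical stand-in for a ZooKeeper watcher callback. */
    interface Watcher {
        void process(String event);
    }

    /** Toy client: notifies each distinct registered watcher object once per event. */
    static class FakeZkClient {
        private final Set<Watcher> watchers = new LinkedHashSet<>();

        void watch(Watcher w) {
            watchers.add(w); // the same object added twice is stored only once
        }

        void fire(String event) {
            // Watches are one-shot: clear them before delivering the event.
            List<Watcher> toNotify = new ArrayList<>(watchers);
            watchers.clear();
            for (Watcher w : toNotify) {
                w.process(event);
            }
        }
    }

    /** Simplified elector: only models the lock-create attempt and its outcome. */
    static class ElectorSketch {
        final FakeZkClient zk = new FakeZkClient();
        final List<String> transitions = new ArrayList<>();
        private final boolean reuseWatcher;
        private boolean lockHeld = false;
        private final Watcher shared = new Watcher() {
            @Override
            public void process(String event) {
                joinElection();
            }
        };

        ElectorSketch(boolean reuseWatcher) {
            this.reuseWatcher = reuseWatcher;
        }

        /** Registers a watch, either with a fresh watcher object or the shared one. */
        void monitorLockNode() {
            Watcher w = reuseWatcher ? shared : new Watcher() {
                @Override
                public void process(String event) {
                    joinElection();
                }
            };
            zk.watch(w);
        }

        /** First create attempt succeeds (active); later ones see the node exists (standby). */
        void joinElection() {
            if (!lockHeld) {
                lockHeld = true;
                transitions.add("active");
            } else {
                transitions.add("standby");
            }
        }
    }

    /** Registers two watches on the same client, fires one event, returns the transitions. */
    public static List<String> run(boolean reuseWatcher) {
        ElectorSketch elector = new ElectorSketch(reuseWatcher);
        elector.monitorLockNode();
        elector.monitorLockNode(); // second registration on the same client
        elector.zk.fire("NodeDeleted");
        return elector.transitions;
    }

    public static void main(String[] args) {
        System.out.println("new watcher per call:  " + run(false));
        System.out.println("single shared watcher: " + run(true));
    }
}
```

With a fresh watcher object per registration, the single event reaches joinElection() twice and the sketch records an active transition followed by a standby one. With one shared watcher object it records only the active transition. The sketch stops there and does not model the later re-election back to active.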

I did some manual testing with the attached patch, and the extra transition was avoided. I'll see if I can write a unit test for it.

                
> Unnecessary transition to standby in ActiveStandbyElector
> ---------------------------------------------------------
>
>                 Key: HADOOP-9220
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9220
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: HADOOP-9220.patch
>
>
> When performing a manual failover from one HA node to a second, under some circumstances the second node will transition from standby -> active -> standby -> active. This is with automatic failover enabled, so there is a ZK cluster doing leader election.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira