Posted to common-issues@hadoop.apache.org by "Todd Lipcon (Commented) (JIRA)" <ji...@apache.org> on 2012/01/05 01:38:39 UTC
[jira] [Commented] (HADOOP-7924)
FailoverController for client-based configuration
[ https://issues.apache.org/jira/browse/HADOOP-7924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180045#comment-13180045 ]
Todd Lipcon commented on HADOOP-7924:
-------------------------------------
In {{preFailoverChecks}}, I think it's clearer to structure the code like:
{code}
HAServiceState toSvcState;
try {
  toSvcState = toSvc.getServiceState();
} catch (Exception e) {
  // throw the FailoverFailedException
}
// now check toSvcState.equals(STANDBY)
{code}
rather than trying to collapse both exceptions into one throw. Also, the exception thrown by getServiceState should be logged.
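To make the suggestion concrete, here is a minimal self-contained sketch of that structure. The nested types are hypothetical stand-ins named after the identifiers in this review, not the real Hadoop classes, and the exception messages and the use of System.err for logging are assumptions made only for illustration.

```java
class PreFailoverCheckSketch {
    // Hypothetical stand-ins for the HA types mentioned in the review.
    enum HAServiceState { ACTIVE, STANDBY }

    interface HAService {
        HAServiceState getServiceState() throws Exception;
    }

    static class FailoverFailedException extends Exception {
        FailoverFailedException(String message) {
            super(message);
        }

        FailoverFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    static void preFailoverChecks(HAService toSvc) throws FailoverFailedException {
        HAServiceState toSvcState;
        try {
            toSvcState = toSvc.getServiceState();
        } catch (Exception e) {
            // Log the underlying exception (stderr stands in for a real logger),
            // then fail with the original exception attached as the cause.
            System.err.println("Unable to get service state of target: " + e);
            throw new FailoverFailedException("Unable to get service state of target", e);
        }
        // Separate check, separate throw: the target must be in standby.
        if (!HAServiceState.STANDBY.equals(toSvcState)) {
            throw new FailoverFailedException("Target is not in standby state: " + toSvcState);
        }
    }
}
```

Keeping the two failure modes in separate throws means the caller can tell "could not reach the target" (cause attached) from "target reachable but in the wrong state" (no cause).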
----
In {{failover()}}, I think you probably want to catch all Throwables in another catch clause, e.g. what if the service is in some bad state and your failover attempt causes it to crash, which would surface in your IPC call as a SocketTimeoutException?
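A rough sketch of what that extra catch clause could look like follows. Everything here is an assumption for illustration: the Transition interface, the two-step shape of failover(), and the exception types are hypothetical stand-ins rather than the code in the patch.

```java
import java.io.IOException;

class FailoverCatchSketch {
    // Hypothetical: the failure a service reports when it cannot change state.
    static class ServiceFailedException extends Exception {
        ServiceFailedException(String message) {
            super(message);
        }
    }

    static class FailoverFailedException extends Exception {
        FailoverFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical stand-in for one RPC that asks a service to change state.
    interface Transition {
        void run() throws ServiceFailedException, IOException;
    }

    static void failover(Transition demoteFrom, Transition promoteTo)
            throws FailoverFailedException {
        try {
            demoteFrom.run();
            promoteTo.run();
        } catch (ServiceFailedException e) {
            // The failure mode the service itself reports.
            throw new FailoverFailedException("Service reported a failed transition", e);
        } catch (Throwable t) {
            // Everything else, e.g. a SocketTimeoutException because the
            // service crashed while handling the request.
            throw new FailoverFailedException("Unexpected error during failover", t);
        }
    }
}
```

Note that catching Throwable also catches Errors, so the wrapping exception keeps the original as its cause rather than hiding it.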
----
{code}
+ public FailoverFailedException(String message, Throwable cause) {
+ super(message, cause);
{code}
The indentation looks off here.
----
{code}
+ new UsageInfo("<host:port> <host:port>",
+ "Failover from the 1st daemon to the 2nd"))
{code}
I think it's better not to abbreviate "first" and "second".
----
- Can you add some javadoc to {{testManualFailoverCanResultInTwoActives}} -- it's strange that this is a test case... it's more like you're showing that a particular user error can cause a problem, rather than showing something about the bug, right? Or else it should be a test case that fails, with an @Ignore explaining why it fails, maybe?
- Just to confirm, the manual test you mentioned was done with two NNs in a running HA cluster?
> FailoverController for client-based configuration
> --------------------------------------------------
>
> Key: HADOOP-7924
> URL: https://issues.apache.org/jira/browse/HADOOP-7924
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: ha
> Affects Versions: HA Branch (HDFS-1623)
> Reporter: Eli Collins
> Assignee: Eli Collins
> Attachments: hadoop-7924.txt, hadoop-7924.txt
>
>
> Basic FailoverController to coordinate fail-over using a client-based config (i.e. fail-over from NameNode x to NameNode y).