Posted to yarn-issues@hadoop.apache.org by "Xuan Gong (JIRA)" <ji...@apache.org> on 2015/09/03 00:22:47 UTC

[jira] [Commented] (YARN-4107) Both RM becomes Active if all zookeepers can not connect to active RM

    [ https://issues.apache.org/jira/browse/YARN-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728135#comment-14728135 ] 

Xuan Gong commented on YARN-4107:
---------------------------------

When the ActiveStandbyElector loses its connection to ZooKeeper, it goes into neutral mode (enterNeutralMode). On the ZooKeeper side, a new leader is chosen because the old leader has lost its connection. In that case the old standby RM becomes active, while the old active RM keeps trying to reconnect to ZooKeeper until it times out. So we end up with two active RMs.

The bad impact is that all new applications still try to connect to the old active RM and stay in the NEW state. Because the old active RM has already lost its connection to ZK, it cannot save the application state in the ZK state store.
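
To make the timing concrete, here is a minimal toy simulation of the sequence described above. It is only a sketch: ToyZooKeeper, ToyRM, simulate, and the tick-based timeouts are hypothetical names and simplifications, not the real ActiveStandbyElector or ZooKeeper API. It assumes the old active RM keeps its state while it retries, and that ZK releases the leader lock once the session has been silent for session_timeout ticks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyZooKeeper:
    """Hypothetical stand-in for the ZK ensemble: one ephemeral leader lock."""
    session_timeout: int
    lock_owner: Optional[str] = None
    last_heartbeat: dict = field(default_factory=dict)

    def heartbeat(self, rm_id: str, now: int) -> None:
        self.last_heartbeat[rm_id] = now

    def expire_sessions(self, now: int) -> None:
        # Release the lock once the owner's session has been silent too long.
        owner = self.lock_owner
        if owner is not None and now - self.last_heartbeat[owner] >= self.session_timeout:
            self.lock_owner = None

    def try_acquire(self, rm_id: str) -> bool:
        if self.lock_owner is None:
            self.lock_owner = rm_id
        return self.lock_owner == rm_id


@dataclass
class ToyRM:
    """Hypothetical RM that only tracks its HA state and ZK connectivity."""
    rm_id: str
    give_up_after: int  # ticks the RM keeps retrying before it steps down
    state: str = "standby"
    connected: bool = True
    disconnected_since: Optional[int] = None

    def tick(self, zk: ToyZooKeeper, now: int) -> None:
        if self.connected:
            self.disconnected_since = None
            zk.heartbeat(self.rm_id, now)
            self.state = "active" if zk.try_acquire(self.rm_id) else "standby"
            return
        # Neutral mode (assumed behavior): keep the current state while retrying.
        if self.disconnected_since is None:
            self.disconnected_since = now
        if now - self.disconnected_since >= self.give_up_after:
            self.state = "standby"


def simulate(session_timeout: int, give_up_after: int, ticks: int) -> list:
    """Return the ticks at which both toy RMs report themselves active."""
    zk = ToyZooKeeper(session_timeout)
    rm1 = ToyRM("rm1", give_up_after)
    rm2 = ToyRM("rm2", give_up_after)
    # Tick 0: both RMs are connected and rm1 wins the lock.
    rm1.tick(zk, 0)
    rm2.tick(zk, 0)
    # Cut rm1 off from ZK; rm2 stays connected.
    rm1.connected = False
    both_active = []
    for now in range(1, ticks + 1):
        zk.expire_sessions(now)
        rm1.tick(zk, now)
        rm2.tick(zk, now)
        if rm1.state == "active" and rm2.state == "active":
            both_active.append(now)
    return both_active


if __name__ == "__main__":
    print(simulate(session_timeout=3, give_up_after=6, ticks=10))
    print(simulate(session_timeout=6, give_up_after=2, ticks=10))
```

In this toy model, both RMs report active whenever the old active RM retries for longer than it takes ZK to expire its session, and the overlap disappears when the old active RM steps down first. The real timeouts and state transitions in the RM may differ.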

> Both RM becomes Active if all zookeepers can not connect to active RM
> ---------------------------------------------------------------------
>
>                 Key: YARN-4107
>                 URL: https://issues.apache.org/jira/browse/YARN-4107
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Xuan Gong
>            Assignee: Xuan Gong
>
> Steps to reproduce:
> 1) Run small randomwriter applications in background
> 2) rm1 is active and rm2 is standby 
> 3) Disconnect the active RM from all ZKs
> 4) Check status of both RMs. Both of them are in active state



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)