Posted to yarn-issues@hadoop.apache.org by "Marouane RAJI (JIRA)" <ji...@apache.org> on 2019/04/23 16:09:00 UTC

[jira] [Created] (YARN-9506) Node Managers fail to update cached IP entries of Resource Managers

Marouane RAJI created YARN-9506:
-----------------------------------

             Summary: Node Managers fail to update cached IP entries of Resource Managers 
                 Key: YARN-9506
                 URL: https://issues.apache.org/jira/browse/YARN-9506
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.7.1
            Reporter: Marouane RAJI
         Attachments: NM_logs.txt

Hi,

We are running a YARN cluster (for Samza jobs) on AWS in HA mode, with yarn.resourcemanager.ha.automatic-failover.enabled=true.

To reproduce the issue:
 # Have a running cluster with 2 Node Managers and 2 Resource Managers in HA mode, with failover enabled.
 ** The Resource Managers need DNS entries, and those hostnames must be set in the config:
 *** e.g. yarnrm1.me.local and yarnrm2.me.local
 # Stop the active Resource Manager (yarnrm1.me.local) and retire its instance. (The Node Managers will fall back to the standby, yarnrm2.me.local.)
 # Provision a new Resource Manager with a new IP. Make sure the DNS entry yarnrm1.me.local is assigned to it.
 # Stop the now-active Resource Manager (yarnrm2.me.local).
 # Check the Node Manager logs: the Node Managers fail to reach the newly provisioned Resource Manager and keep trying to access it through the old IP.
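
For illustration only, here is a minimal Java sketch of one way a client can end up holding a stale IP after a DNS change. It is not taken from the YARN code base and does not claim to be the root cause of this issue. It relies only on the documented behaviour that a resolved java.net.InetSocketAddress keeps the InetAddress it was built with. The class StaleAddressDemo, the fakeDns map, the helper refreshIfChanged, the IPs 10.0.0.1 and 10.0.0.2, and port 8031 are all assumptions made up for the example; fakeDns stands in for real DNS so the sketch runs without a network.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

public class StaleAddressDemo {
    // Hypothetical stand-in for DNS, so the sketch needs no network access.
    static final Map<String, InetAddress> fakeDns = new HashMap<>();

    // Builds an InetAddress from raw bytes; getByAddress performs no lookup.
    static InetAddress ip(String host, int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(
                host, new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: re-check the name and rebuild the socket address
    // only if the name now maps to a different IP.
    static InetSocketAddress refreshIfChanged(InetSocketAddress cached, String host) {
        InetAddress current = fakeDns.get(host);
        if (current != null && !current.equals(cached.getAddress())) {
            return new InetSocketAddress(current, cached.getPort());
        }
        return cached;
    }

    // Replays the scenario from the steps above against the fake DNS.
    static String[] simulateFailover() {
        String host = "yarnrm1.me.local";
        fakeDns.put(host, ip(host, 10, 0, 0, 1));            // original RM instance
        InetSocketAddress cached =
            new InetSocketAddress(fakeDns.get(host), 8031);  // resolved once, then reused
        fakeDns.put(host, ip(host, 10, 0, 0, 2));            // replacement RM, same name
        String stale = cached.getAddress().getHostAddress();
        String fresh = refreshIfChanged(cached, host).getAddress().getHostAddress();
        return new String[] {stale, fresh};
    }

    public static void main(String[] args) {
        String[] result = simulateFailover();
        System.out.println("cached address still points at: " + result[0]);
        System.out.println("after re-resolving the name:    " + result[1]);
    }
}
```

In this sketch the cached object keeps pointing at the first IP until something explicitly builds a new InetSocketAddress from the name. Whether the Node Manager behaves this way in 2.7.1 would have to be confirmed from the attached NM_logs.txt and the source.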

I can provide config files, yarn-site and core-site if needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
