Posted to yarn-issues@hadoop.apache.org by "Steve Suh (Jira)" <ji...@apache.org> on 2021/07/20 17:56:00 UTC

[jira] [Comment Edited] (YARN-10857) YarnClient Caching Addresses

    [ https://issues.apache.org/jira/browse/YARN-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384426#comment-17384426 ] 

Steve Suh edited comment on YARN-10857 at 7/20/21, 5:55 PM:
------------------------------------------------------------

InetSocketAddress objects are created when the YarnClient is initialized, using the hosts defined for the RMs (in this case rm1 and rm2). If entries for those hosts do not exist in /etc/hosts (and are not resolvable by DNS) at initialization time, an unresolved InetSocketAddress object is created for them and passed to the Connection. That InetSocketAddress object is then reused by the IPC [Client.java|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java]
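
To illustrate why a cached address stays unresolved, here is a minimal sketch using only the JDK. It is not Hadoop code: the class and method names (CachedAddressDemo, cachedAtStartup, lookUpAgain) are hypothetical, and the host and port are placeholders. An IP literal is used so the example does not depend on any DNS server.

```java
import java.net.InetSocketAddress;

// Illustration only: plain JDK behaviour, not Hadoop code.
class CachedAddressDemo {
    // Stands in for an RM host that had no /etc/hosts or DNS entry when the
    // client was initialized: the address object is created unresolved.
    static InetSocketAddress cachedAtStartup(String host, int port) {
        return InetSocketAddress.createUnresolved(host, port);
    }

    // Constructing a new InetSocketAddress performs a fresh lookup of the
    // host name held by the cached object.
    static InetSocketAddress lookUpAgain(InetSocketAddress cached) {
        return new InetSocketAddress(cached.getHostString(), cached.getPort());
    }

    public static void main(String[] args) {
        // Placeholder host and port, chosen for the example.
        InetSocketAddress cached = cachedAtStartup("127.0.0.1", 8032);
        System.out.println("cached unresolved: " + cached.isUnresolved());

        // InetSocketAddress is immutable, so the cached object never resolves
        // on its own, no matter how the host configuration changes later.
        System.out.println("still unresolved:  " + cached.isUnresolved());

        // Only a newly constructed object reflects the current lookup result.
        InetSocketAddress fresh = lookUpAgain(cached);
        System.out.println("fresh unresolved:  " + fresh.isUnresolved());
    }
}
```

The point of the sketch is that reusing the same object means reusing the lookup result from initialization time.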

As can be seen here, the code only checks whether the InetSocketAddress is unresolved. It does not attempt to re-resolve the cached address and throws an error.
 [https://github.com/apache/hadoop/blob/de41ce8a16434aee13f705a9e3666f29e8ec8cb3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L1602-L1608]
{code:java}
    if (address.isUnresolved()) {
      throw NetUtils.wrapException(address.getHostName(),
          address.getPort(),
          null,
          0,
          new UnknownHostException());
    }
{code}
 

A possible fix would be to change this check and include the following: _if (address.isUnresolved() *{color:#4c9aff}&& !updateAddress(){color}*)_
{code:java}
    if (address.isUnresolved() && !updateAddress()) {
      throw NetUtils.wrapException(address.getHostName(),
          address.getPort(),
          null,
          0,
          new UnknownHostException());
    }
{code}
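
The effect of the proposed check can be sketched outside Hadoop as follows. This is a simplified model, not the actual Client.java code: ReResolvingEndpoint and resolvedOrThrow are hypothetical names, the resolver is injected so the behaviour can be exercised without touching real DNS, and updateAddress() is assumed to perform a fresh lookup of the cached host name and return true only when the address changed.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.function.BiFunction;

// Hypothetical stand-in for a connection holding a cached server address.
class ReResolvingEndpoint {
    private InetSocketAddress address;
    // Injected lookup (host, port) -> address; InetSocketAddress::new would
    // do a real lookup in production-like use.
    private final BiFunction<String, Integer, InetSocketAddress> resolver;

    ReResolvingEndpoint(InetSocketAddress initial,
                        BiFunction<String, Integer, InetSocketAddress> resolver) {
        this.address = initial;
        this.resolver = resolver;
    }

    // Assumed semantics: look the cached host name up again and report
    // whether the stored address was replaced.
    synchronized boolean updateAddress() {
        InetSocketAddress fresh =
            resolver.apply(address.getHostString(), address.getPort());
        if (!address.equals(fresh)) {
            address = fresh;
            return true;
        }
        return false;
    }

    // Mirrors the proposed condition: fail only if the address is unresolved
    // and a fresh lookup did not change it.
    synchronized InetSocketAddress resolvedOrThrow() throws UnknownHostException {
        if (address.isUnresolved() && !updateAddress()) {
            throw new UnknownHostException(address.getHostString());
        }
        return address;
    }

    public static void main(String[] args) throws UnknownHostException {
        // The host entry has appeared since startup (IP literal as placeholder).
        ReResolvingEndpoint endpoint = new ReResolvingEndpoint(
            InetSocketAddress.createUnresolved("127.0.0.1", 8032),
            InetSocketAddress::new);
        System.out.println("unresolved after check: "
            + endpoint.resolvedOrThrow().isUnresolved());
    }
}
```

With this shape, a host that became resolvable after initialization is picked up on the next connection attempt, while a host that is still missing continues to fail with UnknownHostException.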


was (Author: suhsteve):
An unresolved InetSocketAddress object is created during the initialization of the YarnClient.  This InetSocketAddress object is passed around and reused by the IPC [Client.java|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java]

Here the code only checks whether the InetSocketAddress is unresolved.  It does not attempt to re-resolve the cached address and throws an error.
https://github.com/apache/hadoop/blob/de41ce8a16434aee13f705a9e3666f29e8ec8cb3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java#L1602-L1608


{code:java}
    if (address.isUnresolved()) {
      throw NetUtils.wrapException(address.getHostName(),
          address.getPort(),
          null,
          0,
          new UnknownHostException());
    }
{code}



A possible fix would be to change this check and include the following: _if (address.isUnresolved() *{color:#4C9AFF}&& !updateAddress(){color}*)_
{code:java}
    if (address.isUnresolved() && !updateAddress()) {
      throw NetUtils.wrapException(address.getHostName(),
          address.getPort(),
          null,
          0,
          new UnknownHostException());
    }
{code}


> YarnClient Caching Addresses
> ----------------------------
>
>                 Key: YARN-10857
>                 URL: https://issues.apache.org/jira/browse/YARN-10857
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: client, yarn
>            Reporter: Steve Suh
>            Assignee: Prabhu Joseph
>            Priority: Minor
>
> We have noticed that the YarnClient, once initialized, is not resilient to changes in DNS or /etc/hosts in the following scenario:
> Take for instance the following (reproducible) sequence of events that can occur on a service that instantiates and uses YarnClient.
>   - Yarn has rm HA enabled (*yarn.resourcemanager.ha.enabled* is *true*) and there are two rms (rm1 and rm2).
>   - *yarn.client.failover-proxy-provider* is set to *org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider*
> 1)	rm2 is currently the active rm
> 2)	/etc/hosts (or dns) is missing host information for rm2
> 3)	A service is started and it initializes the YarnClient at startup.
> 4)	At some point in time after YarnClient is done initializing, /etc/hosts is updated and contains host information for rm2
> 5)	Yarn is queried, for instance calling *yarnclient.getApplications()*
> 6)	All YarnClient attempts to communicate with rm2 fail with UnknownHostExceptions, even though /etc/hosts now contains host information for it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
