Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2019/02/15 11:23:00 UTC

[jira] [Commented] (HADOOP-16114) NetUtils#canonicalizeHost gives different value for same host

    [ https://issues.apache.org/jira/browse/HADOOP-16114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769207#comment-16769207 ] 

Steve Loughran commented on HADOOP-16114:
-----------------------------------------

I see: it guarantees that whichever hostname went into the cache is the one used in both threads. Makes sense, and it's done elsewhere.

Fancy submitting a patch? I'm not sure we can write an easy test for this, so we'll have to rely on review.
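
For what it's worth, here is a rough sketch of how the guarantee could be checked deterministically. It assumes the lookup can be injected, which is not the case in NetUtils today (it calls SecurityUtil directly), so the class name, the `resolver` parameter and the standalone cache below are all hypothetical stand-ins. A CyclicBarrier inside the resolver holds both threads until each has missed the cache, and the resolver deliberately returns a different name on each call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

class CanonicalizeRaceCheck {
  // Hypothetical stand-in for NetUtils' canonicalizedHostCache.
  static final ConcurrentHashMap<String, String> CACHE = new ConcurrentHashMap<>();

  // Same shape as the change proposed in the issue description; 'resolver'
  // is an assumed injection point replacing the SecurityUtil lookup.
  static String canonicalizeHost(String host, UnaryOperator<String> resolver) {
    String fqHost = CACHE.get(host);
    if (fqHost == null) {
      CACHE.putIfAbsent(host, resolver.apply(host));
      // re-read so that every caller sees whichever value was cached
      fqHost = CACHE.get(host);
    }
    return fqHost;
  }

  static boolean bothThreadsAgree() throws Exception {
    CACHE.clear();
    CyclicBarrier bothMissed = new CyclicBarrier(2);
    AtomicInteger calls = new AtomicInteger();
    UnaryOperator<String> resolver = host -> {
      try {
        // neither thread may insert until both have missed the cache
        bothMissed.await();
      } catch (Exception e) {
        throw new IllegalStateException(e);
      }
      // each call resolves to a different name, as in the reported bug
      return host + "-" + calls.incrementAndGet() + ".example.com";
    };
    Callable<String> task = () -> canonicalizeHost("nn1", resolver);
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Future<String> a = pool.submit(task);
      Future<String> b = pool.submit(task);
      String first = a.get();
      return first.equals(b.get()) && first.equals(CACHE.get("nn1"));
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(bothThreadsAgree());
  }
}
```

Without the re-read after putIfAbsent, each thread would return its own resolved name and the check would fail, so this only helps if the real code grows a seam for the resolver.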

> NetUtils#canonicalizeHost gives different value for same host
> -------------------------------------------------------------
>
>                 Key: HADOOP-16114
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16114
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: net
>    Affects Versions: 2.7.6, 3.1.2
>            Reporter: Praveen Krishna
>            Priority: Minor
>
> NetUtils#canonicalizeHost uses ConcurrentHashMap#putIfAbsent to add an entry to the cache:
> {code:java}
>   private static String canonicalizeHost(String host) {
>     // check if the host has already been canonicalized
>     String fqHost = canonicalizedHostCache.get(host);
>     if (fqHost == null) {
>       try {
>         fqHost = SecurityUtil.getByName(host).getHostName();
>         // slight race condition, but won't hurt
>         canonicalizedHostCache.putIfAbsent(host, fqHost);
>       } catch (UnknownHostException e) {
>         fqHost = host;
>       }
>     }
>     return fqHost;
>   }
> {code}
>  
> If two threads invoke this method for the same host for the first time (so the cache has no entry for it yet) and SecurityUtil#getByName()#getHostName returns two different values for that host, only one fqHost is added to the cache, and the other thread is handed a value that does not match the cached one. This can cause some APIs, such as `FileSystem#checkPath`, to fail on the first call even if the path is in the given file system. It might be better to modify the above method as follows:
>  
> {code:java}
>   private static String canonicalizeHost(String host) {
>     // check if the host has already been canonicalized
>     String fqHost = canonicalizedHostCache.get(host);
>     if (fqHost == null) {
>       try {
>         fqHost = SecurityUtil.getByName(host).getHostName();
>         // another thread may have won the race; use whichever value is cached
>         canonicalizedHostCache.putIfAbsent(host, fqHost);
>         fqHost = canonicalizedHostCache.get(host);
>       } catch (UnknownHostException e) {
>         fqHost = host;
>       }
>     }
>     return fqHost;
>   }
> {code}
>  
> So even if another thread gets a different host name, it will be replaced by the cached value.
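
As a possible variation on the proposal above: ConcurrentHashMap#putIfAbsent already returns the value that was previously mapped (or null if the new value was stored), so the second get could be avoided. The following is a minimal standalone sketch, not the actual NetUtils code: the class name, the local cache and the `resolver` parameter (standing in for SecurityUtil.getByName(host).getHostName()) are assumptions for illustration, and the UnknownHostException handling is left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

class CanonicalHostCacheSketch {
  // Hypothetical stand-in for NetUtils' canonicalizedHostCache.
  static final ConcurrentHashMap<String, String> CACHE = new ConcurrentHashMap<>();

  // 'resolver' is an assumed replacement for the SecurityUtil lookup.
  static String canonicalizeHost(String host, UnaryOperator<String> resolver) {
    String fqHost = CACHE.get(host);
    if (fqHost == null) {
      String resolved = resolver.apply(host);
      // putIfAbsent returns the existing mapping, or null if ours was stored
      String previous = CACHE.putIfAbsent(host, resolved);
      fqHost = (previous != null) ? previous : resolved;
    }
    return fqHost;
  }

  public static void main(String[] args) {
    // Simulate losing the race: another caller fills the cache while this
    // call is still resolving, and this call resolves to a different name.
    String result = canonicalizeHost("nn1", h -> {
      CACHE.putIfAbsent(h, "nn1.example.com");
      return "nn1.alias.example.com";
    });
    System.out.println(result); // prints "nn1.example.com"
  }
}
```

Either form gives every caller the cached name; using the return value just saves one map lookup on the miss path.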


