Posted to common-issues@hadoop.apache.org by "Steve Loughran (Created) (JIRA)" <ji...@apache.org> on 2012/02/09 14:09:59 UTC

[jira] [Created] (HADOOP-8047) CachedDNSToSwitchMapping caches negative results forever

CachedDNSToSwitchMapping caches negative results forever
--------------------------------------------------------

                 Key: HADOOP-8047
                 URL: https://issues.apache.org/jira/browse/HADOOP-8047
             Project: Hadoop Common
          Issue Type: Bug
          Components: util
    Affects Versions: 1.0.0, 0.23.0, 0.24.0
            Reporter: Steve Loughran
            Priority: Trivial


This is very minor, but worth filing in JIRA in case someone wants to rethink topology caching for a dynamic world.

# The CachedDNSToSwitchMapping caches the results from all relayed DNS queries.
# The DNS script mapper returns the default rack for all unknown entries (or when the script fails).
# The cache stores this in its map and never re-resolves it.

As a result, if a node that the existing script cannot resolve is added to a live cluster, it gets cached against the default rack and is never assigned its real rack, unless the script is updated before that node's rack mapping is first resolved.
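
A minimal sketch of that sequence, using a hypothetical, stripped-down model rather than Hadoop's actual classes: `CachingResolver`, the `script` function and the `/default-rack` constant are all assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical model of a caching rack resolver; not Hadoop's real API. */
public class NegativeCacheSketch {
    // Assumed name for the fallback rack returned for unknown hosts.
    static final String DEFAULT_RACK = "/default-rack";

    /** Caches every answer from the underlying resolver, including the fallback. */
    static class CachingResolver {
        private final Map<String, String> cache = new HashMap<>();
        private final Function<String, String> script;

        CachingResolver(Function<String, String> script) {
            this.script = script;
        }

        String resolve(String host) {
            // Once a value is stored for a host, the script is never asked again.
            return cache.computeIfAbsent(host, script);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the topology script's lookup table.
        Map<String, String> table = new HashMap<>();
        table.put("node1", "/rack1");
        Function<String, String> script = h -> table.getOrDefault(h, DEFAULT_RACK);

        CachingResolver resolver = new CachingResolver(script);
        System.out.println(resolver.resolve("node2")); // unknown host: fallback is cached

        table.put("node2", "/rack2");                  // "script" updated too late
        System.out.println(resolver.resolve("node2")); // cached fallback is returned again
    }
}
```

Both calls return the fallback rack, because the second one never reaches the updated table.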

This isn't usually that important; it just means "update your scripts before adding new racks". Perhaps there should be a page on that activity, "runbook and checklist for adding new servers and racks".

Where it would matter is if anyone started playing with dynamic topologies, but in that situation the cached mapping itself would become the liability, as it assumes that servers never switch switches in a live system: the topology is static for existing nodes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HADOOP-8047) CachedDNSToSwitchMapping caches negative results forever

Posted by "Daryn Sharp (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204554#comment-13204554 ] 

Daryn Sharp commented on HADOOP-8047:
-------------------------------------

This is actually a general problem if the cache is in a persistent daemon like the namenode.  Transient DNS problems can cause false entries in the negative cache.
                

        

[jira] [Commented] (HADOOP-8047) CachedDNSToSwitchMapping caches negative results forever

Posted by "Steve Loughran (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213708#comment-13213708 ] 

Steve Loughran commented on HADOOP-8047:
----------------------------------------

You can set the JVM up not to cache negative DNS entries; it used to cache them forever (bad, bad Sun), but that is a separate issue.
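
For reference, a minimal sketch of that JVM-side setting, assuming the standard `networkaddress.cache.negative.ttl` security property (0 means failed lookups are never cached). The class and method names are made up for illustration, and the call generally needs to happen before the JVM performs its first lookup.

```java
import java.security.Security;

/** Hypothetical helper showing the JVM negative DNS cache setting. */
public class DnsNegativeTtl {

    /** Asks the JVM not to cache failed DNS lookups at all. */
    static void disableNegativeDnsCaching() {
        // Value is a TTL in seconds; "0" means never cache negative results.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }

    public static void main(String[] args) {
        disableNegativeDnsCaching();
        System.out.println(Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

This only affects the JVM's own name-lookup cache; it does nothing for the rack mappings held by the switch mapping cache.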

What the switch mapping caches is the mapping of a host to a specific switch. So even if the topology script is updated as new machines are added, any new machine that somehow reports in before the script is updated gets the default rack setting cached.

For extra fun, every switch mapping is wrapped by a (completely gratuitous) cache. I say gratuitous as the script mapper does its own caching, and other implementations are likely to be more dynamic (will need to check with people who have written their own mapper, obviously). Because of that wrapping, no dynamic topology script will ever have its changes picked up.
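
One possible mitigation, sketched under assumptions and not a description of any fix in Hadoop: a wrapper that declines to cache the default rack, so an unresolved host is looked up again next time. `SkipDefaultRackCache`, the `delegate` function and the `/default-rack` constant are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache that stores only positive rack lookups. */
public class SkipDefaultRackCache {
    // Assumed name for the fallback rack returned for unknown hosts.
    static final String DEFAULT_RACK = "/default-rack";

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> delegate;

    SkipDefaultRackCache(Function<String, String> delegate) {
        this.delegate = delegate;
    }

    String resolve(String host) {
        String cached = cache.get(host);
        if (cached != null) {
            return cached;
        }
        String rack = delegate.apply(host);
        // Only remember real answers; the fallback is re-resolved on the next call.
        if (rack != null && !DEFAULT_RACK.equals(rack)) {
            cache.put(host, rack);
        }
        return rack;
    }
}
```

The trade-off is that every lookup for a genuinely unknown host hits the delegate again, which could be expensive if the delegate forks a script; a time-limited negative entry would be a middle ground.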
                