Posted to common-issues@hadoop.apache.org by "Daryn Sharp (JIRA)" <ji...@apache.org> on 2012/06/07 17:20:23 UTC

[jira] [Commented] (HADOOP-8490) Add Configuration to FileSystem cache key

    [ https://issues.apache.org/jira/browse/HADOOP-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291068#comment-13291068 ] 

Daryn Sharp commented on HADOOP-8490:
-------------------------------------

Not honoring the given conf causes the obvious problem of not being able to tweak values.

It's also causing a problem for the NM.  When an app is done, it should be able to call {{FileSystem.closeAllForUGI}} just like the JT does.  Unfortunately that may pull the rug out from under another app for the same user that is also running on the NM.  It also means that multiple jobs for the same user are erroneously using the first job's conf.  Both are probably latent issues in the JT but go unnoticed or are masked by retries.

Ideally the conf's {{hashCode}} or {{identityHashCode}} would be added to the cache key.  A key/value equivalence test should not be performed because seemingly identical confs (e.g. cloned from each other) would initially appear the same but may later change.  One potential issue is cloned confs that really should be the same, e.g. yarn often creates a {{YarnConfiguration(conf)}}.  This won't be a problem if the conversion is done once and stashed.  If it's done on the fly multiple times, then it does present a problem.  Arguably that would be a bug but it would be difficult to fix in a timely manner.
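
To make the identity idea concrete, here is a minimal sketch.  It does not use the real Hadoop classes: {{Conf}}, {{IdentityCacheKey}} and {{IdentityFsCache}} are hypothetical stand-ins, and the cached "filesystem" is just an {{Object}}.  One assumption of mine: the key compares the conf reference with {{==}} and only uses {{identityHashCode}} for hashing, since identity hash codes are not guaranteed to be unique.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object; NOT the Hadoop class.
class Conf {
    final Map<String, String> props = new HashMap<>();

    Conf() {}

    // Clone-style constructor, analogous to wrapping an existing conf.
    Conf(Conf other) {
        props.putAll(other.props);
    }
}

// Hypothetical cache key that includes the identity of the conf.
final class IdentityCacheKey {
    private final String uri;
    private final String user;
    private final Conf conf;

    IdentityCacheKey(String uri, String user, Conf conf) {
        this.uri = uri;
        this.user = user;
        this.conf = conf;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof IdentityCacheKey)) {
            return false;
        }
        IdentityCacheKey k = (IdentityCacheKey) o;
        // Reference comparison on the conf: no key/value equivalence test.
        return uri.equals(k.uri) && user.equals(k.user) && conf == k.conf;
    }

    @Override
    public int hashCode() {
        return 31 * (31 * uri.hashCode() + user.hashCode())
                + System.identityHashCode(conf);
    }
}

// Hypothetical cache; the cached value is a placeholder for an fs instance.
class IdentityFsCache {
    private final Map<IdentityCacheKey, Object> cache = new HashMap<>();

    Object get(String uri, String user, Conf conf) {
        return cache.computeIfAbsent(
                new IdentityCacheKey(uri, user, conf), k -> new Object());
    }
}
```

With this key, the same conf instance always maps to the same cached entry, while {{new Conf(original)}} gets its own entry even though its contents are identical.  That second behavior is exactly the cloned-conf concern described above.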

So an alternative is to add a key to the conf (e.g. {{fs.cache-id}}) that can be used in the fs cache key.  This would allow the cache to be partitioned, albeit imperfectly, in a way that accounts for cloned confs that should be treated the same.  The onus is placed upon the caller to explicitly change the key when needed, but it would be more transparent for existing code.
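
A rough sketch of the conf-key alternative, again with hypothetical stand-ins ({{IdConf}}, {{CacheIdFsCache}}) rather than the real Hadoop classes.  The property name {{fs.cache-id}} is only the example name suggested above, and treating an unset id as its own shared partition is my assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a configuration object; NOT the Hadoop class.
class IdConf {
    private final Map<String, String> props = new HashMap<>();

    IdConf() {}

    // Clone-style constructor: copies every property, including the cache id.
    IdConf(IdConf other) {
        props.putAll(other.props);
    }

    void set(String key, String value) {
        props.put(key, value);
    }

    String get(String key) {
        return props.get(key);
    }
}

// Hypothetical cache partitioned by a caller-supplied id stored in the conf.
class CacheIdFsCache {
    static final String CACHE_ID_KEY = "fs.cache-id"; // example name only

    private final Map<List<String>, Object> cache = new HashMap<>();

    Object get(String uri, String user, IdConf conf) {
        // An unset id is null; all confs without an id share one partition.
        List<String> key = Arrays.asList(uri, user, conf.get(CACHE_ID_KEY));
        return cache.computeIfAbsent(key, k -> new Object());
    }
}
```

Here a cloned conf inherits the id and therefore shares the cached entry, while a caller that wants isolation (say, per job) has to set a different id explicitly.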

I'll wait for comments before proceeding.
                
> Add Configuration to FileSystem cache key
> -----------------------------------------
>
>                 Key: HADOOP-8490
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8490
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.23.0, 0.24.0, 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>
> {{FileSystem#get(URI, Configuration)}} does not take the given {{Configuration}} into consideration before returning an existing fs instance from the cache, which may have been created with a different conf.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira