Posted to common-issues@hadoop.apache.org by "Chris Nauroth (JIRA)" <ji...@apache.org> on 2017/03/15 23:22:41 UTC

[jira] [Assigned] (HADOOP-13726) Enforce that FileSystem initializes only a single instance of the requested FileSystem.

     [ https://issues.apache.org/jira/browse/HADOOP-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth reassigned HADOOP-13726:
--------------------------------------

    Assignee: Manjunath Anand

bq. I would like to work on this JIRA (if you dont mind)...

Thanks very much for taking it on!  I have assigned the issue to you.

bq. I think the main concern of Chris (and myself) is not the operations which fail, it's those that block for a long time before failing.

Yes, that's my concern.  If opening a socket for one kind of {{FileSystem}} blocked initialization of any other kind of {{FileSystem}} in the process, then that would be a performance regression from the current implementation.

bq. I could see that if the hashcode is the same for, say, two similar keys which are passed to computeIfAbsent concurrently, then one of them waits for the other to complete, but if the hashcodes of the keys are different then they don't block each other.

I appreciate the testing and measurement.  Unfortunately, it's difficult to build complete confidence with this kind of testing.  For example, if lock granularity is per hash bucket within the hash table, then 2 threads operating on 2 distinct {{FileSystem}} keys could generate different hash codes and still land in the same bucket.  Testing wouldn't expose that blocking unless we were lucky enough to hit just the right hash codes.
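To make that concrete, here is a tiny illustration (the hash values are made up for the example, and the spread step mirrors what OpenJDK 8's {{ConcurrentHashMap}} does before masking): two distinct hash codes can still share a bucket in a small table.

{code:java}
// Illustration only: two different hash codes that land in the same bucket of
// a 16-slot table after ConcurrentHashMap-style spreading and masking.
public class BucketCollisionDemo {
  // Mirrors java.util.concurrent.ConcurrentHashMap#spread in OpenJDK 8.
  static int spread(int h) {
    return (h ^ (h >>> 16)) & 0x7fffffff;
  }

  public static void main(String[] args) {
    int buckets = 16;                 // a small, freshly created table
    int h1 = 0x00000005;              // two distinct hash codes...
    int h2 = 0x00100015;              // ...chosen so they collide after spreading
    System.out.println(spread(h1) & (buckets - 1));  // prints 5
    System.out.println(spread(h2) & (buckets - 1));  // prints 5
  }
}
{code}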

I'd prefer that the JavaDocs have a concrete statement about locking granularity, but we don't have that.  Barring that, I think we're left with code inspection.  I read through {{computeIfAbsent}} again.  It's very tricky code, but I did notice that the mapping function can be called while holding a lock on an internal {{TreeBin}} class.

http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/8c93eb3fa1c0/src/share/classes/java/util/concurrent/ConcurrentHashMap.java#l2718

From what I understand of this code so far, my reading is that one stalled initialization can block others attempting to insert into the same {{TreeBin}}.
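To sanity-check that interpretation, I put together a small experiment (not part of any patch; the key class and the sleep are contrived) with two unequal keys that share a hash code.  On OpenJDK 8, the second {{computeIfAbsent}} waits for the first mapping function to finish even though the keys are distinct:

{code:java}
import java.util.concurrent.ConcurrentHashMap;

public class ComputeIfAbsentBlockingDemo {

  // Contrived key: equality by name, but a deliberately constant hashCode so
  // that both keys fall into the same bin.
  static final class Key {
    final String name;
    Key(String name) { this.name = name; }
    @Override public boolean equals(Object o) {
      return o instanceof Key && ((Key) o).name.equals(name);
    }
    @Override public int hashCode() { return 42; }
  }

  public static void main(String[] args) throws InterruptedException {
    ConcurrentHashMap<Key, String> map = new ConcurrentHashMap<>();

    Thread slow = new Thread(() -> map.computeIfAbsent(new Key("slow"), k -> {
      try { Thread.sleep(2000); } catch (InterruptedException ignored) { }
      return "slow-value";
    }));

    Thread fast = new Thread(() -> {
      long start = System.nanoTime();
      map.computeIfAbsent(new Key("fast"), k -> "fast-value");
      long millis = (System.nanoTime() - start) / 1_000_000;
      // Expect a wait close to two seconds: the "fast" key was stuck behind
      // the bin locked while the slow mapping function ran.
      System.out.println("fast key waited " + millis + " ms");
    });

    slow.start();
    Thread.sleep(100);   // let the slow mapping function grab the bin first
    fast.start();
    slow.join();
    fast.join();
  }
}
{code}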

bq. Can you or Chris Nauroth please provide an approximate initial value for the ConcurrentHashMap to be used.

I don't think it's easily predictable or consistent across different applications.  For something like a short-running Hive query, I expect relatively few {{FileSystem}} instances.  For something like Oozie, it's a long-running process that proxies activity to multiple {{FileSystem}} instances on behalf of multiple users.  Since {{UserGroupInformation}} is part of the cache key, there will be numerous unique {{FileSystem}} instances created and destroyed during the lifetime of the process, potentially causing the hash table to resize multiple times.  The current implementation just uses the default {{HashMap}} size by calling the default constructor.
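For reference, the knobs in question are just constructor arguments.  A sketch with placeholder numbers (not a recommendation, for the reasons above):

{code:java}
import java.util.concurrent.ConcurrentHashMap;

public class CacheSizingSketch {
  public static void main(String[] args) {
    // Placeholder values only; a good initial capacity is hard to predict
    // across workloads, as discussed above.
    ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>(
        64,      // initialCapacity
        0.75f,   // loadFactor (same default as HashMap)
        16);     // concurrencyLevel: estimated number of concurrently updating threads
    cache.put("example-key", new Object());
    System.out.println(cache.size());
  }
}
{code}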

Having dug into Guava's [{{LoadingCache#get}}|http://google.github.io/guava/releases/snapshot/api/docs/com/google/common/cache/LoadingCache.html#get-K-] more, I think it has the locking granularity that we want.

{quote}
If another call to get(K) or getUnchecked(K) is currently loading the value for key, simply waits for that thread to finish and returns its loaded value. Note that multiple threads can concurrently load values for distinct keys.
{quote}

That's exactly the behavior I was hoping to achieve.
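Here is a quick experiment against Guava (the names and timings are made up, and the slow loader stands in for an expensive {{FileSystem#initialize}}) that demonstrates the quoted granularity:

{code:java}
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class LoadingCacheGranularityDemo {
  public static void main(String[] args) throws Exception {
    LoadingCache<String, String> cache =
        CacheBuilder.newBuilder().build(new CacheLoader<String, String>() {
          @Override
          public String load(String key) throws Exception {
            Thread.sleep(2000);           // simulate a slow initialization
            return "instance-for-" + key;
          }
        });

    Thread a = new Thread(() -> time(cache, "hdfs://nn1"));  // loads, ~2s
    Thread b = new Thread(() -> time(cache, "hdfs://nn2"));  // distinct key, loads concurrently, ~2s
    Thread c = new Thread(() -> time(cache, "hdfs://nn1"));  // same key as a: only one load happens;
                                                             // the other call waits and reuses the value
    a.start(); b.start(); c.start();
    a.join(); b.join(); c.join();
  }

  static void time(LoadingCache<String, String> cache, String key) {
    long start = System.nanoTime();
    String value = cache.getUnchecked(key);
    System.out.printf("%s -> %s after %d ms%n",
        key, value, (System.nanoTime() - start) / 1_000_000);
  }
}
{code}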

bq. how do we pass the URI and conf from the getInternal method...

Oh no.  Now I'm stuck.  I don't have a good recommendation for this yet.  My initial reaction was to include the {{Configuration}} and {{URI}} in the {{Key}} object but omit them from the {{hashCode()}} calculation.  Then, the {{CacheLoader}} could unpack what it needs from the key.  However, this would potentially cause the cache to hold on to {{Configuration}} instances much longer than necessary and bloat the memory footprint.
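For concreteness, the idea looks roughly like this (purely a sketch; the class and field names are made up, and {{newFileSystemInstance}} is a placeholder for the existing creation path), along with the drawback noted above:

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import com.google.common.cache.CacheLoader;

// Hypothetical key that carries the inputs the loader needs.  The uri and conf
// fields ride along so the loader can reach them, but they are excluded from
// equals()/hashCode() so they do not affect cache lookups.  The drawback: the
// cache now pins each Configuration for as long as the entry lives.
final class CacheKey {
  final String scheme;
  final String authority;
  final String ugiName;        // stand-in for the UserGroupInformation component
  final URI uri;               // not part of equals/hashCode
  final Configuration conf;    // not part of equals/hashCode

  CacheKey(URI uri, Configuration conf, String ugiName) {
    this.scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
    this.authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
    this.ugiName = ugiName;
    this.uri = uri;
    this.conf = conf;
  }

  @Override public int hashCode() {
    return (scheme + authority + ugiName).hashCode();
  }

  @Override public boolean equals(Object o) {
    if (!(o instanceof CacheKey)) return false;
    CacheKey k = (CacheKey) o;
    return scheme.equals(k.scheme) && authority.equals(k.authority)
        && ugiName.equals(k.ugiName);
  }
}

// The loader can then unpack what it needs from the key itself.
abstract class FileSystemLoader extends CacheLoader<CacheKey, FileSystem> {
  @Override public FileSystem load(CacheKey key) throws Exception {
    return newFileSystemInstance(key.uri, key.conf);
  }

  // Placeholder for the existing creation path (reflection on the scheme's
  // implementation class followed by initialize(uri, conf)).
  abstract FileSystem newFileSystemInstance(URI uri, Configuration conf) throws Exception;
}
{code}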

I'll think on this more and let you know if I come up with anything.


> Enforce that FileSystem initializes only a single instance of the requested FileSystem.
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13726
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13726
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Chris Nauroth
>            Assignee: Manjunath Anand
>
> The {{FileSystem}} cache is intended to guarantee reuse of instances by multiple call sites or multiple threads.  The current implementation does provide this guarantee, but there is a brief race condition window during which multiple threads could perform redundant initialization.  If the file system implementation has expensive initialization logic, then this is wasteful.  This issue proposes to eliminate that race condition and guarantee initialization of only a single instance.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org