Posted to common-issues@hadoop.apache.org by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org> on 2015/10/31 01:07:27 UTC

[jira] [Commented] (HADOOP-12412) Concurrency in FileSystem$Cache is very broken

    [ https://issues.apache.org/jira/browse/HADOOP-12412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983647#comment-14983647 ] 

Vinod Kumar Vavilapalli commented on HADOOP-12412:
--------------------------------------------------

bq. At best, this leads to potentially expensive wasted work. At worst, as is the case for Spark, it can lead to deadlocks/livelocks, especially when the same configuration object is passed into both calls.
Can you describe this more and how the current code causes this?

In any case, the patch no longer applies, its code formatting needs fixing, and it needs more testing.

Given this is still in progress, I'd move this out into 2.7.3 as I am considering a 2.7.2 RC this weekend. Let me know if you have comments/concerns etc. Thanks.

> Concurrency in FileSystem$Cache is very broken
> ----------------------------------------------
>
>                 Key: HADOOP-12412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12412
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.7.0
>            Reporter: Michael Harris
>            Assignee: Michael Harris
>            Priority: Critical
>         Attachments: HADOOP-12412.patch, HADOOP-12412.patch
>
>
> The FileSystem cache uses a small amount of synchronization to protect the cache itself, but does nothing to prevent multiple instances of the same filesystem from being constructed and initialized simultaneously.  At best, this leads to potentially expensive wasted work.  At worst, as is the case for Spark, it can lead to deadlocks/livelocks, especially when the same configuration object is passed into both calls.  This should be refactored to use a results cache approach (see Java Concurrency in Practice, section 5.6, for an example of how to do this correctly), which will be both higher-performance and safer.
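
For readers unfamiliar with the "results cache" idea mentioned in the description, here is a minimal sketch of that general pattern: the cache stores a Future per key, so concurrent callers asking for the same key wait on a single construction instead of each building their own instance. This is not the attached patch and not Hadoop's FileSystem$Cache code; the class name ResultsCache, the factory parameter, and the choice to wrap failures in IllegalStateException are all assumptions made purely for illustration.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

// Hypothetical illustration only; not part of Hadoop.
final class ResultsCache<K, V> {
    // Maps each key to the (possibly still running) construction of its value.
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> factory; // assumed expensive, e.g. creating a filesystem

    ResultsCache(Function<K, V> factory) {
        this.factory = factory;
    }

    V get(K key) {
        while (true) {
            Future<V> f = cache.get(key);
            if (f == null) {
                FutureTask<V> task = new FutureTask<>(() -> factory.apply(key));
                // Only one thread wins the putIfAbsent race for a given key.
                f = cache.putIfAbsent(key, task);
                if (f == null) {
                    f = task;
                    task.run(); // the winner constructs the value on its own thread
                }
            }
            try {
                return f.get(); // everyone else blocks here until the value is ready
            } catch (CancellationException e) {
                cache.remove(key, f); // drop the cancelled entry and retry
            } catch (ExecutionException e) {
                cache.remove(key, f); // do not cache failures
                throw new IllegalStateException("construction failed for " + key, e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                throw new IllegalStateException("interrupted while waiting for " + key, e);
            }
        }
    }
}
```

With this shape, the factory runs at most once per key for as long as the entry stays in the map, and no lock is held on the whole cache while a value is being constructed. Eviction, close handling, and the real cache key are deliberately left out of the sketch.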


