Posted to common-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2016/06/02 15:41:59 UTC

[jira] [Updated] (HADOOP-10048) LocalDirAllocator should avoid holding locks while accessing the filesystem

     [ https://issues.apache.org/jira/browse/HADOOP-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated HADOOP-10048:
--------------------------------
    Attachment: HADOOP-10048.006.patch

bq. Otherwise, accessing of disks could be aggregated on particular disk. Thoughts?

I'm not worried about dirNumLastAccessed being somewhat random. It is already random when someone needs a write location without specifying a size.  What is more concerning is the thundering herd problem, where a bunch of threads all need write locations with a size at the same time.  All or most of those threads could, in theory, end up clustering on the same disk, which is less than ideal.  I'm attaching a new patch that uses an AtomicInteger to make sure that simultaneous threads won't get the same starting point when searching the directories.

This approach doesn't completely solve the clustering issue when one or more disks get too full to satisfy the requests.  An alternative would be to use a random starting location, as is done when the size is not specified.  I went with the AtomicInteger approach since it is closer to the original semantics without adding the undesired locking that would be necessary to guarantee them.
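
For illustration only, here is a minimal sketch of the starting-point idea described above. It is not the code in HADOOP-10048.006.patch. The class RoundRobinStart, its methods nextStart and pickDir, and the availableBytes array are all hypothetical names, and the array stands in for the real per-directory free-space check, which would touch the filesystem.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; not taken from the attached patch.
final class RoundRobinStart {
    // Shared counter; each caller atomically claims a distinct value.
    private final AtomicInteger next = new AtomicInteger(0);
    private final int numDirs;

    RoundRobinStart(int numDirs) {
        if (numDirs <= 0) {
            throw new IllegalArgumentException("numDirs must be positive");
        }
        this.numDirs = numDirs;
    }

    // Returns the directory index where this caller should begin searching.
    int nextStart() {
        // getAndIncrement is atomic, so concurrent callers never observe the
        // same counter value. floorMod keeps the index in [0, numDirs) even
        // after the counter wraps around to negative values.
        return Math.floorMod(next.getAndIncrement(), numDirs);
    }

    // Scans the directories once, starting at this caller's own start index,
    // and returns the first index with enough space, or -1 if none qualifies.
    // availableBytes is an assumed stand-in for a real free-space lookup and
    // must have exactly numDirs entries.
    int pickDir(long[] availableBytes, long size) {
        int start = nextStart();
        for (int i = 0; i < numDirs; i++) {
            int idx = (start + i) % numDirs;
            if (availableBytes[idx] >= size) {
                return idx;
            }
        }
        return -1;
    }
}
```

No lock is held while the directories are scanned; the only shared state is the counter. The sketch also shows the limitation mentioned above: with three directories where the middle one is full, callers that start at index 1 and at index 2 both land on directory 2.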

> LocalDirAllocator should avoid holding locks while accessing the filesystem
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10048
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10048
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: HADOOP-10048.003.patch, HADOOP-10048.004.patch, HADOOP-10048.005.patch, HADOOP-10048.006.patch, HADOOP-10048.patch, HADOOP-10048.trunk.patch
>
>
> As noted in MAPREDUCE-5584 and HADOOP-7016, LocalDirAllocator can be a bottleneck for multithreaded setups like the ShuffleHandler.  We should consider moving to a lockless design or minimizing the critical sections to a very small amount of time that does not involve I/O operations.


