Posted to mapreduce-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2014/07/30 23:54:41 UTC

[jira] [Resolved] (MAPREDUCE-1914) TrackerDistributedCacheManager never cleans its input directories

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved MAPREDUCE-1914.
-----------------------------------------

    Resolution: Fixed

> TrackerDistributedCacheManager never cleans its input directories
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-1914
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1914
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: MAPREDUCE-1914--2010-07-30--1336.patch
>
>
> When we localize a file into a node's cache, it is installed in a directory whose subroot is named by a random {{long}}.  These {{long}}-named directories all sit in a single flat directory [per disk, per cluster node].  When the cached file is no longer needed, its reference count drops to zero in a tracking data structure.  The file then becomes eligible for deletion once the total space occupied by cached files exceeds 10G [by default] or the total number of such files exceeds 10K.
> However, when we delete a cached file, we don't delete the directory that contains it.  In particular, the entries of the flat directory are never removed, so they accumulate until they reach a system limit, 32K in some cases, and then the node stops working.
> We need to delete the containing {{long}}-named directory, i.e. the entry in the flat directory, when we delete the localized cache file it contains.
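
The cleanup described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the attached patch: the class name CacheCleanupSketch and its methods are hypothetical, and it uses plain java.nio.file calls on a local path instead of Hadoop's own file system classes. It deletes the cached file and then removes the now-empty directories above it, up to and including the random-long entry in the flat directory, while leaving the flat directory itself in place.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; names are illustrative and not taken from Hadoop.
final class CacheCleanupSketch {

    /** Deletes a file or a whole directory tree, children before parents. */
    static void deleteRecursively(Path target) throws IOException {
        if (!Files.exists(target)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(target)) {
            // Reverse lexicographic order visits children before their parents.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Returns true if the directory has no entries. */
    static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }

    /**
     * Deletes the localized file, then removes the empty directories above it
     * up to and including its entry in the flat directory. The flat directory
     * itself is never deleted. Returns true if that entry was removed.
     */
    static boolean deleteLocalizedFile(Path flatDir, Path cachedFile) throws IOException {
        Path flat = flatDir.toAbsolutePath().normalize();
        Path file = cachedFile.toAbsolutePath().normalize();
        if (file.equals(flat) || !file.startsWith(flat)) {
            throw new IllegalArgumentException(file + " is not inside " + flat);
        }
        // The subroot is the first path element below the flat directory.
        Path subroot = flat.resolve(flat.relativize(file).getName(0));

        deleteRecursively(file);

        Path dir = file.getParent();
        while (dir != null && dir.startsWith(subroot)) {
            if (!isEmpty(dir)) {
                return false; // something else still lives here; leave it alone
            }
            Files.delete(dir);
            dir = dir.getParent();
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path flat = Files.createTempDirectory("flat-cache");
        Path subroot = Files.createDirectories(flat.resolve("-5012345678901234567"));
        Path cached = Files.write(subroot.resolve("job.jar"), new byte[] {1, 2, 3});
        boolean removed = deleteLocalizedFile(flat, cached);
        System.out.println("subroot removed: " + removed
                + ", still exists: " + Files.exists(subroot));
        Files.delete(flat);
    }
}
```

Stopping at the first non-empty directory is a deliberately conservative choice in this sketch: it avoids removing a subroot that still holds other content, at the cost of leaving that entry behind until a later cleanup.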



--
This message was sent by Atlassian JIRA
(v6.2#6252)