Posted to mapreduce-issues@hadoop.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2013/09/14 04:23:51 UTC

[jira] [Commented] (MAPREDUCE-5508) JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767308#comment-13767308 ] 

Sandy Ryza commented on MAPREDUCE-5508:
---------------------------------------

Have you tested this fix?  I took a deeper look, and it doesn't appear that tempDirFs and fs ever end up equal, because tempDirFs is created with the wrong UGI.

The deeper problem, to me, is that every time CleanupQueue#deletePath is called with a null UGI, we create a new UGI, which can have a new subject, which in turn can create a new entry in the FS cache.  This occurs here:
{code}
        CleanupQueue.getInstance().addToQueue(
            new PathDeletionContext(tempDir, conf));
{code}

A better fix would be to avoid this, either by having CleanupQueue hold a UGI of the login user for use in these situations or by avoiding the doAs entirely when the given UGI is null.
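
To illustrate the mechanism, here is a minimal, self-contained model in plain Java. It is not Hadoop code: the Ugi, Key and FsCache classes and both method names are hypothetical stand-ins, and the sketch assumes that a UGI compares equal only when it wraps the same subject object and that the FS cache keys its entries on the file system URI plus the UGI. Under those assumptions, a fresh UGI per deletion adds one cache entry per call, while a single held login UGI keeps reusing the same entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class FsCacheLeakSketch {

    /** Hypothetical stand-in for a UGI: equal only if it wraps the same subject object. */
    static final class Ugi {
        final Object subject;
        Ugi(Object subject) { this.subject = subject; }
        @Override public boolean equals(Object o) {
            return o instanceof Ugi && ((Ugi) o).subject == subject;
        }
        @Override public int hashCode() { return System.identityHashCode(subject); }
    }

    /** Hypothetical stand-in for an FS cache key: file system URI plus UGI. */
    static final class Key {
        final String uri;
        final Ugi ugi;
        Key(String uri, Ugi ugi) { this.uri = uri; this.ugi = ugi; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).uri.equals(uri) && ((Key) o).ugi.equals(ugi);
        }
        @Override public int hashCode() { return Objects.hash(uri, ugi); }
    }

    /** Hypothetical stand-in for the FS cache: entries stay until explicitly removed. */
    static final class FsCache {
        private final Map<Key, Object> entries = new HashMap<>();
        Object get(String uri, Ugi ugi) {
            return entries.computeIfAbsent(new Key(uri, ugi), k -> new Object());
        }
        int size() { return entries.size(); }
    }

    /** Models the reported behaviour: each deletion builds a new UGI with a new subject. */
    public static int cacheSizeWithFreshUgiPerCall(int deletions) {
        FsCache cache = new FsCache();
        for (int i = 0; i < deletions; i++) {
            Ugi fresh = new Ugi(new Object()); // new subject, so a new cache key
            cache.get("hdfs://namenode", fresh);
        }
        return cache.size();
    }

    /** Models the first suggestion: one login UGI is held and reused for every deletion. */
    public static int cacheSizeWithHeldLoginUgi(int deletions) {
        FsCache cache = new FsCache();
        Ugi login = new Ugi(new Object()); // created once, reused below
        for (int i = 0; i < deletions; i++) {
            cache.get("hdfs://namenode", login);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh UGI per call: " + cacheSizeWithFreshUgiPerCall(1000)); // 1000
        System.out.println("held login UGI:     " + cacheSizeWithHeldLoginUgi(1000));    // 1
    }
}
```

The second suggestion, skipping the doAs when the UGI is null, would behave like the held-UGI case in this model, since the lookup would then run under whatever UGI the calling thread already has.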
                
> JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob
> ------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5508
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5508
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 1-win, 1.2.1
>            Reporter: Xi Fang
>            Assignee: Xi Fang
>            Priority: Critical
>         Attachments: MAPREDUCE-5508.patch
>
>
> MAPREDUCE-5351 fixed a memory leak but introduced another FileSystem object (see "tempDirFs") that is not properly released.
> {code} JobInProgress#cleanupJob()
>   void cleanupJob() {
> ...
>           tempDirFs = jobTempDirPath.getFileSystem(conf);
>           CleanupQueue.getInstance().addToQueue(
>               new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
> ...
>  if (tempDirFs != fs) {
>       try {
>         fs.close();
>       } catch (IOException ie) {
> ...
> }
> {code}
