Posted to mapreduce-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2011/02/12 10:08:57 UTC

[jira] Commented: (MAPREDUCE-437) JobTracker may need to close its filesystem when being terminated

    [ https://issues.apache.org/jira/browse/MAPREDUCE-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993863#comment-12993863 ] 

Steve Loughran commented on MAPREDUCE-437:
------------------------------------------

Reviewing the code in trunk, the problem is a bit more serious and relates to what happens when a cached FS instance is closed: every other holder of a reference to that instance can no longer use the filesystem.

This does not normally surface in production, as the JT runs in its own VM. It does exist in MiniMR clusters during testing, but it hasn't shown up because nobody other than me has tried to shut down an FS instance while the JT is still live.
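
To make the failure mode concrete, here is a small self-contained model. It is not Hadoop code: ToyFileSystem and ToyFsCache are hypothetical stand-ins for a filesystem client and a per-URI instance cache, and the URI is made up. The sketch only assumes that the cache hands every caller the same object, which is the behaviour described above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a filesystem client; not a Hadoop class.
class ToyFileSystem {
    private boolean closed = false;

    // Any operation on a closed instance fails, whoever closed it.
    boolean exists(String path) throws IOException {
        if (closed) {
            throw new IOException("Filesystem closed");
        }
        return path.startsWith("/");
    }

    void close() {
        closed = true;
    }
}

// Hypothetical cache that returns the same instance for the same URI.
class ToyFsCache {
    private static final Map<String, ToyFileSystem> CACHE = new HashMap<>();

    static synchronized ToyFileSystem get(String uri) {
        return CACHE.computeIfAbsent(uri, u -> new ToyFileSystem());
    }
}

class SharedInstanceDemo {
    // Returns true if closing the instance through one reference
    // breaks a second holder of the same cached instance.
    static boolean secondHolderBreaks() {
        ToyFileSystem trackerFs = ToyFsCache.get("toyfs://example");
        ToyFileSystem testFs = ToyFsCache.get("toyfs://example"); // same object
        testFs.close();
        try {
            trackerFs.exists("/jobs");
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second holder broken: " + secondHolderBreaks());
    }
}
```

In this model the "tracker" never called close() itself, yet its next filesystem call fails, which is the situation a test sharing a VM with the JT can create.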

Proposed actions:
 1-rename this issue to be more explicit: the JT must ask for a new FS instance and close it when terminated.
 2-add a test to verify that a MiniMR cluster will fail if you get the same instance and close it.
 3-have the JT get a new instance on startup/going live, and verify that test 2 now passes.
 4-have the JT close its filesystem on shutdown and set its local reference to null.
I can't think of an easy way to test #4 unless there is a method to get the JT filesystem reference.
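
Actions 3 and 4 could look roughly like the sketch below. Everything in it is hypothetical: ToyFs and ToyTracker are illustration-only classes, not the JobTracker or any Hadoop API, and getFileSystemForTesting() is an assumed accessor of the kind that would make #4 testable.

```java
import java.io.Closeable;

// Hypothetical filesystem handle; not a Hadoop class.
class ToyFs implements Closeable {
    private boolean closed = false;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical tracker that owns a private filesystem instance
// instead of sharing a cached one.
class ToyTracker {
    private ToyFs fs;

    // Action 3: obtain a fresh instance when going live.
    void start() {
        fs = new ToyFs();
    }

    // Action 4: close the instance on shutdown and drop the reference.
    void stop() {
        if (fs != null) {
            fs.close();
            fs = null;
        }
    }

    // Assumed test-only accessor so a test can inspect the reference.
    ToyFs getFileSystemForTesting() {
        return fs;
    }
}

class TrackerLifecycleDemo {
    public static void main(String[] args) {
        ToyFs someoneElses = new ToyFs();
        ToyTracker tracker = new ToyTracker();
        tracker.start();
        ToyFs owned = tracker.getFileSystemForTesting();

        // Closing an unrelated instance no longer affects the tracker.
        someoneElses.close();
        System.out.println("tracker fs closed while live: " + owned.isClosed());

        tracker.stop();
        System.out.println("tracker fs closed after stop: " + owned.isClosed());
        System.out.println("reference cleared: " + (tracker.getFileSystemForTesting() == null));
    }
}
```

A test for #4 would grab the reference before shutdown, stop the tracker, then check that the saved instance is closed and that the accessor now returns null.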

> JobTracker may need to close its filesystem when being terminated
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-437
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-437
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Steve Loughran
>            Priority: Minor
>
> This is something I've been experimenting with HADOOP-3268; I'm not sure what the right action is here.
> -currently, the JobTracker does not close() its filesystem when it is shut down. This will cause it to leak filesystem references if JobTrackers are started and stopped in the same process.
> -The TestMRServerPorts test explicitly closes the filesystem
>         jt.fs.close();
>         jt.stopTracker();
> -If you move the close() operation into the stopTracker()/terminate logic, the filesystem gets cleaned up, but TestRackAwareTaskPlacement and TestMultipleLevelCaching fail with a FilesystemClosed error (stack traces to follow)
> Should the JobTracker close its filesystem whenever it is terminated? If so, there are some tests that need to be reworked slightly to not expect the filesystem to be live after the JobTracker is taken down.
