Posted to mapreduce-issues@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2009/12/30 02:30:29 UTC

[jira] Commented: (MAPREDUCE-1345) JobTracker is slowed down because it forks subprocesses to do a df command

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795215#action_12795215 ] 

dhruba borthakur commented on MAPREDUCE-1345:
---------------------------------------------

This problem becomes acute when the JT is configured with more than 24GB of heap space and a new job arrives once every 5 seconds or so.

On most unix-y systems, one can scan /proc/diskstats to determine the amount of disk space used for each of the local dirs.
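
As a rough sketch of how the fork could be avoided entirely from inside the JVM (an illustration only, not necessarily the fix for this issue): java.io.File has had getTotalSpace(), getFreeSpace() and getUsableSpace() since Java 6, and they report partition-level numbers much like df does. The class name LocalDirSpace and its two helper methods are made up for this example, and the fallback to java.io.tmpdir is just so it runs without arguments.

```java
import java.io.File;

// Hypothetical helper: queries partition space in-process, no shell or df fork.
public class LocalDirSpace {

    // Bytes in use on the partition that holds dir (total minus free).
    public static long usedBytes(File dir) {
        return dir.getTotalSpace() - dir.getFreeSpace();
    }

    // Bytes this JVM could actually write to the partition that holds dir.
    public static long availableBytes(File dir) {
        return dir.getUsableSpace();
    }

    public static void main(String[] args) {
        // Local dirs are passed as arguments; default to the temp dir (assumption).
        String[] localDirs = args.length > 0
                ? args
                : new String[] { System.getProperty("java.io.tmpdir") };
        for (String path : localDirs) {
            File dir = new File(path);
            System.out.println(path
                    + " used=" + usedBytes(dir)
                    + " available=" + availableBytes(dir));
        }
    }
}
```

These calls return 0 for a path that does not name a partition, so a caller would want to treat 0 total space as "unknown" rather than "full".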

> JobTracker is slowed down because it forks subprocesses to do a df command
> --------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1345
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1345
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: dhruba borthakur
>            Assignee: Scott Chen
>
> The JobTracker periodically does a df on the local directories. It forks a shell to run a df command. The creation of the separate process is very slow because the process address space is copied by the OS on every subprocess creation. This becomes worse when the JT is configured to use a large heap space.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.