Posted to mapreduce-issues@hadoop.apache.org by "Vinod K V (JIRA)" <ji...@apache.org> on 2009/11/02 10:11:59 UTC

[jira] Updated: (MAPREDUCE-1100) User's task-logs filling up local disks on the TaskTrackers

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated MAPREDUCE-1100:
---------------------------------

    Attachment: MAPREDUCE-1100-20091102.txt

Attaching a first patch.

Introducing the following configuration items:
 - Job Configuration:
    -- {{JobContext.MAP_USERLOG_LIMIT}} : Per-task limit on how large each log file of a map task can grow. Used by {{killRunningTasksOverLimit()}} to kill tasks that log excessively.
    -- {{JobContext.REDUCE_USERLOG_LIMIT}} : Same as above, for reduces.
    -- {{JobContext.MAP_USERLOG_RETAIN_SIZE}} : Per-task setting for how much of the tail of each log file is retained. Each task-log file is truncated to this size after the task finishes. Used by {{truncateLogsOfFinishedTasks()}}.
    -- {{JobContext.REDUCE_USERLOG_RETAIN_SIZE}} : Same as above, for reduces.

 - TT Configuration:
    -- {{TTConfig.TT_USERLOG_RETAIN_HOURS}} : TT configuration of how long the logs of each finished task have to be retained on this TT. Used by {{retireOldLogs()}} to clean up very old logs.
    -- {{TTConfig.TT_USERLOG_CUMULATIVE_LIMIT}} : TT configuration limiting the total disk usage of log files across all tasks. If the total usage grows beyond this limit, {{removeOldFilesToControlCumulativeUsage()}} removes old log files irrespective of their age w.r.t. {{TTConfig.TT_USERLOG_RETAIN_HOURS}}.
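
To illustrate the cumulative-limit idea, here is a minimal sketch of an oldest-first eviction pass. It is not the code from the attached patch: the class {{CumulativeLogLimiter}}, the method {{removeOldestUntilUnderLimit()}} and its parameters are hypothetical names, and it assumes that "old" means oldest modification time and that the limit is expressed in bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration only; not part of the patch.
class CumulativeLogLimiter {

  /**
   * Deletes the oldest log files (by modification time) until the total
   * size of the remaining files is at or below limitBytes.
   * Returns the files that were removed, oldest first.
   */
  static List<Path> removeOldestUntilUnderLimit(List<Path> logFiles, long limitBytes)
      throws IOException {
    // Sort a copy so the caller's list is left untouched.
    List<Path> sorted = new ArrayList<>(logFiles);
    sorted.sort(Comparator.comparingLong((Path p) -> p.toFile().lastModified()));

    // Current cumulative usage across all log files.
    long total = 0;
    for (Path p : sorted) {
      total += p.toFile().length();
    }

    // Remove files oldest-first, regardless of their age, until under the limit.
    List<Path> removed = new ArrayList<>();
    for (Path p : sorted) {
      if (total <= limitBytes) {
        break;
      }
      long size = p.toFile().length();
      Files.deleteIfExists(p);
      total -= size;
      removed.add(p);
    }
    return removed;
  }
}
```

Note that the sketch ignores retain-hours entirely, which mirrors the description above: once the cumulative limit is exceeded, age relative to {{TTConfig.TT_USERLOG_RETAIN_HOURS}} no longer protects a file.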

Moved clean-up of task-logs from the child into TaskLogsMonitor, which does the following:
{code}
while (true) {

  retireOldLogs(); // remove very old logs

  truncateLogsOfFinishedTasks(); // truncate finished tasks' logs and make them non-writable

  killRunningTasksOverLimit(); // kill tasks going over the per-task, per-file limit

  removeOldFilesToControlCumulativeUsage(); // remove old logs, irrespective of retain.hours, if total usage exceeds the cumulative limit
}
{code}
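
As an illustration of the truncation step, the following is a rough sketch of keeping only the tail of a finished task's log file. It is not taken from the patch: {{TailTruncator}} and {{truncateToTail()}} are hypothetical names, the retain size is assumed to be in bytes and to fit in an {{int}}, and the file is assumed to have no active writer (the task has finished).

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of the patch.
class TailTruncator {

  /**
   * Truncates logFile in place so that only its last retainBytes bytes
   * remain. Returns the number of bytes dropped (0 if the file was
   * already small enough).
   */
  static long truncateToTail(Path logFile, long retainBytes) throws IOException {
    if (retainBytes < 0 || retainBytes > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("retainBytes out of range: " + retainBytes);
    }
    long size = Files.size(logFile);
    if (size <= retainBytes) {
      return 0; // nothing to truncate
    }
    byte[] tail = new byte[(int) retainBytes];
    try (RandomAccessFile raf = new RandomAccessFile(logFile.toFile(), "rw")) {
      // Read the tail that should survive.
      raf.seek(size - retainBytes);
      raf.readFully(tail);
      // Move it to the start of the file and cut off the rest.
      raf.seek(0);
      raf.write(tail);
      raf.setLength(retainBytes);
    }
    return size - retainBytes;
  }
}
```

The sketch buffers the whole tail in memory, which is reasonable only for modest retain sizes; a streaming copy would be needed for large ones. It also does not change file permissions, which the monitor described above does as a separate part of the same step.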

> User's task-logs filling up local disks on the TaskTrackers
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-1100
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1100
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
>            Assignee: Vinod K V
>         Attachments: MAPREDUCE-1100-20091102.txt
>
>
> Some users' jobs are filling up TT disks with excessive logging. mapreduce.task.userlog.limit.kb is not enabled on the cluster, and disks fill up before task-log cleanup via mapred.task.userlog.retain.hours can kick in.
