Posted to mapreduce-issues@hadoop.apache.org by "Travis Crawford (JIRA)" <ji...@apache.org> on 2011/06/13 22:22:51 UTC

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated MAPREDUCE-2589:
---------------------------------------

    Attachment: cleanup_userlogs.py

We see this on our clusters too.

Attached is a script that I run from cron to clean up old userlogs. The general idea is to set a high water mark for userlog disk space and, once usage passes it, delete logs until usage drops below a low water mark. Logs for running jobs are excluded from cleanup; this has occasionally caused issues, but in general they are worth excluding.
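
For readers without the attachment, here is a minimal sketch of the high/low water mark idea. It is not the attached cleanup_userlogs.py. The function names, the one-subdirectory-per-job layout, the oldest-first deletion order, and the way running jobs are passed in are all assumptions made for illustration.

```python
import os
import shutil


def dir_size(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file disappeared while walking; ignore it
    return total


def cleanup_userlogs(userlog_root, high_water, low_water, running=frozenset()):
    """Delete log directories, oldest first, once usage exceeds high_water.

    Assumes (hypothetically) one subdirectory per job under userlog_root.
    Deletion stops when usage is at or below low_water. Directories whose
    names are in `running` are never deleted.
    Returns the names of the directories that were removed.
    """
    entries = []
    for name in os.listdir(userlog_root):
        path = os.path.join(userlog_root, name)
        if os.path.isdir(path) and not os.path.islink(path):
            entries.append((os.path.getmtime(path), name, path, dir_size(path)))

    used = sum(size for _mtime, _name, _path, size in entries)
    if used <= high_water:
        return []  # below the high water mark: nothing to do

    removed = []
    for _mtime, name, path, size in sorted(entries):  # oldest mtime first
        if used <= low_water:
            break
        if name in running:
            continue  # never touch logs of running jobs
        shutil.rmtree(path, ignore_errors=True)
        used -= size
        removed.append(name)
    return removed
```

Because running jobs are skipped, usage can stay above the low water mark when most of the space belongs to running jobs, which may be the kind of occasional issue mentioned above.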

I'm posting it as an example of what the replacement might look like (as an internal periodic task, of course). I'm also not sure how the nextgen work deals with cleanup.

> TaskTracker not purging userlog directories
> -------------------------------------------
>
>                 Key: MAPREDUCE-2589
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2589
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.205.0
>         Environment: 0.20.205
>            Reporter: Sherry Chen
>            Assignee: Sherry Chen
>            Priority: Minor
>         Attachments: cleanup_userlogs.py
>
>
> UserLogCleaner is not robust. Leftover userlogs after a restart sometimes have to be manually
> cleaned. Things can accumulate over a period of time.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira