Posted to mapreduce-issues@hadoop.apache.org by "Sherry Chen (JIRA)" <ji...@apache.org> on 2011/06/13 20:50:51 UTC

[jira] [Created] (MAPREDUCE-2589) TaskTracker not purging userlog directories

TaskTracker not purging userlog directories
-------------------------------------------

                 Key: MAPREDUCE-2589
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2589
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.20.205.0
         Environment: 0.20.205
            Reporter: Sherry Chen
            Assignee: Sherry Chen
            Priority: Minor


UserLogCleaner is not robust. Leftover userlogs after a restart sometimes have to be manually
cleaned. Things can accumulate over a period of time.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2589:
-----------------------------------

    Attachment: MAPREDUCE-2589_1.patch

Fixed the typo.
The ant tests passed, and I tested the patch manually on a single-node cluster.


> TaskTracker not purging userlog directories
> -------------------------------------------
>
>                 Key: MAPREDUCE-2589
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2589
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.205.0
>         Environment: 0.20.205
>            Reporter: Sherry Chen
>            Assignee: Sherry Chen
>            Priority: Minor
>             Fix For: 0.20.205.0
>
>         Attachments: MAPREDUCE-2589.patch, MAPREDUCE-2589_1.patch, cleanup_userlogs.py
>
>
> UserLogCleaner is not robust. Leftover userlogs after a restart sometimes have to be manually
> cleaned. Things can accumulate over a period of time.


        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2589:
-----------------------------------

    Resolution: Won't Fix
        Status: Resolved  (was: Patch Available)


        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated MAPREDUCE-2589:
-------------------------------------

    Fix Version/s: 0.20.205.0

The patch looks good.

One minor nit:

 I think the variable name below:

{quote}
  long logRetainiMillSec = DEFAULT_USER_LOG_RETAIN_MAX_HOURS * 60 * 60 * 1000;
{quote}
was supposed to be logRetainMilliSec? (spelling mistake?)

Also, can you please post the ant test results on the jira? 

The patch lacks unit tests; have you already verified the fix on a small cluster?


        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070295#comment-13070295 ] 

Mahadev konar commented on MAPREDUCE-2589:
------------------------------------------

Sherry,
 Sorry, I looked at it again and I think it would be good to make DEFAULT_USER_LOG_RETAIN_MAX_HOURS configurable. Also, -1 should disable the feature. I think it's important to be able to switch off a misbehaving feature through configuration.
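
For illustration, a minimal sketch of a configurable limit where -1 disables the cleanup. The property name `mapred.userlog.retain.max.hours` and the class name are hypothetical, and a plain `java.util.Properties` stands in for the TaskTracker's configuration object; none of this is taken from the patch.

```java
import java.util.Properties;

// Sketch only: the property key and class name are hypothetical, not from the patch.
class UserLogRetainConfig {
    static final String RETAIN_MAX_HOURS_KEY = "mapred.userlog.retain.max.hours"; // hypothetical key
    static final int DEFAULT_USER_LOG_RETAIN_MAX_HOURS = 7 * 24;
    static final long DISABLED = -1L;

    /** Returns the retention limit in milliseconds, or DISABLED when the setting is -1. */
    static long retainMillis(Properties conf) {
        String raw = conf.getProperty(RETAIN_MAX_HOURS_KEY);
        int hours = (raw == null) ? DEFAULT_USER_LOG_RETAIN_MAX_HOURS : Integer.parseInt(raw.trim());
        if (hours == -1) {
            return DISABLED; // caller should skip the restart-time cleanup entirely
        }
        if (hours < 0) {
            throw new IllegalArgumentException(RETAIN_MAX_HOURS_KEY + " must be -1 or non-negative");
        }
        // Multiply as long so that large hour values cannot overflow an int.
        return hours * 60L * 60L * 1000L;
    }
}
```

A caller would check for `DISABLED` before scanning the userlog directory, so an operator can turn the feature off without a code change.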

Also, why do we need a call to get the jobs that are still running? I thought the call was made only on restart/reinit. We should be able to clean old user logs without calling jc.jobstocomplete. I think we should avoid adding a dependency on calling jobtracker client methods in the tasktracker itself. What do you think?


        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064759#comment-13064759 ] 

Hadoop QA commented on MAPREDUCE-2589:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12482575/MAPREDUCE-2589.patch
  against trunk revision 1145889.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/464//console

This message is automatically generated.


        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2589:
-----------------------------------

    Release Note: Won't Fix


        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2589:
-----------------------------------

    Status: Patch Available  (was: Open)


        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sherry Chen updated MAPREDUCE-2589:
-----------------------------------

    Attachment: MAPREDUCE-2589.patch

When the TaskTracker restarts, the attached fix deletes leftover user logs that were
last modified at least 7 days ago and do not belong to any running job.
UserLogCleaner still takes care of normal user log cleanup.
DEFAULT_USER_LOG_RETAIN_HOURS is 1 day, so I added a new
DEFAULT_USER_LOG_RETAIN_MAX_HOURS of 7 days.
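
For illustration, the selection rule described here can be sketched roughly as follows. The class and method names are made up for this example, and the sketch assumes each job's log directory is named after its job ID and that the set of running job IDs is already available; the patch itself may differ.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch only: class and method names are illustrative and do not come from the patch.
class StaleUserLogSelector {
    // 7 days, expressed in hours like the constant named in the fix description.
    static final int DEFAULT_USER_LOG_RETAIN_MAX_HOURS = 7 * 24;

    /** True when the directory is old enough to delete and not owned by a running job. */
    static boolean isStale(String jobId, long lastModifiedMillis,
                           Set<String> runningJobIds, long nowMillis, long retainMillis) {
        if (runningJobIds.contains(jobId)) {
            return false;
        }
        return nowMillis - lastModifiedMillis >= retainMillis;
    }

    /** Lists the job log directories under userLogRoot that isStale() accepts. */
    static List<File> selectStale(File userLogRoot, Set<String> runningJobIds, long nowMillis) {
        long retainMillis = DEFAULT_USER_LOG_RETAIN_MAX_HOURS * 60L * 60L * 1000L;
        List<File> stale = new ArrayList<>();
        File[] children = userLogRoot.listFiles();
        if (children == null) {
            return stale; // root is missing or is not a directory
        }
        for (File dir : children) {
            if (dir.isDirectory()
                    && isStale(dir.getName(), dir.lastModified(), runningJobIds, nowMillis, retainMillis)) {
                stale.add(dir);
            }
        }
        return stale;
    }
}
```

The sketch only selects directories; the actual deletion would be handed to whatever cleanup mechanism the TaskTracker already uses.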

I would like to open a separate JIRA to handle cleaning up old userlogs based on a
userlog disk-space water mark, since that may have to deal with changes to job
configuration settings.



        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070916#comment-13070916 ] 

Sherry Chen commented on MAPREDUCE-2589:
----------------------------------------

Mahadev,

Per Bharath (Mundlapudi), his change in MAPREDUCE-2415 distributes TaskTracker userlogs across multiple
disks; it went into 0.20.204.
The changes in MAPREDUCE-2415 strengthen TaskTracker reliability with regard to disk failures and also
include addOldUserLogsForDeletion(), which adds the job log directories for deletion with the default retain hours.

I tested with the MAPREDUCE-2415 change in: all old job log directories are deleted after the default retain hours.

I think my change here is not necessary, and I would like to withdraw it. Is that OK with you?



        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069288#comment-13069288 ] 

Hadoop QA commented on MAPREDUCE-2589:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12487372/MAPREDUCE-2589_1.patch
  against trunk revision 1149323.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/493//console

This message is automatically generated.


        

[jira] [Updated] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated MAPREDUCE-2589:
---------------------------------------

    Attachment: cleanup_userlogs.py

We see this on our clusters too.

Attached is a script that I run from cron to clean up old userlogs. The general idea is to set a high water mark for userlog disk space and, once it is passed, delete logs until usage drops below a low water mark. Logs for running jobs are excluded from cleanup; this has occasionally caused issues, but in general they are worth excluding.

Posting as an example of what the replacement might look like (as an internal periodic task, of course). Also, I'm not sure how the nextgen stuff deals with cleanup.
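
For illustration, the water-mark idea could look roughly like this as an in-process routine. This is not the attached cleanup_userlogs.py: the class, the `LogDir` holder, and the oldest-first deletion order are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Sketch only: illustrates the high/low water-mark idea, not the attached script.
class WaterMarkCleaner {
    /** One job's userlog directory: job id, size in bytes, last-modified time. */
    static final class LogDir {
        final String jobId;
        final long bytes;
        final long lastModified;

        LogDir(String jobId, long bytes, long lastModified) {
            this.jobId = jobId;
            this.bytes = bytes;
            this.lastModified = lastModified;
        }
    }

    /**
     * Returns the job ids whose logs should be deleted. Nothing is selected until
     * total usage exceeds highWaterBytes; after that, non-running jobs are selected
     * oldest first (an assumption) until usage is at or below lowWaterBytes.
     */
    static List<String> selectForDeletion(List<LogDir> dirs, Set<String> runningJobIds,
                                          long highWaterBytes, long lowWaterBytes) {
        long used = 0;
        for (LogDir d : dirs) {
            used += d.bytes;
        }
        List<String> toDelete = new ArrayList<>();
        if (used <= highWaterBytes) {
            return toDelete;
        }
        List<LogDir> candidates = new ArrayList<>();
        for (LogDir d : dirs) {
            if (!runningJobIds.contains(d.jobId)) {
                candidates.add(d); // logs of running jobs are never candidates
            }
        }
        candidates.sort(Comparator.comparingLong((LogDir d) -> d.lastModified));
        for (LogDir d : candidates) {
            if (used <= lowWaterBytes) {
                break;
            }
            toDelete.add(d.jobId);
            used -= d.bytes;
        }
        return toDelete;
    }
}
```

Because running jobs are skipped, usage can stay above the low water mark when most of the space belongs to running jobs, which may be the kind of issue mentioned above.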


        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Sherry Chen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062159#comment-13062159 ] 

Sherry Chen commented on MAPREDUCE-2589:
----------------------------------------

Aaron,
I believe MAPREDUCE-1100 and related JIRAs change how task-logs work. We may not need to have this patch for trunk at this point.


        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049310#comment-13049310 ] 

Aaron T. Myers commented on MAPREDUCE-2589:
-------------------------------------------

Hi Sherry, does this issue not also affect trunk? If so, would you mind preparing a trunk patch as well?


        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070918#comment-13070918 ] 

Mahadev konar commented on MAPREDUCE-2589:
------------------------------------------

Sherry,
 That sounds good to me! Please resolve this as WONT FIX.


        

[jira] [Commented] (MAPREDUCE-2589) TaskTracker not purging userlog directories

Posted by "Devaraj K (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067516#comment-13067516 ] 

Devaraj K commented on MAPREDUCE-2589:
--------------------------------------

One improvement can be made to the patch: currently, for every file in the user log directory, it fetches the list of jobs still to be completed and then checks against it. Instead, it can fetch the job list once and then check, for each file in the user log directory, whether it belongs to a running job.
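
For illustration, a minimal sketch of that change, with the job lookup hoisted out of the loop. The `runningJobsSource` supplier is a stand-in for whatever call the patch uses to list unfinished jobs, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Sketch only: runningJobsSource is a stand-in for the patch's job-listing call.
class RunningJobFilter {
    /** Returns the log directory names that do not belong to a running job. */
    static List<String> notRunning(List<String> logDirNames,
                                   Supplier<Set<String>> runningJobsSource) {
        // Fetch the running-job list once, before the loop, instead of once per directory.
        Set<String> running = new HashSet<>(runningJobsSource.get());
        List<String> result = new ArrayList<>();
        for (String name : logDirNames) {
            if (!running.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

With the list held in a hash set, each directory costs one constant-time lookup, and the job-listing call happens once per cleanup pass regardless of how many directories exist.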
