Posted to mapreduce-issues@hadoop.apache.org by "Benoy Antony (JIRA)" <ji...@apache.org> on 2012/07/25 20:02:34 UTC

[jira] [Created] (MAPREDUCE-4481) User Log Retention across TT restarts

Benoy Antony created MAPREDUCE-4481:
---------------------------------------

             Summary: User Log Retention across TT restarts
                 Key: MAPREDUCE-4481
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4481
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: tasktracker
    Affects Versions: 1.0.0
            Reporter: Benoy Antony
            Assignee: Benoy Antony
            Priority: Minor


The TaskTrackers clean up the userlog directory when they restart.
This happens regardless of the value of mapred.userlog.retain.hours.

The proposal is to add a configurable option that makes the TaskTracker respect mapred.userlog.retain.hours across TT restarts.
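
To illustrate the requested behavior, here is a minimal sketch of a retention-aware selection step, assuming log directories are compared by last-modified time against a retention window in hours. The class and method names (UserLogRetentionSketch, isExpired, selectExpired) are hypothetical and are not part of the TaskTracker code; the sketch only decides which directories are old enough to delete and does not touch the filesystem.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not TaskTracker code. All names here are illustrative.
class UserLogRetentionSketch {

    // True when a directory last modified at lastModifiedMillis is older than
    // the retention window (given in hours) as of nowMillis.
    static boolean isExpired(long lastModifiedMillis, long nowMillis, int retainHours) {
        long retainMillis = TimeUnit.HOURS.toMillis(retainHours);
        return nowMillis - lastModifiedMillis > retainMillis;
    }

    // Returns, in sorted order, the names of the directories that a
    // restart-time cleanup could delete while still honoring the window.
    static List<String> selectExpired(Map<String, Long> lastModifiedByDir,
                                      long nowMillis, int retainHours) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : new TreeMap<>(lastModifiedByDir).entrySet()) {
            if (isExpired(e.getValue(), nowMillis, retainHours)) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        long now = TimeUnit.HOURS.toMillis(100);
        Map<String, Long> dirs = new TreeMap<>();
        dirs.put("job_old", TimeUnit.HOURS.toMillis(10));    // 90 hours old
        dirs.put("job_recent", TimeUnit.HOURS.toMillis(90)); // 10 hours old
        // With a 24-hour window only job_old qualifies for deletion.
        System.out.println(selectExpired(dirs, now, 24)); // prints [job_old]
    }
}
```

With a check like this at startup, a restart would remove only logs that the retention setting already allows to be removed.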



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated MAPREDUCE-4481:
-------------------------------

    Affects Version/s: 0.22.0

Thanks very much Benoy!

Reopen reason: This should still affect 0.22 because MAPREDUCE-1213 is in it. Let's try to fix it for 0.22.x.
                


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Benoy Antony (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426894#comment-13426894 ] 

Benoy Antony commented on MAPREDUCE-4481:
-----------------------------------------

Good point about porting MAPREDUCE-2415 to 0.22.
A related question is whether to port MAPREDUCE-1213 to 1.1.
                


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Benoy Antony (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423603#comment-13423603 ] 

Benoy Antony commented on MAPREDUCE-4481:
-----------------------------------------

Sure, I will add the steps to reproduce.
                


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422814#comment-13422814 ] 

Arun C Murthy commented on MAPREDUCE-4481:
------------------------------------------

What is the use case?
                


        

[jira] [Resolved] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Benoy Antony (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony resolved MAPREDUCE-4481.
-------------------------------------

    Resolution: Not A Problem
    


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426757#comment-13426757 ] 

Harsh J commented on MAPREDUCE-4481:
------------------------------------

Ah yes, my bad. Would 0.22 benefit from MAPREDUCE-2415 though? If it's not needed, we can probably close this out again.
                


        

[jira] [Updated] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Benoy Antony (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benoy Antony updated MAPREDUCE-4481:
------------------------------------

    Affects Version/s:     (was: 1.0.0)
    


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422806#comment-13422806 ] 

Harsh J commented on MAPREDUCE-4481:
------------------------------------

No, MAPREDUCE-2415 should not have broken this:

{code}
   public void cleanupStorage() throws IOException {
-    this.fConf.deleteLocalFiles();
+    this.fConf.deleteLocalFiles(SUBDIR);
+    this.fConf.deleteLocalFiles(TT_PRIVATE_DIR);
+    this.fConf.deleteLocalFiles(TT_LOG_TMP_DIR);
   }
{code}

As the diff shows, cleanupStorage() now deletes only specific sub-dirs, rather than everything, when cleaning the local dirs.
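
The effect of that change can be sketched as follows. This is a hypothetical illustration, not the TaskTracker implementation: the class SelectiveCleanupSketch and the method cleanupTargets are invented for the example, and the string values given to SUBDIR, TT_PRIVATE_DIR and TT_LOG_TMP_DIR, as well as the userlogs directory name and the example local dir path, are assumptions. It only computes which paths a selective cleanup would target.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the TaskTracker implementation.
class SelectiveCleanupSketch {
    // Assumed literal values for the constants named in the diff.
    static final String SUBDIR = "taskTracker";
    static final String TT_PRIVATE_DIR = "ttprivate";
    static final String TT_LOG_TMP_DIR = "tt_log_tmp";

    // Lists the paths a selective cleanup would target: one entry per
    // (local dir, named sub-dir) pair and nothing else.
    static List<String> cleanupTargets(List<String> localDirs) {
        List<String> targets = new ArrayList<>();
        for (String dir : localDirs) {
            for (String sub : Arrays.asList(SUBDIR, TT_PRIVATE_DIR, TT_LOG_TMP_DIR)) {
                targets.add(dir + "/" + sub);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<String> targets = cleanupTargets(Arrays.asList("/data/1/mapred/local"));
        // A userlogs directory under the same local dir is not a target.
        System.out.println(targets.contains("/data/1/mapred/local/userlogs")); // prints false
    }
}
```

Under these assumptions, anything outside the three named sub-dirs, including a userlogs directory, survives the cleanup.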
                


        

[jira] [Assigned] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J reassigned MAPREDUCE-4481:
----------------------------------

    Assignee:     (was: Benoy Antony)
    


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422822#comment-13422822 ] 

Harsh J commented on MAPREDUCE-4481:
------------------------------------

Arun - Isn't this a bug? Restarts of a service should not break the log retention guarantee?
                


        

[jira] [Reopened] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J reopened MAPREDUCE-4481:
--------------------------------

    


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Harsh J (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422830#comment-13422830 ] 

Harsh J commented on MAPREDUCE-4481:
------------------------------------

I could not reproduce this on 1.0.3. I restarted the TT and re-initialized it (via a JT restart), but the MR local directory userlogs/ still exists and continues to be symlinked inside the logs dir (and nowhere in the code can I currently find a full delete of TaskLog.USERLOGS_DIR_NAME). Can you post some steps to reproduce this?
                


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Benoy Antony (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426756#comment-13426756 ] 

Benoy Antony commented on MAPREDUCE-4481:
-----------------------------------------

This will not impact 0.22. MAPREDUCE-2415 was not ported to 0.22, so the userlogs directory will not be under the scratch directories.

                


        

[jira] [Commented] (MAPREDUCE-4481) User Log Retention across TT restarts

Posted by "Benoy Antony (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13425927#comment-13425927 ] 

Benoy Antony commented on MAPREDUCE-4481:
-----------------------------------------

This issue occurs only in distributions where MAPREDUCE-2415 is applied and MRAsyncDiskService is used to clean up the volumes during TT startup.
It is not applicable to 1.0 or 1.1, since MRAsyncDiskService is not present in those.
I don't think it is applicable to trunk.
                
