Posted to yarn-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2013/08/19 17:32:47 UTC

[jira] [Commented] (YARN-194) Log handling in case of NM restart.

    [ https://issues.apache.org/jira/browse/YARN-194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13743902#comment-13743902 ] 

Jason Lowe commented on YARN-194:
---------------------------------

The NM waits not just for the container to complete but for the entire application to complete before aggregating -- see YARN-219.  Keeping the aggregated files open in the meantime would mean holding long-lived leases on many files in HDFS, which puts a lot of load on the namenode.

It also cannot append "on the fly", since all the logs for all containers of an application on the node go into a single file in HDFS, with the data for each log stored contiguously within that file.  Appending to multiple log streams simultaneously is not possible with the current aggregated log format.
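
To illustrate the constraint, here is a toy model of a "one file per application per node, each log contiguous" layout. It is only a sketch: the record layout and the names `write_aggregated` and `read_aggregated` are made up for this example and are not the real aggregated log format or any NM API.

```python
import io
import struct


def write_aggregated(out, container_logs):
    """Write every log as one contiguous, length-prefixed record.

    Hypothetical layout: [container id][log name][log data], each field
    preceded by a 4-byte big-endian length.
    """
    for container_id, logs in container_logs.items():
        for log_name, data in logs.items():
            for field in (container_id.encode(), log_name.encode(), data):
                # The length goes first, so the full content of a log
                # must be known before any of it is written.
                out.write(struct.pack(">I", len(field)))
                out.write(field)


def _read_field(stream):
    """Read one length-prefixed field, or return None at end of file."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack(">I", header)
    return stream.read(length)


def read_aggregated(buf):
    """Yield (container_id, log_name, data) records in file order."""
    stream = io.BytesIO(buf)
    while True:
        container_id = _read_field(stream)
        if container_id is None:
            return
        log_name = _read_field(stream)
        data = _read_field(stream)
        yield container_id.decode(), log_name.decode(), data
```

With a layout like this, a running container that is still writing both stdout and stderr has nowhere to put new bytes: the stdout record would have to grow after the stderr record has already been written behind it.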

It would be nice to have some mechanism to get the NM to clean up logs, since currently log files are leaked each time the NM restarts.  This has been fixed for container local directories and the distributed cache via YARN-71, but logs were left out.  It seems we should handle the two consistently.  If the application is still running, isn't YARN-71 already deleting the app's current working directory and distcache files out from underneath it?
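
A rough sketch of what such a cleanup pass could look like at startup, only to make the idea concrete. Everything here is an assumption for illustration: the function name `cleanup_stale_app_logs`, the `application_` directory prefix check, and in particular the `active_app_ids` set, since whether the NM can know which applications are still running after a restart is exactly the open question above.

```python
import os
import shutil


def cleanup_stale_app_logs(log_root, active_app_ids):
    """Delete per-application log directories left over from a previous run.

    log_root       -- hypothetical NM log directory holding one
                      sub-directory per application
    active_app_ids -- hypothetical set of application ids whose logs
                      must be kept
    Returns the names of the directories that were removed.
    """
    removed = []
    for name in sorted(os.listdir(log_root)):
        path = os.path.join(log_root, name)
        # Only touch directories that look like application log dirs.
        if not os.path.isdir(path) or not name.startswith("application_"):
            continue
        if name in active_app_ids:
            continue
        shutil.rmtree(path)
        removed.append(name)
    return removed
```

Passing an empty `active_app_ids` corresponds to treating every leftover directory as stale, which is only safe if no application survives the NM restart.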
                
> Log handling in case of NM restart.
> -----------------------------------
>
>                 Key: YARN-194
>                 URL: https://issues.apache.org/jira/browse/YARN-194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 0.23.4
>            Reporter: Siddharth Seth
>            Assignee: Omkar Vinit Joshi
>
> Currently, if an NM restarts, existing logs will be left around until they are manually cleaned up. The NM could be improved to handle these files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira