Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2009/05/15 10:16:45 UTC

[jira] Created: (HADOOP-5846) Log job history events to a common dump file

Log job history events to a common dump file
--------------------------------------------

                 Key: HADOOP-5846
                 URL: https://issues.apache.org/jira/browse/HADOOP-5846
             Project: Hadoop Core
          Issue Type: New Feature
          Components: mapred
            Reporter: Amar Kamat
            Assignee: Amar Kamat


As of today, all job history events are logged to separate files. It would be nice to also dump all this info into a common file so that external tools (e.g. Chukwa) can harvest history info. Job configuration should also be dumped. Whether to use the same log file for history dumps and configuration dumps should be configurable (by default, everything goes to one file).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709761#action_12709761 ] 

Arun C Murthy commented on HADOOP-5846:
---------------------------------------

Is the proposal to add another 'log' statement in JobHistory with the lock on the JobTracker? If so, that is a slippery slope...



[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709765#action_12709765 ] 

Arun C Murthy commented on HADOOP-5846:
---------------------------------------

Haven't we had problems with job history being written to HDFS before? Won't adding another log exacerbate them? Hence, I'm trying to understand the proposed solution...



[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709770#action_12709770 ] 

Devaraj Das commented on HADOOP-5846:
-------------------------------------

The most common history configuration is to write the files to the local disk, and this JIRA does not change that model. So it will be yet another file on the local disk. Once we have fixes that avoid locking the JT on history writes, creating such files on HDFS will not be that big a deal (and long term, that is the goal).



[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709795#action_12709795 ] 

Devaraj Das commented on HADOOP-5846:
-------------------------------------

I forgot to mention that this logging would use log4j. Sorry about that.



[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709763#action_12709763 ] 

Devaraj Das commented on HADOOP-5846:
-------------------------------------

Ideally we should implement a queue where we dump the history data and a thread that processes that queue asynchronously. But that could be done in a later jira. This jira is meant to help Chukwa folks make better sense out of the history data.
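
The queue-plus-thread idea above could look roughly like the following. This is only a minimal sketch under stated assumptions: the class name AsyncHistoryDumper, its log() and close() methods, and the Consumer-based sink are all hypothetical and are not part of Hadoop's JobHistory code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical sketch of an asynchronous history dumper; not Hadoop's API. */
public class AsyncHistoryDumper {
    // Sentinel that tells the writer thread to stop. It is compared by
    // identity, so a real event with the same text is never mistaken for it.
    private static final String STOP = new String("__STOP__");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writer;

    /** The sink is an assumption; a real one would append to the dump file. */
    public AsyncHistoryDumper(Consumer<String> sink) {
        writer = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take(); // blocks until an event arrives
                    if (event == STOP) {
                        return;
                    }
                    sink.accept(event);          // I/O happens on this thread only
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "history-dumper");
        writer.setDaemon(true);
        writer.start();
    }

    /** Called by the producer; only enqueues and never performs I/O. */
    public void log(String event) {
        queue.add(event);
    }

    /** Lets the writer drain everything queued so far, then stops it. */
    public void close() throws InterruptedException {
        queue.add(STOP);
        writer.join();
    }
}
```

With a design like this, a caller that holds a lock pays only for an in-memory enqueue in log(), while the writes themselves happen on the separate writer thread. The queue here is unbounded, which is a simplification; a bounded queue would need a policy for what to do when it fills up.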



[jira] Commented: (HADOOP-5846) Log job history events to a common dump file

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710331#action_12710331 ] 

Steve Loughran commented on HADOOP-5846:
----------------------------------------

Even if the logs go to the local filesystem today, is it not possible to run something after the work has completed (on the same machines as the log files) to push those logs into the DFS, and from there into something that can merge the logs from different machines into one continuous timeline (assuming such a timeline exists and can be determined)?
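
As an illustration of the merge step in that question, here is a minimal sketch that combines per-machine logs into one timeline. It assumes that every event carries a timestamp, that each machine's log is already sorted by it, and that the clocks are comparable across machines, which is exactly the caveat raised above. The names HistoryTimelineMerge, Event and merge are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch: k-way merge of per-host logs sorted by timestamp. */
public class HistoryTimelineMerge {
    /** One log record; the fields are assumptions made for illustration. */
    public static final class Event {
        public final long timestampMillis;
        public final String host;
        public final String text;

        public Event(long timestampMillis, String host, String text) {
            this.timestampMillis = timestampMillis;
            this.host = host;
            this.text = text;
        }
    }

    /** Tracks the next unread event of one host's log. */
    private static final class Cursor implements Comparable<Cursor> {
        Event head;
        final Iterator<Event> rest;

        Cursor(Iterator<Event> events) {
            this.rest = events;
            this.head = events.next();
        }

        @Override
        public int compareTo(Cursor other) {
            return Long.compare(head.timestampMillis, other.head.timestampMillis);
        }
    }

    /** Merges logs that are each already sorted by timestamp. */
    public static List<Event> merge(List<List<Event>> perHostLogs) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<Event> log : perHostLogs) {
            if (!log.isEmpty()) {
                heap.add(new Cursor(log.iterator()));
            }
        }
        List<Event> timeline = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor earliest = heap.poll();   // host whose next event is oldest
            timeline.add(earliest.head);
            if (earliest.rest.hasNext()) {
                earliest.head = earliest.rest.next();
                heap.add(earliest);          // re-queue with its new head
            }
        }
        return timeline;
    }
}
```

If the machines' clocks drift relative to each other, the merged order is only approximate, so a sketch like this is useful only to the extent that a common timeline can actually be determined.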


