Posted to hdfs-user@hadoop.apache.org by Christian Schneider <cs...@gmail.com> on 2013/03/19 11:56:38 UTC

Parsing the JobTracker Job Logs

Hi,
how can I parse the log files for our jobs? Are there existing classes I
can use?

I need to display some of this information on a web interface (similar to
what the native JobTracker does).


I am talking about files like this:

michaela 11:52:59
/var/log/hadoop-0.20-mapreduce/history/done/michaela.ixcloud.net_1363615430691_/2013/03/19/000000
# cat job_201303181503_0864_1363686587824_christian_wordCountJob_15
Meta VERSION="1" .
Job JOBID="job_201303181503_0864" JOBNAME="wordCountJob_15"
USER="christian" SUBMIT_TIME="1363686587824"
JOBCONF="hdfs://carolin\.ixcloud\.net:8020/user/christian/\.staging/job_201303181503_0864/job\.xml"
VIEW_JOB="*" MODIFY_JOB="*" JOB_QUEUE="default" .
Job JOBID="job_201303181503_0864" JOB_PRIORITY="NORMAL" .
Job JOBID="job_201303181503_0864" LAUNCH_TIME="1363686587923"
TOTAL_MAPS="1" TOTAL_REDUCES="1" JOB_STATUS="PREP" .
Task TASKID="task_201303181503_0864_m_000002" TASK_TYPE="SETUP"
START_TIME="1363686587923" SPLITS="" .
MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002"
TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0"
START_TIME="1363686594028"
TRACKER_NAME="tracker_anna\.ixcloud\.net:localhost/127\.0\.0\.1:34657"
HTTP_PORT="50060" .
MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002"
TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0"
TASK_STATUS="SUCCESS" FINISH_TIME="1363686595929"
HOSTNAME="/default/anna\.ixcloud\.net" STATE_STRING="setup"
COUNTERS="{(org\.apache\.hadoop\.mapreduce\.FileSystemCounter)(File System
Counters)[(FILE_BYTES_READ)(FILE: Number of bytes
read)(0)][(FILE_BYTES_WRITTEN)(FILE: Number of bytes
written)(152299)][(FILE_READ_OPS)(FILE: Number of read
operations)(0)][(FILE_LARGE_READ_OPS)(FILE: Number of large read
operations)(0)][(FILE_WRITE_OPS)(FILE: Number of write
operations)(0)][(HDFS_BYTES_READ)(HDFS: Number of bytes
read)(0)][(HDFS_BYTES_WRITTEN)(HDFS: Number of bytes
written)(0)][(HDFS_READ_OPS)(HDFS: Number of read
operations)(0)][(HDFS_LARGE_READ_OPS)(HDFS: Number of large read
operations)(0)][(HDFS_WRITE_OPS)(HDFS: Number of write
operations)(1)]}{(org\.apache\.hadoop\.mapreduce\.TaskCounter)(Map-Reduce
Framework)[(SPILLED_RECORDS)(Spilled Records)(0)][(CPU_MILLISECONDS)(CPU
time spent \\(ms\\))(80)][(PHYSICAL_MEMORY_BYTES)(Physical memory
\\(bytes\\) snapshot)(91693056)][(VIRTUAL_MEMORY_BYTES)(Virtual memory
\\(bytes\\) snapshot)(575086592)][(COMMITTED_HEAP_BYTES)(Total committed
heap usage
\\(bytes\\))(62324736)]}nullnullnullnullnullnullnullnullnullnullnullnullnull"

...


Best Regards,
Christian.
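[Editorial note: these pre-Avro JobHistory files are line-oriented. Each line starts with a record type (Meta, Job, Task, MapAttempt, ...) followed by KEY="value" pairs, with dots escaped as `\.` inside values and a trailing ` .` terminator. As a rough illustration of the format only (this is not Hadoop's own parser, and it leaves the nested COUNTERS structure as a raw string), a minimal Python sketch:]

```python
import re

# One KEY="value" attribute; the value may contain backslash escapes.
ATTR_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def parse_history_line(line):
    """Split one JobHistory line into (record_type, attribute_dict).

    Only the backslash-dot escape is undone here; COUNTERS values keep
    their nested {...}[...](...) structure as a raw string.
    """
    line = line.strip()
    if line.endswith(" ."):           # drop the record terminator
        line = line[:-2]
    rtype, _, rest = line.partition(" ")
    attrs = {k: v.replace("\\.", ".") for k, v in ATTR_RE.findall(rest)}
    return rtype, attrs
```

For example, the JOB_PRIORITY line above parses to `("Job", {"JOBID": "job_201303181503_0864", "JOB_PRIORITY": "NORMAL"})`.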

Re: Parsing the JobTracker Job Logs

Posted by Christian Schneider <cs...@gmail.com>.
That's nice! Thank you very much.

Now I'm trying to get Flume to work. It should collect all the files
(including the log files from the TaskTracker).

Best Regards,
Christian.


2013/3/28 Arun C Murthy <ac...@hortonworks.com>

> Use 'rumen', it's part of Hadoop.
>
> On Mar 19, 2013, at 3:56 AM, Christian Schneider wrote:
>
> Hi,
> how to parse the log files for our jobs? Are there already classes I can
> use?
>
> I need to display some information on a WebInterface (like the native
> JobTracker does).
>
>
> I am talking about this kind of files:
>
> michaela 11:52:59
> /var/log/hadoop-0.20-mapreduce/history/done/michaela.ixcloud.net_1363615430691_/2013/03/19/000000
> # cat job_201303181503_0864_1363686587824_christian_wordCountJob_15
> Meta VERSION="1" .
> Job JOBID="job_201303181503_0864" JOBNAME="wordCountJob_15"
> USER="christian" SUBMIT_TIME="1363686587824" JOBCONF="
> hdfs://carolin\.ixcloud\.net:8020/user/christian/\.staging/job_201303181503_0864/job\.xml"
> VIEW_JOB="*" MODIFY_JOB="*" JOB_QUEUE="default" .
> Job JOBID="job_201303181503_0864" JOB_PRIORITY="NORMAL" .
> Job JOBID="job_201303181503_0864" LAUNCH_TIME="1363686587923"
> TOTAL_MAPS="1" TOTAL_REDUCES="1" JOB_STATUS="PREP" .
> Task TASKID="task_201303181503_0864_m_000002" TASK_TYPE="SETUP"
> START_TIME="1363686587923" SPLITS="" .
> MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002"
> TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0"
> START_TIME="1363686594028"
> TRACKER_NAME="tracker_anna\.ixcloud\.net:localhost/127\.0\.0\.1:34657"
> HTTP_PORT="50060" .
> MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002"
> TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0"
> TASK_STATUS="SUCCESS" FINISH_TIME="1363686595929"
> HOSTNAME="/default/anna\.ixcloud\.net" STATE_STRING="setup"
> COUNTERS="{(org\.apache\.hadoop\.mapreduce\.FileSystemCounter)(File System
> Counters)[(FILE_BYTES_READ)(FILE: Number of bytes
> read)(0)][(FILE_BYTES_WRITTEN)(FILE: Number of bytes
> written)(152299)][(FILE_READ_OPS)(FILE: Number of read
> operations)(0)][(FILE_LARGE_READ_OPS)(FILE: Number of large read
> operations)(0)][(FILE_WRITE_OPS)(FILE: Number of write
> operations)(0)][(HDFS_BYTES_READ)(HDFS: Number of bytes
> read)(0)][(HDFS_BYTES_WRITTEN)(HDFS: Number of bytes
> written)(0)][(HDFS_READ_OPS)(HDFS: Number of read
> operations)(0)][(HDFS_LARGE_READ_OPS)(HDFS: Number of large read
> operations)(0)][(HDFS_WRITE_OPS)(HDFS: Number of write
> operations)(1)]}{(org\.apache\.hadoop\.mapreduce\.TaskCounter)(Map-Reduce
> Framework)[(SPILLED_RECORDS)(Spilled Records)(0)][(CPU_MILLISECONDS)(CPU
> time spent \\(ms\\))(80)][(PHYSICAL_MEMORY_BYTES)(Physical memory
> \\(bytes\\) snapshot)(91693056)][(VIRTUAL_MEMORY_BYTES)(Virtual memory
> \\(bytes\\) snapshot)(575086592)][(COMMITTED_HEAP_BYTES)(Total committed
> heap usage
> \\(bytes\\))(62324736)]}nullnullnullnullnullnullnullnullnullnullnullnullnull"
>
> ...
>
>
> Best Regards,
> Christian.
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>

Re: Parsing the JobTracker Job Logs

Posted by Arun C Murthy <ac...@hortonworks.com>.
Use 'rumen'; it's part of Hadoop.

On Mar 19, 2013, at 3:56 AM, Christian Schneider wrote:

> Hi,
> how to parse the log files for our jobs? Are there already classes I can use?
> 
> I need to display some information on a WebInterface (like the native JobTracker does).
> 
> 
> I am talking about this kind of files:
> 
> michaela 11:52:59 /var/log/hadoop-0.20-mapreduce/history/done/michaela.ixcloud.net_1363615430691_/2013/03/19/000000 # cat job_201303181503_0864_1363686587824_christian_wordCountJob_15
> Meta VERSION="1" .
> Job JOBID="job_201303181503_0864" JOBNAME="wordCountJob_15" USER="christian" SUBMIT_TIME="1363686587824" JOBCONF="hdfs://carolin\.ixcloud\.net:8020/user/christian/\.staging/job_201303181503_0864/job\.xml" VIEW_JOB="*" MODIFY_JOB="*" JOB_QUEUE="default" .
> Job JOBID="job_201303181503_0864" JOB_PRIORITY="NORMAL" .
> Job JOBID="job_201303181503_0864" LAUNCH_TIME="1363686587923" TOTAL_MAPS="1" TOTAL_REDUCES="1" JOB_STATUS="PREP" .
> Task TASKID="task_201303181503_0864_m_000002" TASK_TYPE="SETUP" START_TIME="1363686587923" SPLITS="" .
> MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002" TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0" START_TIME="1363686594028" TRACKER_NAME="tracker_anna\.ixcloud\.net:localhost/127\.0\.0\.1:34657" HTTP_PORT="50060" .
> MapAttempt TASK_TYPE="SETUP" TASKID="task_201303181503_0864_m_000002" TASK_ATTEMPT_ID="attempt_201303181503_0864_m_000002_0" TASK_STATUS="SUCCESS" FINISH_TIME="1363686595929" HOSTNAME="/default/anna\.ixcloud\.net" STATE_STRING="setup" COUNTERS="{(org\.apache\.hadoop\.mapreduce\.FileSystemCounter)(File System Counters)[(FILE_BYTES_READ)(FILE: Number of bytes read)(0)][(FILE_BYTES_WRITTEN)(FILE: Number of bytes written)(152299)][(FILE_READ_OPS)(FILE: Number of read operations)(0)][(FILE_LARGE_READ_OPS)(FILE: Number of large read operations)(0)][(FILE_WRITE_OPS)(FILE: Number of write operations)(0)][(HDFS_BYTES_READ)(HDFS: Number of bytes read)(0)][(HDFS_BYTES_WRITTEN)(HDFS: Number of bytes written)(0)][(HDFS_READ_OPS)(HDFS: Number of read operations)(0)][(HDFS_LARGE_READ_OPS)(HDFS: Number of large read operations)(0)][(HDFS_WRITE_OPS)(HDFS: Number of write operations)(1)]}{(org\.apache\.hadoop\.mapreduce\.TaskCounter)(Map-Reduce Framework)[(SPILLED_RECORDS)(Spilled Records)(0)][(CPU_MILLISECONDS)(CPU time spent \\(ms\\))(80)][(PHYSICAL_MEMORY_BYTES)(Physical memory \\(bytes\\) snapshot)(91693056)][(VIRTUAL_MEMORY_BYTES)(Virtual memory \\(bytes\\) snapshot)(575086592)][(COMMITTED_HEAP_BYTES)(Total committed heap usage \\(bytes\\))(62324736)]}nullnullnullnullnullnullnullnullnullnullnullnullnull" 
> 
> ...
> 
> 
> Best Regards,
> Christian.

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
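[Editorial note: Rumen's TraceBuilder converts these job history files into a JSON job trace. As a hedged sketch of consuming such a trace (the output layout and field names, e.g. "jobID" and "submitTime", vary with the Rumen version, so treat them as placeholders):]

```python
import json

def summarize_trace(lines):
    """Yield (job_id, submit_time) from an iterable of JSON job records.

    Assumes one JSON object per line; real Rumen traces may differ, so
    adapt the reader to the output your version actually produces.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue                  # skip blank lines
        record = json.loads(line)
        yield record.get("jobID"), record.get("submitTime")
```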


