Posted to user@hadoop.apache.org by bharath vissapragada <bh...@gmail.com> on 2012/10/30 02:48:38 UTC

Tools for extracting data from hadoop logs

Hi list,

Are there any tools for parsing and extracting data from Hadoop's job logs? I
want to do things like:

1. Getting the run time of each map/reduce task
2. Counting the map/reduce tasks that ran on a particular node in that job,
and similar stuff

Any suggestions?

Thanks
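[Editor's note: to make the two questions concrete, here is a minimal sketch of that kind of extraction. It assumes a KEY="VALUE" line layout similar to what the old JobTracker history files used; the sample lines and attribute names below are invented for illustration, not the verified on-disk format.]

```python
import re
from collections import Counter

# Hypothetical sample lines in a KEY="VALUE" style; real JobTracker
# history files differ in detail, but the parsing idea is the same.
SAMPLE = '''\
MapAttempt TASKID="task_0001_m_000000" START_TIME="1000" FINISH_TIME="4000" HOSTNAME="node1"
MapAttempt TASKID="task_0001_m_000001" START_TIME="1500" FINISH_TIME="3500" HOSTNAME="node2"
ReduceAttempt TASKID="task_0001_r_000000" START_TIME="4000" FINISH_TIME="9000" HOSTNAME="node1"
'''

PAIR = re.compile(r'(\w+)="([^"]*)"')

def parse(lines):
    """Yield one dict of attributes per non-empty log line."""
    for line in lines:
        if line.strip():
            yield dict(PAIR.findall(line))

def runtimes(records):
    """Map task id -> run time in ms (finish minus start)."""
    return {r["TASKID"]: int(r["FINISH_TIME"]) - int(r["START_TIME"])
            for r in records if "FINISH_TIME" in r}

def tasks_per_node(records):
    """Count how many task attempts ran on each host."""
    return Counter(r["HOSTNAME"] for r in records if "HOSTNAME" in r)

records = list(parse(SAMPLE.splitlines()))
print(runtimes(records))        # run time per task
print(tasks_per_node(records))  # attempts per node
```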

Re: Tools for extracting data from hadoop logs

Posted by Raj Vishwanathan <ra...@yahoo.com>.
Take a look at 

https://github.com/rajvish/hadoop-summary




>________________________________
> From: bharath vissapragada <bh...@gmail.com>
>To: user@hadoop.apache.org 
>Sent: Monday, October 29, 2012 10:03 PM
>Subject: Re: Tools for extracting data from hadoop logs
> 
>
>Hi Binglin,
>
>
>Great scripts! Thanks for sharing :D
>
>
>Regards,
>
>
>On Tue, Oct 30, 2012 at 8:54 AM, Binglin Chang <de...@gmail.com> wrote:
>
>Hi,
>>
>>
>>I think you want to analyze the Hadoop job logs in the jobtracker history folder? These logs are in a centralized folder, so you don't need tools like Flume or Scribe to gather them.
>>I once wrote a simple Python script to parse those log files and generate CSV/JSON reports. You can use it to get the execution time, counters, and status of jobs, tasks, and attempts, and you can modify it to meet your needs.
>>
>>
>>Thanks,
>>Binglin
>>
>>
>>
>>
>>On Tue, Oct 30, 2012 at 9:48 AM, bharath vissapragada <bh...@gmail.com> wrote:
>>
>>Hi list,
>>>
>>>
>>>Are there any tools for parsing and extracting data from Hadoop's job logs? I want to do things like:
>>>
>>>
>>>1. Getting the run time of each map/reduce task
>>>2. Counting the map/reduce tasks that ran on a particular node in that job, and similar stuff
>>>
>>>
>>>Any suggestions?
>>>
>>>
>>>
>>>Thanks
>>
>
>
>
>-- 
>Regards,
>Bharath .V
>w:http://researchweb.iiit.ac.in/~bharath.v
>
>
>

Re: Tools for extracting data from hadoop logs

Posted by bharath vissapragada <bh...@gmail.com>.
Hi Binglin,

Great scripts! Thanks for sharing :D

Regards,

On Tue, Oct 30, 2012 at 8:54 AM, Binglin Chang <de...@gmail.com> wrote:

> Hi,
>
> I think you want to analyze the Hadoop job logs in the jobtracker history
> folder? These logs are in a centralized folder, so you don't need tools like
> Flume or Scribe to gather them.
> I once wrote a simple Python script to parse those log files and generate
> CSV/JSON reports. You can use it to get the execution time, counters, and
> status of jobs, tasks, and attempts, and you can modify it to meet your
> needs.
>
> Thanks,
> Binglin
>
>
> On Tue, Oct 30, 2012 at 9:48 AM, bharath vissapragada <
> bharathvissapragada1990@gmail.com> wrote:
>
>> Hi list,
>>
>> Are there any tools for parsing and extracting data from Hadoop's job logs?
>> I want to do things like:
>>
>> 1. Getting the run time of each map/reduce task
>> 2. Counting the map/reduce tasks that ran on a particular node in that job,
>> and similar stuff
>>
>> Any suggestions?
>>
>> Thanks
>>
>
>


-- 
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v

Re: Tools for extracting data from hadoop logs

Posted by Manoj Babu <ma...@gmail.com>.
Very useful, thanks Binglin for sharing it!

Cheers!
Manoj.



On Tue, Oct 30, 2012 at 8:54 AM, Binglin Chang <de...@gmail.com> wrote:

> Hi,
>
> I think you want to analyze the Hadoop job logs in the jobtracker history
> folder? These logs are in a centralized folder, so you don't need tools like
> Flume or Scribe to gather them.
> I once wrote a simple Python script to parse those log files and generate
> CSV/JSON reports. You can use it to get the execution time, counters, and
> status of jobs, tasks, and attempts, and you can modify it to meet your
> needs.
>
> Thanks,
> Binglin
>
>
> On Tue, Oct 30, 2012 at 9:48 AM, bharath vissapragada <
> bharathvissapragada1990@gmail.com> wrote:
>
>> Hi list,
>>
>> Are there any tools for parsing and extracting data from Hadoop's job logs?
>> I want to do things like:
>>
>> 1. Getting the run time of each map/reduce task
>> 2. Counting the map/reduce tasks that ran on a particular node in that job,
>> and similar stuff
>>
>> Any suggestions?
>>
>> Thanks
>>
>
>

Re: Tools for extracting data from hadoop logs

Posted by Binglin Chang <de...@gmail.com>.
Hi,

I think you want to analyze the Hadoop job logs in the jobtracker history
folder? These logs are in a centralized folder, so you don't need tools like
Flume or Scribe to gather them.
I once wrote a simple Python script to parse those log files and generate
CSV/JSON reports. You can use it to get the execution time, counters, and
status of jobs, tasks, and attempts, and you can modify it to meet your needs.

Thanks,
Binglin


On Tue, Oct 30, 2012 at 9:48 AM, bharath vissapragada <
bharathvissapragada1990@gmail.com> wrote:

> Hi list,
>
> Are there any tools for parsing and extracting data from Hadoop's job logs?
> I want to do things like:
>
> 1. Getting the run time of each map/reduce task
> 2. Counting the map/reduce tasks that ran on a particular node in that job,
> and similar stuff
>
> Any suggestions?
>
> Thanks
>
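[Editor's note: Binglin's parse-then-report approach can be sketched roughly as follows. The per-attempt record fields here are invented for illustration; his actual script will differ.]

```python
import csv
import io
import json

# Hypothetical per-attempt records, as a history-log parser might produce.
attempts = [
    {"task": "task_0001_m_000000", "type": "MAP", "status": "SUCCESS", "runtime_ms": 3000},
    {"task": "task_0001_r_000000", "type": "REDUCE", "status": "SUCCESS", "runtime_ms": 5000},
]

def to_csv(records):
    """Render records as CSV text, one row per attempt."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["task", "type", "status", "runtime_ms"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def to_json(records):
    """Render the same records as a JSON report."""
    return json.dumps({"attempts": records}, indent=2)

print(to_csv(attempts))
print(to_json(attempts))
```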

Re: Tools for extracting data from hadoop logs

Posted by anand sharma <an...@gmail.com>.
Hi Bharath, Apache Flume is one option, and Scribe from Facebook is another if
you want to take a look; there are other log aggregation tools too.

On Tue, Oct 30, 2012 at 7:18 AM, bharath vissapragada <
bharathvissapragada1990@gmail.com> wrote:

> Hi list,
>
> Are there any tools for parsing and extracting data from Hadoop's job logs?
> I want to do things like:
>
> 1. Getting the run time of each map/reduce task
> 2. Counting the map/reduce tasks that ran on a particular node in that job,
> and similar stuff
>
> Any suggestions?
>
> Thanks
>
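[Editor's note: as a point of contrast with full pipelines like Flume or Scribe, the gathering step itself is conceptually just "pull every node's log files into one place". A toy sketch, with made-up directory names; real aggregation tools stream continuously and handle failures, which this does not.]

```python
import shutil
import tempfile
from pathlib import Path

def gather_logs(node_dirs, dest):
    """Copy every .log file from each node's log directory into dest,
    prefixing file names with the node name so nothing collides."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for node, d in node_dirs.items():
        for f in Path(d).glob("*.log"):
            shutil.copy(f, dest / f"{node}-{f.name}")
            copied.append(f"{node}-{f.name}")
    return sorted(copied)

# Demo with temporary directories standing in for per-node log dirs.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    for node in ("node1", "node2"):
        (tmp / node).mkdir()
        (tmp / node / "task.log").write_text(f"logs from {node}\n")
    names = gather_logs({n: tmp / n for n in ("node1", "node2")}, tmp / "all")
    print(names)  # ['node1-task.log', 'node2-task.log']
```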
