Posted to user@hadoop.apache.org by Jean-Marc Spaggiari <je...@spaggiari.org> on 2012/11/21 19:55:18 UTC

MapReduce logs

Hi,

When we run a MapReduce job, the logs are stored on all the tasktracker nodes.

Is there an easy way to aggregate all those logs together and see them
in a single place, instead of going to the tasks one by one and opening
the files?

Thanks,

JM

Re: MapReduce logs

Posted by bharath vissapragada <bh...@gmail.com>.
Hi,

I'm not sure whether my answer applies in your case, but I'm sharing it
because I found it interesting!

I saw a cluster where the Hadoop installation directory is on an NFS
share mounted at the same path on all the slaves. This has two
advantages:

1) Logs are all written to the same NFS share, so there is no need for
aggregation.
2) Hadoop upgrades become easy, since we only need to update the
tarball in one location; maintenance in general becomes simpler.

But make sure you configure the data directories to point at the
slaves' local disks, otherwise they end up writing everything to NFS!
The relevant settings are sketched below.
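
For illustration, this is roughly what that split might look like on a
Hadoop 1.x cluster (the paths here are made up, adjust them to your
layout). In hadoop-env.sh, point the logs at the NFS mount:

# hadoop-env.sh: write daemon and task logs to the shared NFS mount
export HADOOP_LOG_DIR=/mnt/nfs/hadoop/logs

and in hdfs-site.xml / mapred-site.xml, keep the heavy I/O on the local
disks:

<!-- hdfs-site.xml: DataNode block storage on local disks, not NFS -->
<property>
  <name>dfs.data.dir</name>
  <value>/data/1/dfs/data,/data/2/dfs/data</value>
</property>

<!-- mapred-site.xml: intermediate map output stays local too -->
<property>
  <name>mapred.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local</value>
</property>

The log file names include the hostname, so the slaves don't overwrite
each other's logs in the shared directory.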


Thanks,

On Thu, Nov 22, 2012 at 1:01 AM, Dino Kečo <di...@gmail.com> wrote:

> Hi,
>
> We had a similar requirement, so we built a small Java application
> which gets information about the task nodes from the JobTracker and
> downloads the logs into one file using the URLs of each TaskTracker.
>
> For huge logs this becomes slow and time-consuming.
>
> Hope this helps.
>
> Regards,
> Dino Kečo
> msn: xdinno@hotmail.com
> mail: dino.keco@gmail.com
> skype: dino.keco
> phone: +387 61 507 851
>
>
> On Wed, Nov 21, 2012 at 7:55 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
>> Hi,
>>
>> When we run a MapReduce job, the logs are stored on all the tasktracker
>> nodes.
>>
>> Is there an easy way to aggregate all those logs together and see
>> them in a single place, instead of going to the tasks one by one and
>> opening the files?
>>
>> Thanks,
>>
>> JM
>>
>
>


-- 
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v

Re: MapReduce logs

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Thanks for the info.

I quickly drafted this bash script in case it can help someone...
You just need to make sure the IP inside is replaced with your
JobTracker's address. To call it, give it the URL of the job's task
page:

./showLogs.sh "http://192.168.23.7:50030/jobtasks.jsp?jobid=job_201211211408_0001&type=map&pagenum=1"

Then you can redirect the output, or do whatever you want with it.

I was wondering if there was a "nicer" solution...

:~/test$ cat showLogs.sh
#!/bin/bash
# Dumps the task logs of a job to stdout, given the URL of its
# jobtasks.jsp page. Replace JT with your own JobTracker address.
JT="http://192.168.23.7:50030"

# Fetch the task list page and extract the taskdetails.jsp links.
rm -f tasks.html
wget --quiet --output-document tasks.html "$1"
for i in `grep taskdetails tasks.html | cut -d'"' -f2 | grep taskdetails`; do
	# Fetch each task's detail page and extract the "all=true" log URLs.
	rm -f tasksdetails.html
	wget --quiet --output-document tasksdetails.html "$JT/$i"
	for j in `grep "all=true" tasksdetails.html | cut -d'"' -f6`; do
		# Print a separator banner with the log URL.
		printf "*%.0s" {1..80}
		echo
		echo "$j"
		printf "*%.0s" {1..80}
		echo
		# Download the log page and strip the surrounding HTML
		# (the first 30 lines and the last 2).
		rm -f logs.txt
		wget --quiet --output-document logs.txt "$j"
		tail -n +31 logs.txt | head -n -2
	done
done
rm -f tasks.html tasksdetails.html logs.txt
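
For example, to capture all the map task logs of the job into a single
file (the file name here is just illustrative):

./showLogs.sh "http://192.168.23.7:50030/jobtasks.jsp?jobid=job_201211211408_0001&type=map&pagenum=1" > map_logs.txt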


2012/11/21, Dino Kečo <di...@gmail.com>:
> Hi,
>
> We had a similar requirement, so we built a small Java application
> which gets information about the task nodes from the JobTracker and
> downloads the logs into one file using the URLs of each TaskTracker.
>
> For huge logs this becomes slow and time-consuming.
>
> Hope this helps.
>
> Regards,
> Dino Kečo
> msn: xdinno@hotmail.com
> mail: dino.keco@gmail.com
> skype: dino.keco
> phone: +387 61 507 851
>
>
> On Wed, Nov 21, 2012 at 7:55 PM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
>> Hi,
>>
>> When we run a MapReduce job, the logs are stored on all the tasktracker
>> nodes.
>>
>> Is there an easy way to aggregate all those logs together and see
>> them in a single place, instead of going to the tasks one by one and
>> opening the files?
>>
>> Thanks,
>>
>> JM
>>
>

Re: MapReduce logs

Posted by Dino Kečo <di...@gmail.com>.
Hi,

We had a similar requirement, so we built a small Java application
which gets information about the task nodes from the JobTracker and
downloads the logs into one file using the URLs of each TaskTracker.

For huge logs this becomes slow and time-consuming.
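
The idea, roughly, in shell form (a hypothetical sketch rather than our
actual code; tasktrackers.txt stands in for the "host attempt_id" pairs
we collect from the JobTracker, and 50060 is the default TaskTracker
HTTP port):

# For each task attempt, fetch its complete log from the TaskTracker's
# tasklog servlet and append everything to one file.
while read host attempt; do
  wget --quiet -O - "http://$host:50060/tasklog?attemptid=$attempt&all=true"
done < tasktrackers.txt > all_task_logs.html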

Hope this helps.

Regards,
Dino Kečo
msn: xdinno@hotmail.com
mail: dino.keco@gmail.com
skype: dino.keco
phone: +387 61 507 851


On Wed, Nov 21, 2012 at 7:55 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi,
>
> When we run a MapReduce job, the logs are stored on all the tasktracker
> nodes.
>
> Is there an easy way to aggregate all those logs together and see
> them in a single place, instead of going to the tasks one by one and
> opening the files?
>
> Thanks,
>
> JM
>
