
Posted to common-user@hadoop.apache.org by jiang licht <li...@yahoo.com> on 2010/03/18 23:09:45 UTC

performance analysis?

To find the bottleneck, I tried to figure out whether some processes/threads are often blocked waiting on disk or network I/O, and why, when a mapper or reducer runs slowly. In my case, up to 12 mappers are allowed to run simultaneously on each slave. The CPUs are idle more than 90% of the time, with at most about 2% in iowait. But I found that most mappers (per "top" and "jps") were sleeping, and strace shows that they (as well as the tasktracker and datanode) were blocked on futex(0x4035b9d0, FUTEX_WAIT, 12566, NULL,
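
For reference, the idle and iowait percentages that top reports can also be derived by sampling the aggregate "cpu" line of /proc/stat twice and comparing the counters. The following is only a minimal sketch, assuming Linux and Python 3; the function names are hypothetical and chosen for illustration.

```python
import time


def parse_cpu_line(line):
    """Return the first eight counters from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Order: user nice system idle iowait irq softirq steal.
    # Guest time is already included in user, so it is left out of the total.
    return [int(value) for value in fields[1:9]]


def idle_and_iowait_percent(before, after):
    """Compute (idle %, iowait %) between two samples of the 'cpu' line."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return 100.0 * deltas[3] / total, 100.0 * deltas[4] / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of /proc/stat)."""
    with open(path) as stat_file:
        return stat_file.readline()


if __name__ == "__main__":
    sample_a = read_cpu_line()
    time.sleep(5)  # sampling interval; an arbitrary choice for this sketch
    sample_b = read_cpu_line()
    idle, iowait = idle_and_iowait_percent(sample_a, sample_b)
    print("idle %.1f%%, iowait %.1f%%" % (idle, iowait))
```

Run on a slave while the job is active, this gives node-wide numbers comparable to the ones quoted above.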

Here's a list of accumulated open files (including network, pipe, socket, etc.) of the datanode, grouped by type:

IPv6 15
unix 1
DIR 2
CHR 4
0000 17
REG 122
sock 1
FIFO 34

Here's a list of accumulated open files (including network, pipe, socket, etc.) of the tasktracker, grouped by type:

IPv6 24
unix 1
DIR 2
CHR 4
0000 4
REG 105
sock 1
FIFO 50

Here's a typical mapper thread:

IPv6 2
unix 1
0000 1
DIR 4
sock 1
FIFO 2
CHR 6
REG 106
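
Per-type counts like the ones listed above can be gathered from lsof's field output, where each file-type line starts with "t". This is a rough sketch rather than the exact commands used for the lists in this message; it assumes lsof is installed and Python 3.7 or later, and the function names are hypothetical.

```python
import subprocess
from collections import Counter


def count_types(field_output):
    """Count file types in `lsof -F t` output (type lines look like 'tREG')."""
    counts = Counter()
    for line in field_output.splitlines():
        if line.startswith("t") and len(line) > 1:
            counts[line[1:]] += 1
    return counts


def open_file_types(pid):
    """Run lsof for one process and group its open files by type."""
    result = subprocess.run(
        ["lsof", "-p", str(pid), "-F", "t"],
        capture_output=True,
        text=True,
        check=False,  # lsof may exit non-zero even when it prints usable output
    )
    return count_types(result.stdout)


if __name__ == "__main__":
    import sys

    for file_type, count in open_file_types(int(sys.argv[1])).most_common():
        print(file_type, count)
```

The PID of the datanode, tasktracker, or a mapper can be taken from jps and passed as the only argument.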

A mapper would block on futex for about a minute or so. It seems to me that the various I/O paths cannot keep up with the CPU. Would it help to increase some buffer parameters to handle this? Or do these stats imply something else? By the way, what is an effective way to analyze the performance of a Hadoop cluster, and are there good tools for it? Any recommendations?
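
One way to test the I/O hypothesis is to look at the scheduler state of each thread in a mapper. On Linux, a thread waiting in futex shows up in interruptible sleep (state S), while a thread blocked on disk I/O is typically in uninterruptible sleep (state D). The sketch below counts threads by state from /proc; it assumes Linux and Python 3, and the function names are hypothetical.

```python
import os
from collections import Counter


def parse_state(stat_text):
    """Extract the one-letter state from a /proc/<pid>/task/<tid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split only after the last ')'.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    return rest[0]


def thread_states(pid):
    """Count the threads of a process by scheduler state (R, S, D, ...)."""
    counts = Counter()
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as stat_file:
                counts[parse_state(stat_file.read())] += 1
        except OSError:
            # The thread exited between listdir() and open(); skip it.
            continue
    return counts


if __name__ == "__main__":
    import sys

    print(dict(thread_states(int(sys.argv[1]))))
```

If nearly all threads are in S and almost none in D, that would fit the low iowait figure and suggest the threads are waiting on locks or on other threads rather than directly on disk.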

Thanks,

Michael

Re: performance analysis?

Posted by jiang licht <li...@yahoo.com>.

Thanks, Ninad. This really helps.

Best regards,

Michael

--- On Fri, 3/19/10, Ninad Raut <hb...@gmail.com> wrote:

From: Ninad Raut <hb...@gmail.com>
Subject: Re: performance analysis?
To: common-user@hadoop.apache.org
Date: Friday, March 19, 2010, 12:02 AM

The best and easiest tool to configure is Ganglia. Hadoop has built-in support
for Ganglia. Check out the YDN Ganglia setup steps and you will be able to
monitor your CPU and MapReduce jobs as well.

To monitor network-related aspects, you can check out Nagios.

Regards,
Ninad R


Re: performance analysis?

Posted by Ninad Raut <hb...@gmail.com>.

The best and easiest tool to configure is Ganglia. Hadoop has built-in support
for Ganglia. Check out the YDN Ganglia setup steps and you will be able to
monitor your CPU and MapReduce jobs as well.

To monitor network-related aspects, you can check out Nagios.

Regards,
Ninad R
