Posted to common-user@hadoop.apache.org by Alexander Mont <al...@comcast.net> on 2008/02/26 05:16:39 UTC

Profiling in Hadoop

I am interested in examining a MapReduce execution in order to determine 
the amount of time it takes to execute each of the following parts of a 
MapReduce job:

- Loading of data onto mappers
- Executing map operation
- Sorting/partitioning of map output
- Loading map output onto reducers
- Executing reduce operation
- Writing output of reducers to data files

Does Hadoop have a built-in mechanism to profile a MapReduce application 
in this way, and if not, have other mechanisms to do this been developed?

-Alex Mont

Re: Profiling in Hadoop

Posted by Martin Traverso <mt...@gmail.com>.
On Mon, Feb 25, 2008 at 9:26 PM, Amar Kamat <am...@yahoo-inc.com> wrote:

> a different profiler but had to modify the Hadoop code. Not sure if there
> is a way to pass -agentlib param to the child jvms.


I filed this earlier today because I ran into the same issue:

https://issues.apache.org/jira/browse/HADOOP-2895

I have a patch for it, which I'll upload shortly (after fixing code
formatting and style issues).

Martin

Re: Profiling in Hadoop

Posted by Amar Kamat <am...@yahoo-inc.com>.
You can turn on the default profiler using the mapred.task.profile,
mapred.task.profile.maps and mapred.task.profile.reduces properties. I have
used a different profiler, but had to modify the Hadoop code to do so. I am
not sure whether there is a way to pass an -agentlib parameter to the child
JVMs.
Amar
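
For reference, here is a minimal sketch of how the three properties named
above might be written out as Hadoop-style <property> entries for a job
configuration. The class and method names (ProfileConfig, profileProperties,
toXml) are hypothetical, and the "0-2" task-range values are an assumption
about the expected format, not something confirmed in this thread. It uses
only the JDK, so it shows the shape of the configuration rather than how
Hadoop itself loads it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the profiling properties mentioned in this
// thread and renders them as Hadoop-style <property> XML entries.
class ProfileConfig {

    // Builds the three properties. The range strings (e.g. "0-2") are an
    // assumed format for selecting which map/reduce tasks get profiled.
    static Map<String, String> profileProperties(String mapRange, String reduceRange) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.task.profile", "true");
        props.put("mapred.task.profile.maps", mapRange);
        props.put("mapred.task.profile.reduces", reduceRange);
        return props;
    }

    // Renders each entry as a <property> element with <name> and <value>.
    static String toXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("<property>\n")
              .append("  <name>").append(e.getKey()).append("</name>\n")
              .append("  <value>").append(e.getValue()).append("</value>\n")
              .append("</property>\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print the entries for pasting into a job configuration file.
        System.out.print(toXml(profileProperties("0-2", "0-2")));
    }
}
```

Restricting profiling to a small range of tasks keeps the overhead and the
volume of profiler output manageable; widen the ranges only if the first few
tasks are not representative.
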

Re: Profiling in Hadoop

Posted by Tom White <to...@gmail.com>.
Have a look at the mapred.task.profile property - this should help you
get some of the information you need.

Tom
