Posted to common-user@hadoop.apache.org by Alexander Mont <al...@comcast.net> on 2008/02/26 05:16:39 UTC
Profiling in Hadoop
I am interested in examining a MapReduce execution in order to determine
the amount of time it takes to execute each of the following parts of a
MapReduce job:
- Loading of data onto mappers
- Executing map operation
- Sorting/partitioning of map output
- Loading map output onto reducers
- Executing reduce operation
- Writing output of reducers to data files
Does Hadoop have a built-in mechanism to profile a MapReduce application
in this way, and if not, have other mechanisms to do this been developed?
-Alex Mont
Re: Profiling in Hadoop
Posted by Martin Traverso <mt...@gmail.com>.
On Mon, Feb 25, 2008 at 9:26 PM, Amar Kamat <am...@yahoo-inc.com> wrote:
> I have used a different profiler but had to modify the Hadoop code. Not
> sure if there is a way to pass an -agentlib param to the child JVMs.
I filed this earlier today because I ran into the same issue:
https://issues.apache.org/jira/browse/HADOOP-2895
I have a patch for it, which I'll upload shortly (after fixing code
formatting and style issues).
Martin
Re: Profiling in Hadoop
Posted by Amar Kamat <am...@yahoo-inc.com>.
You can turn on the default profiler using the mapred.task.profile,
mapred.task.profile.maps and mapred.task.profile.reduces properties. I have
used a different profiler, but I had to modify the Hadoop code to do so. I'm
not sure whether there is a way to pass an -agentlib param to the child JVMs.
Amar
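
For illustration, the three properties named above could be written out as a
Hadoop configuration fragment. This is a minimal sketch, not Hadoop code: the
class ProfilingConfigSketch and its methods are hypothetical, and the "0-2"
task ranges are assumed example values, so check the documentation for your
Hadoop version before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Hadoop): renders profiling settings as configuration XML. */
final class ProfilingConfigSketch {

    /** Collects the three properties named in the thread; the range values are assumptions. */
    static Map<String, String> profilingProperties(String mapRange, String reduceRange) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.task.profile", "true");              // switch profiling on
        props.put("mapred.task.profile.maps", mapRange);       // which map tasks to profile
        props.put("mapred.task.profile.reduces", reduceRange); // which reduce tasks to profile
        return props;
    }

    /** Renders name/value pairs as <property> elements; values are assumed to need no XML escaping. */
    static String toConfigXml(Map<String, String> props) {
        StringBuilder xml = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            xml.append("  <property>\n")
               .append("    <name>").append(e.getKey()).append("</name>\n")
               .append("    <value>").append(e.getValue()).append("</value>\n")
               .append("  </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    public static void main(String[] args) {
        // Prints a fragment that could be merged into a job configuration file.
        System.out.print(toConfigXml(profilingProperties("0-2", "0-2")));
    }
}
```

Restricting the ranges to a few tasks keeps the profiling overhead and the
amount of profiler output small on a large job.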
Re: Profiling in Hadoop
Posted by Tom White <to...@gmail.com>.
Have a look at the mapred.task.profile property - this should help you
get some of the information you need.
Tom
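
Separately from the built-in profiler, wall-clock time for the steps that run
inside your own code (for example the body of a map or reduce call) can be
accumulated by hand. The sketch below is one way to do that, under stated
assumptions: PhaseTimerSketch is a hypothetical class, not a Hadoop API, and
it cannot see framework-internal steps such as sorting or copying map output
to reducers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical helper (not part of Hadoop): accumulates elapsed time per named phase. */
final class PhaseTimerSketch {
    private final LongSupplier nanoClock;
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    /** The clock is injectable so the timer can be exercised deterministically. */
    PhaseTimerSketch(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Runs the work and adds its duration to the running total for the phase. */
    void time(String phase, Runnable work) {
        long start = nanoClock.getAsLong();
        try {
            work.run();
        } finally {
            // Record the time even if the work throws.
            elapsedNanos.merge(phase, nanoClock.getAsLong() - start, Long::sum);
        }
    }

    /** Totals per phase, in the order the phases were first seen. */
    Map<String, Long> elapsedNanos() {
        return elapsedNanos;
    }

    public static void main(String[] args) {
        PhaseTimerSketch timer = new PhaseTimerSketch(System::nanoTime);
        timer.time("map", () -> { /* user map logic would go here */ });
        timer.time("reduce", () -> { /* user reduce logic would go here */ });
        timer.elapsedNanos().forEach(
            (phase, nanos) -> System.out.println(phase + ": " + nanos + " ns"));
    }
}
```

The totals would still have to be reported somewhere, such as the task logs,
to be compared across tasks.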