Posted to mapreduce-dev@hadoop.apache.org by Keren Ouaknine <ke...@gmail.com> on 2011/06/26 14:16:09 UTC

mappers

Hello,

I am looking for the actual number of mappers on each machine for the job. I
know how to configure the max number ("mapred.tasktracker.map.tasks.maximum"
in mapred-site.xml file), but not the actual number of mappers that were
running for a completed job.

Any idea where I can find this data?
Thanks,
Keren

-- 
Keren Ouaknine
Cell: +972 54 2565404
Web: www.kereno.com

Re: mappers

Posted by Josh Wills <jo...@gmail.com>.
(redirected to mapreduce-user@, mapreduce-dev bcc'd)

The param you're referring to controls the maximum number of
simultaneously active mappers on a given task tracker, i.e., how many
map slots are available on that node. But a single task tracker can be
used for multiple MR jobs, so you can't look at the metrics for the
task tracker to see how many mappers ran for a given job. For a single
job, the total number of mappers that are run == the number of input
splits.

Hoping that anyone who knows this stuff better than I do will reply to
correct any mistakes in my answer,
Josh
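
To make the "mappers == input splits" rule concrete, here is a rough
sketch that estimates the mapper count for a set of input files. It
assumes a simplified model in which every non-empty file is cut into
fixed-size splits; the function name, the file sizes, and the 64 MB
split size are hypothetical, and a real InputFormat may apply extra
rules (for example, for files that cannot be split).

```python
import math

MB = 1024 * 1024

def estimate_num_splits(file_sizes, split_size):
    """Estimate the number of map tasks under a simplified model:
    each non-empty file contributes ceil(size / split_size) splits.
    This is an illustration, not the exact logic of any InputFormat."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    return sum(math.ceil(size / split_size) for size in file_sizes if size > 0)

# Hypothetical input: three files of 200 MB, 64 MB and 10 MB, 64 MB splits.
# 200 MB -> 4 splits, 64 MB -> 1 split, 10 MB -> 1 split.
print(estimate_num_splits([200 * MB, 64 * MB, 10 * MB], 64 * MB))  # prints 6
```

Note that this gives the total for the job, not the per-machine
breakdown asked about in the original question.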

On Sun, Jun 26, 2011 at 5:16 AM, Keren Ouaknine <ke...@gmail.com> wrote:
> Hello,
>
> I am looking for the actual number of mappers on each machine for the job. I
> know how to configure the max number ("mapred.tasktracker.map.tasks.maximum"
> in mapred-site.xml file), but not the actual number of mappers that were
> running for a completed job.
>
> Any idea where can I find this data?
> Thanks,
> Keren
>
> --
> Keren Ouaknine
> Cell: +972 54 2565404
> Web: www.kereno.com
>

Re: mappers

Posted by Owen O'Malley <ow...@gmail.com>.
Look in the job history file. It has a line for each event of the job
including task start and finish.

-- Owen
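
As a rough sketch of how the per-machine count could be pulled out of a
job history file, the snippet below tallies successful map attempts by
host. The line layout is an assumption: it supposes one event per line,
starting with an event type such as MapAttempt and followed by
KEY="value" pairs including TASK_STATUS and HOSTNAME. The sample lines,
IDs, and host names are made up for illustration, so check them against
the history file your Hadoop version actually writes.

```python
import re
from collections import Counter

# Matches KEY="value" pairs; this layout is an assumption about the file format.
_PAIR = re.compile(r'(\w+)="([^"]*)"')

def successful_maps_per_host(lines):
    """Count successful map attempts per host from history-style lines."""
    counts = Counter()
    for line in lines:
        # Only map attempt events are of interest here.
        if not line.startswith("MapAttempt "):
            continue
        fields = dict(_PAIR.findall(line))
        # Start events carry no TASK_STATUS; failed attempts are skipped.
        if fields.get("TASK_STATUS") == "SUCCESS":
            counts[fields.get("HOSTNAME", "unknown")] += 1
    return dict(counts)

# Hypothetical sample lines (not copied from a real history file).
SAMPLE_HISTORY = [
    'MapAttempt TASK_TYPE="MAP" TASKID="task_0001_m_000000" TASK_ATTEMPT_ID="attempt_0001_m_000000_0" START_TIME="1000" .',
    'MapAttempt TASK_TYPE="MAP" TASKID="task_0001_m_000000" TASK_ATTEMPT_ID="attempt_0001_m_000000_0" TASK_STATUS="SUCCESS" FINISH_TIME="2000" HOSTNAME="node1" .',
    'MapAttempt TASK_TYPE="MAP" TASKID="task_0001_m_000001" TASK_ATTEMPT_ID="attempt_0001_m_000001_0" TASK_STATUS="SUCCESS" FINISH_TIME="2100" HOSTNAME="node1" .',
    'MapAttempt TASK_TYPE="MAP" TASKID="task_0001_m_000002" TASK_ATTEMPT_ID="attempt_0001_m_000002_0" TASK_STATUS="FAILED" FINISH_TIME="2200" HOSTNAME="node2" .',
    'MapAttempt TASK_TYPE="MAP" TASKID="task_0001_m_000002" TASK_ATTEMPT_ID="attempt_0001_m_000002_1" TASK_STATUS="SUCCESS" FINISH_TIME="2300" HOSTNAME="node2" .',
    'ReduceAttempt TASK_TYPE="REDUCE" TASKID="task_0001_r_000000" TASK_ATTEMPT_ID="attempt_0001_r_000000_0" TASK_STATUS="SUCCESS" FINISH_TIME="3000" HOSTNAME="node1" .',
]

print(successful_maps_per_host(SAMPLE_HISTORY))
```

To use it on a real file, pass an open file object instead of the
sample list, since iterating over a file also yields one line at a
time.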

On Jun 26, 2011, at 2:17 AM, Keren Ouaknine <ke...@gmail.com> wrote:

> Hello,
>
> I am looking for the actual number of mappers on each machine for the job. I
> know how to configure the max number ("mapred.tasktracker.map.tasks.maximum"
> in mapred-site.xml file), but not the actual number of mappers that were
> running for a completed job.
>
> Any idea where can I find this data?
> Thanks,
> Keren
>
> --
> Keren Ouaknine
> Cell: +972 54 2565404
> Web: www.kereno.com