Posted to mapreduce-user@hadoop.apache.org by Pedro Costa <ps...@gmail.com> on 2011/02/12 17:04:53 UTC
Map tasks execution
Hi,
1 - When a Map task is taking too long to finish, the JT launches
another Map task to process the same input. Does this mean that the
replaced task is killed?
2 - Does Hadoop MR allow the same input split to be processed by 2
different mappers at the same time?
Thanks,
--
Pedro
Re: Map tasks execution
Posted by Harsh J <qw...@gmail.com>.
Hello,
On Sat, Feb 12, 2011 at 9:34 PM, Pedro Costa <ps...@gmail.com> wrote:
> Hi,
>
> 1 - When a Map task is taking too long to finish, the JT launches
> another Map task to process the same input. Does this mean that the
> replaced task is killed?
If a task times out, it is killed and rescheduled. If you're noticing
this in the final waves, it could be the speculative execution feature
of Hadoop MapReduce, which is enabled by default.
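For reference, speculative execution can be toggled per job or cluster-wide. A sketch of the relevant settings, assuming the property names used in the Hadoop 0.20/1.x line current at the time of this thread:

```xml
<!-- mapred-site.xml (or per-job configuration), Hadoop 0.20/1.x-era names -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>true</value> <!-- default is true; set false to disable for maps -->
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>true</value> <!-- default is true; set false to disable for reduces -->
</property>
```

The same keys can be set programmatically on the job's configuration object before submission.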
> 2 - Does Hadoop MR allow the same input split to be processed by 2
> different mappers at the same time?
In some ways, yes.
The speculative execution feature does exactly this: two tasks may be
computing the same split in a race, and whichever reports completion
first wins. See the 'Speculative execution' sub-topic of this YDN
Hadoop modules page for details:
http://developer.yahoo.com/hadoop/tutorial/module4.html#tolerence
It should also be possible to duplicate the input splits/paths in
order to do this, although, again, processing at the same time is not
guaranteed.
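As a sketch of the duplicated-input approach: listing the same directory twice in the job's input causes each split to be generated, and therefore processed, twice. The property name below is from the old (0.20-era) API, and the path is purely hypothetical; the equivalent is calling FileInputFormat.addInputPath() twice with the same path in the driver:

```xml
<!-- job configuration sketch: the same input path listed twice -->
<property>
  <name>mapred.input.dir</name>
  <value>/user/pedro/input,/user/pedro/input</value>
</property>
```

Note that the two copies of each split are independent tasks with no ordering between them, so they may or may not run concurrently.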
--
Harsh J
www.harshj.com