Posted to mapreduce-issues@hadoop.apache.org by "Eric Payne (Updated) (JIRA)" <ji...@apache.org> on 2012/01/17 17:21:40 UTC

[jira] [Updated] (MAPREDUCE-3403) Speculative Execution: Kill a reducer when queue is full and a mapper

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated MAPREDUCE-3403:
----------------------------------

    Issue Type: New Feature  (was: Bug)
       Summary: Speculative Execution: Kill a reducer when queue is full and a mapper   (was: Speculative Execution: enabling multiple reduce tasks inhibit spec exec launch of mappers )

We see this behavior because all of the queue's containers are in use at the time of the speculation. The speculative map attempt is requested, but as long as every container is occupied, it is never scheduled.

For example:
|Requested Tasks|Used Containers|Completed Tasks|
|Map0 prime|Map0 (long running)||
||Reduce 0|Map1, Map2|
||Reduce 1|Map3|
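
The situation in the table can be reproduced with a toy model. The sketch below is not Hadoop code: the class FullQueueModel, its methods, and the capacity of 3 containers are assumptions made for illustration only. It shows that a request made while every container is in use stays pending until some running task completes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of a queue with a fixed number of containers (hypothetical, not a Hadoop class).
class FullQueueModel {
    private final int capacity;                              // assumed container limit of the queue
    private final List<String> running = new ArrayList<>();  // tasks holding a container
    private final Deque<String> pending = new ArrayDeque<>(); // requested but not yet scheduled

    FullQueueModel(int capacity) {
        this.capacity = capacity;
    }

    // A request is queued and scheduled only if a container is free.
    void request(String task) {
        pending.addLast(task);
        schedule();
    }

    // Completing a running task frees its container for the next pending request.
    void complete(String task) {
        if (running.remove(task)) {
            schedule();
        }
    }

    private void schedule() {
        while (running.size() < capacity && !pending.isEmpty()) {
            running.add(pending.removeFirst());
        }
    }

    boolean isRunning(String task) {
        return running.contains(task);
    }

    boolean isPending(String task) {
        return pending.contains(task);
    }

    public static void main(String[] args) {
        FullQueueModel queue = new FullQueueModel(3);
        queue.request("Map0");      // long-running map
        queue.request("Reduce0");
        queue.request("Reduce1");
        queue.request("Map0 prime"); // speculative attempt, no free container

        System.out.println(queue.isRunning("Map0 prime")); // false
        queue.complete("Map0");
        System.out.println(queue.isRunning("Map0 prime")); // true
    }
}
```

In the real job the two reducers cannot finish before Map0 does, so no container frees up until the slow map itself is done, which is too late for the speculative attempt to help.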


If all of a queue's containers are occupied by running MapReduce task attempts (both maps and reduces) and a
speculative map attempt needs to run, would it be reasonable to have the MR ApplicationMaster (MRAM) kill one of
the reduce tasks so that the speculative map task can run?
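
One way to picture the proposal is sketched below. This is a hypothetical policy, not the MRAM's actual code: the class ReducerPreemptionSketch, the Attempt type, the progress values, and the choice of the least-progressed reducer as the victim are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical decision logic for freeing a container for a speculative map attempt.
class ReducerPreemptionSketch {
    enum Kind { MAP, REDUCE }

    // Minimal stand-in for a running task attempt (not a Hadoop type).
    static final class Attempt {
        final String id;
        final Kind kind;
        final double progress; // fraction of work done, 0.0 to 1.0

        Attempt(String id, Kind kind, double progress) {
            this.id = id;
            this.kind = kind;
            this.progress = progress;
        }
    }

    // Returns the reducer to kill, or empty if no kill is needed or possible.
    // Assumption: the reducer with the least progress loses the least work.
    static Optional<Attempt> pickReducerToKill(List<Attempt> running,
                                               int capacity,
                                               boolean speculativeMapPending) {
        boolean queueFull = running.size() >= capacity;
        if (!speculativeMapPending || !queueFull) {
            return Optional.empty();
        }
        return running.stream()
                .filter((Attempt a) -> a.kind == Kind.REDUCE)
                .min(Comparator.comparingDouble((Attempt a) -> a.progress));
    }

    public static void main(String[] args) {
        List<Attempt> running = Arrays.asList(
                new Attempt("Map0", Kind.MAP, 0.40),
                new Attempt("Reduce0", Kind.REDUCE, 0.22),
                new Attempt("Reduce1", Kind.REDUCE, 0.11));

        Optional<Attempt> victim = pickReducerToKill(running, 3, true);
        System.out.println(victim.map(a -> a.id).orElse("none")); // Reduce1
    }
}
```

The sketch only kills a reducer when the queue is full and a speculative map is actually waiting; if a container is free, or no reducer is running, it returns an empty result and nothing is killed.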

                
> Speculative Execution: Kill a reducer when queue is full and a mapper 
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3403
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3403
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: job submission
>    Affects Versions: 0.23.0
>         Environment: Hadoop version is: Hadoop 0.23.0.1110031628
> 10 node test cluster
>            Reporter: patrick white
>            Assignee: Eric Payne
>
> When forcing multiple reduce tasks to be launched by calling the setNumReduceTasks() method on a Job object, and
> running on input data which has one significantly longer map (and consequently reduce) task:
> - a speculative reduce task was not launched; even with a longer-running reducer, only 4 reduce tasks were launched
> - the speculative launch of map tasks was inhibited by the setNumReduceTasks() call, so even with
> -Dmapreduce.job.maps.speculative.execution=true only 4 map tasks were launched. The exact same code with the
> setNumReduceTasks() call removed, and on the same input data set, consistently launched 5 mappers as expected.
> Testing info:
> 3. Modified WordCount to force 4 reducers to be launched, by adding:
>     job.setNumReduceTasks(4); // hardwire 4 reducers for now
>     System.out.println("\nTESTDEBUG: using 4 reduce tasks for now\n\n");
> to the Job object. This causes 4 reduce tasks to be launched; oddly, though, it inhibits the speculative launch of
> the map task. So the same job code, without the setNumReduceTasks() call, will launch 5 mappers as described in
> case #2. When this call is added, that same job will launch only 4 mappers, as well as 4 reducers; otherwise the
> job completes successfully.
> output snippet with setNumReduceTasks():
>         org.apache.hadoop.mapreduce.JobCounter
>                 TOTAL_LAUNCHED_MAPS=4
>                 TOTAL_LAUNCHED_REDUCES=4
>                 RACK_LOCAL_MAPS=4
>                 SLOTS_MILLIS_MAPS=190787
>                 SLOTS_MILLIS_REDUCES=572554
