Posted to mapreduce-issues@hadoop.apache.org by "Maysam Yabandeh (JIRA)" <ji...@apache.org> on 2014/06/12 04:44:02 UTC

[jira] [Updated] (MAPREDUCE-5844) Reducer Preemption is too aggressive

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maysam Yabandeh updated MAPREDUCE-5844:
---------------------------------------

    Attachment: MAPREDUCE-5844.patch

Attaching the new patch, which also contains the unit test and the updated name for the conf param.

[~kasha], as per your suggestion, the visibility of quite a few members in the source code is relaxed (and tagged with @VisibleForTesting) to allow testing with reasonable complexity. The patch includes a test of preemptReducesIfNeeded covering the behavior both before and after the changes made by this jira.

[~jlowe], as per your suggestion, the conf param name is updated and documented in mapred-default.xml.
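
For reference, here is a minimal sketch of the kind of visibility relaxation described above. It assumes Guava's @VisibleForTesting annotation and uses purely illustrative class and member names, not the actual RMContainerAllocator code:

{code}
import com.google.common.annotations.VisibleForTesting;

public class Allocator {

  // Was private; relaxed to package-private and tagged so that a unit test
  // living in the same package can drive the preemption logic directly.
  @VisibleForTesting
  void preemptReducesIfNeeded() {
    // ... preemption logic exercised by the test ...
  }
}
{code}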

> Reducer Preemption is too aggressive
> ------------------------------------
>
>                 Key: MAPREDUCE-5844
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>         Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch
>
>
> We observed cases where reducer preemption makes the job finish much later, and the preemption does not seem to be necessary: after preemption, both the preempted reducer and the mapper are assigned immediately, meaning that there was already enough space for the mapper.
> The logic for triggering preemption is in RMContainerAllocator::preemptReducesIfNeeded.
> The preemption is triggered if the following is true:
> {code}
> headroom + am * |m| + pr * |r| < mapResourceRequest
> {code}
> where am is the number of assigned mappers, |m| is the mapper size, pr is the number of reducers being preempted, and |r| is the reducer size.
> The original idea apparently was that if the headroom is not big enough for the new mapper requests, reducers should be preempted. This would work if the job were alone in the cluster. Once we have queues, the headroom calculation becomes more complicated and would require a separate headroom calculation per queue/job.
> So, as a result, the headroom variable is effectively given up at the moment: *headroom is always set to 0*. What this implies is that preemption becomes very aggressive, not considering whether there is enough space for the mappers or not.
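
As a rough illustration of the condition quoted above, here is a small, self-contained sketch in Java. The method and parameter names are hypothetical (they are not the actual RMContainerAllocator fields); it just encodes the inequality headroom + am * |m| + pr * |r| < mapResourceRequest, with headroom effectively 0 as noted in the description:

{code}
public class PreemptionCheckSketch {

  /**
   * Returns true when reducer preemption would be triggered under the quoted
   * condition: headroom + assignedMaps*mapSize + preemptingReduces*reduceSize
   * is smaller than the outstanding map resource request.
   */
  static boolean shouldPreempt(long headroom,
                               int assignedMaps, long mapSize,
                               int preemptingReduces, long reduceSize,
                               long mapResourceRequest) {
    long coverable = headroom
        + (long) assignedMaps * mapSize
        + (long) preemptingReduces * reduceSize;
    return coverable < mapResourceRequest;
  }

  public static void main(String[] args) {
    // With headroom hard-coded to 0 (as the description notes), free cluster
    // space is ignored and preemption fires whenever the pending map request
    // exceeds what assigned maps plus already-preempting reducers account for.
    System.out.println(shouldPreempt(0L, 2, 1024L, 0, 2048L, 4096L)); // true
  }
}
{code}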



--
This message was sent by Atlassian JIRA
(v6.2#6252)