Posted to mapreduce-issues@hadoop.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2012/07/26 18:29:35 UTC

[jira] [Updated] (MAPREDUCE-4487) Reduce job latency by removing hardcoded sleep statements

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated MAPREDUCE-4487:
---------------------------------

    Attachment: MAPREDUCE-4487.patch

Here's a patch that removes sleeps (or improves their usage) in three places:

* In ReduceTask's fetchOutputs(), if there are no map outputs in flight or scheduled, the thread sleeps for five seconds. Replacing this sleep with a wait that is notified when new map outputs become available lets the fetch loop resume as soon as there is work to do.
* In ReduceTask's fetchOutputs(), when all the output has been fetched, there is a join on GetMapEventsThread, which may be sleeping (for 1s). Replacing that sleep with a wait/notify lets the join return without waiting out the remainder of the sleep.
* In Child's main loop, while waiting for tasks from the parent tasktracker, the thread sleeps for 0.5s initially, then for 1.5s if there haven't been any tasks for a while. Replacing this with a finer-grained exponential backoff improves responsiveness.
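
To illustrate the two techniques named above, here is a minimal sketch. It is not the code from the attached patch: the class names MapOutputQueue and Backoff, their methods, and the delay values used below are all hypothetical, chosen only to show the pattern of replacing a fixed sleep with wait/notify and with a doubling backoff.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical stand-in for "map outputs available" signalling; not Hadoop code. */
final class MapOutputQueue {
    private final Queue<String> pending = new ArrayDeque<String>();
    private boolean done = false;

    /** Producer side: called when a new map output location becomes known. */
    synchronized void add(String location) {
        pending.add(location);
        notifyAll(); // wake the waiting fetcher now, rather than after a fixed sleep
    }

    /** Producer side: signals that no more outputs will arrive. */
    synchronized void finish() {
        done = true;
        notifyAll();
    }

    /** Consumer side: blocks until an output is available; null once finished and drained. */
    synchronized String take() {
        while (pending.isEmpty() && !done) { // loop guards against spurious wakeups
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return null;
            }
        }
        return pending.poll();
    }
}

/** Hypothetical exponential backoff: the delay doubles on each idle poll, up to a cap. */
final class Backoff {
    private final long initialMs;
    private final long maxMs;
    private long currentMs;

    Backoff(long initialMs, long maxMs) {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
        this.currentMs = initialMs;
    }

    /** Returns the delay to sleep now and doubles the next one, capped at maxMs. */
    long nextDelayMs() {
        long delay = currentMs;
        currentMs = Math.min(maxMs, currentMs * 2);
        return delay;
    }

    /** Call after work arrives so the next idle period starts with short delays again. */
    void reset() {
        currentMs = initialMs;
    }
}
```

With an assumed initial delay of 10ms and a cap of 1500ms, an idle loop would sleep 10, 20, 40, ... ms instead of a flat 500ms, so a task that arrives early is picked up sooner while a long idle period still settles at the cap.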

I ran some tests to investigate the effect of these changes. I ran a sleep job that sleeps for 1ms ({{bin/hadoop jar hadoop-*examples*jar sleep -m 1 -r 1 -mt 1 -rt 1}}) and measured the job execution time (on a single node cluster). Without the patch the mean time was 12.97s (over 10 runs, sd 0.53), and with the patch it was 9.109s (sd 1.0), which is about 3.9s (roughly 30%) faster.
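
One rough way to check the size of that difference from the reported figures alone is sketched below. It assumes that both configurations were run 10 times and that the reported sd values are sample standard deviations; neither is stated explicitly above. The class LatencyComparison and its methods are hypothetical helpers, and Welch's t statistic is my choice of test, not something used in the original measurements.

```java
/** Hypothetical helper for comparing the two reported timings; not part of the patch. */
final class LatencyComparison {

    /** Fraction by which the mean time dropped, e.g. 0.25 means 25% faster. */
    static double relativeReduction(double beforeSec, double afterSec) {
        return (beforeSec - afterSec) / beforeSec;
    }

    /** Welch's t statistic for two samples described by mean, standard deviation and size. */
    static double welchT(double mean1, double sd1, int n1,
                         double mean2, double sd2, int n2) {
        double standardError = Math.sqrt(sd1 * sd1 / n1 + sd2 * sd2 / n2);
        return (mean1 - mean2) / standardError;
    }

    public static void main(String[] args) {
        // Figures from the message; n = 10 for the patched runs is an assumption.
        double reduction = relativeReduction(12.97, 9.109);
        double t = welchT(12.97, 0.53, 10, 9.109, 1.0, 10);
        System.out.printf("reduction: %.1f%%%n", reduction * 100);
        System.out.printf("Welch t:   %.2f%n", t);
    }
}
```

Under those assumptions the reduction comes out at about 30% and the t statistic near 10.8, which suggests the gap is much larger than the run-to-run variation.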
                
> Reduce job latency by removing hardcoded sleep statements
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4487
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4487
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1, performance
>    Affects Versions: 1.0.3
>            Reporter: Tom White
>            Assignee: Tom White
>         Attachments: MAPREDUCE-4487.patch
>
>
> There are a few places in MapReduce where there are hardcoded sleep statements. By replacing them with wait/notify or similar, it's possible to reduce latency for short-running jobs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira