
Posted to mapreduce-dev@hadoop.apache.org by "Todd Lipcon (Created) (JIRA)" <ji...@apache.org> on 2011/10/27 06:43:32 UTC

[jira] [Created] (MAPREDUCE-3278) 0.20: avoid a busy-loop in ReduceTask scheduling

0.20: avoid a busy-loop in ReduceTask scheduling
------------------------------------------------

                 Key: MAPREDUCE-3278
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3278
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv1, performance, task
    Affects Versions: 0.20.205.0
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon


Profiling showed that the ReduceTask contains the following busy-loop, which caused it to consume 100% of CPU during the fetch phase in some configurations:
- the number of reduce fetcher threads is configured to be greater than the number of hosts
- therefore "busyEnough()" never returns true
- the "scheduling" portion of the code can't schedule any new fetches, since all of the pending fetches in the mapLocations buffer correspond to hosts that are already being fetched (the hosts are in the {{uniqueHosts}} map)
- {{getCopyResult()}} immediately returns null, since there are no completed maps.
Hence ReduceTask spins back and forth between trying to schedule fetches (and failing) and trying to grab completed results (of which there are none), with no waits in between.
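
The loop shape described above can be illustrated with a small standalone sketch. This is not the ReduceTask code and not the attached patch: the class FetchLoopSketch, its methods, and the use of a BlockingQueue for copy results are all hypothetical names and choices made for illustration only. It contrasts a non-blocking poll (the spin) with one generic remedy, a bounded wait on the result queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; names do not correspond to the real ReduceTask code. */
public class FetchLoopSketch {

    /**
     * Runs a scheduling loop until one copy result arrives and returns the
     * number of loop iterations. With waitMillis == 0 the loop polls without
     * blocking (the spin described above); with waitMillis > 0 it blocks on
     * the result queue for up to that long per iteration.
     */
    static long runUntilResult(BlockingQueue<String> results, long waitMillis)
            throws InterruptedException {
        long iterations = 0;
        while (true) {
            iterations++;
            // Scheduling step (assumed no-op here): nothing can be scheduled
            // because every pending fetch targets a host already being fetched.
            String result = (waitMillis == 0)
                    ? results.poll()                                   // returns null immediately
                    : results.poll(waitMillis, TimeUnit.MILLISECONDS); // waits for a result or timeout
            if (result != null) {
                return iterations;
            }
        }
    }

    /** Simulates one fetcher thread that completes a copy after delayMillis. */
    public static long simulate(long delayMillis, long waitMillis) {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        Thread fetcher = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
                results.put("map-output-0");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();
        try {
            long iterations = runUntilResult(results, waitMillis);
            fetcher.join();
            return iterations;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a copy result", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("non-blocking poll iterations: " + simulate(200, 0));
        System.out.println("bounded-wait iterations:      " + simulate(200, 5000));
    }
}
```

In the non-blocking case the loop runs many times while the simulated fetch is in flight, which is the CPU burn visible in a profile; with a bounded wait the thread sleeps until a result is available or the timeout expires.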

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (MAPREDUCE-3278) 0.20: avoid a busy-loop in ReduceTask scheduling

Posted by "Todd Lipcon (Resolved) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved MAPREDUCE-3278.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.20.206.0
     Hadoop Flags: Reviewed

Committed to branch-0.20-security. Thanks, Eli.
                
> 0.20: avoid a busy-loop in ReduceTask scheduling
> ------------------------------------------------
>
>                 Key: MAPREDUCE-3278
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3278
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1, performance, task
>    Affects Versions: 0.20.205.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.20.206.0
>
>         Attachments: mr-3278.txt, reducer-cpu-usage.png
>
>
