Posted to mapreduce-issues@hadoop.apache.org by "Arun C Murthy (Updated) (JIRA)" <ji...@apache.org> on 2012/02/23 16:18:49 UTC

[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks

     [ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-3902:
-------------------------------------

    Attachment: MAPREDUCE-3902.patch

Ok, I spent a long (isolated) flight on this - it clearly needs more work, but it's a start. *smile*

This patch improves on classic JVM re-use along both dimensions described in the jira: data-locality and the new shuffle.

We need to pay more attention to the user interface. Some options:
# Allow the user to specify the actual number of map slots to be used (supported now, in the patch)
# Allow the user to specify a target block size for maps (greater than the real HDFS block size), i.e. to get around the small-files problem.
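
To make the second option concrete, here is a minimal sketch of how a target size larger than the HDFS block size could turn many small inputs into fewer map inputs. It is not taken from the patch: the class TargetSplitSketch, the method pack and the greedy grouping strategy are all assumptions made for illustration, and it ignores data-locality entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from MAPREDUCE-3902.patch.
public class TargetSplitSketch {

    /**
     * Greedily packs input lengths (in bytes) into groups whose total size
     * stays at or below targetBytes. An input larger than targetBytes ends up
     * in a group of its own; this sketch never splits a single input.
     */
    public static List<List<Long>> pack(List<Long> inputLengths, long targetBytes) {
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long len : inputLengths) {
            // Close the current group if adding this input would exceed the target.
            if (!current.isEmpty() && currentBytes + len > targetBytes) {
                groups.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(len);
            currentBytes += len;
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Five small inputs and an assumed 128 MB target (both made-up values).
        List<Long> inputs = List.of(10 * mb, 20 * mb, 100 * mb, 5 * mb, 200 * mb);
        List<List<Long>> groups = pack(inputs, 128 * mb);
        System.out.println(groups.size() + " map inputs instead of " + inputs.size());
    }
}
```

With the made-up sizes above, the five inputs collapse into three groups. A real implementation would also have to weigh locality when deciding which inputs to group together.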

Thoughts?
                
> MR AM should reuse containers for map tasks
> -------------------------------------------
>
>                 Key: MAPREDUCE-3902
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>         Attachments: MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks. This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps) 
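
As a rough illustration of the first point in the description (data-locality when re-using containers), the sketch below picks the next map task for a container that has just become free. It is not the AM's actual logic: LocalityReuseSketch, PendingMap and pickNext are hypothetical names, and the sketch only distinguishes node-local from non-local tasks, with no notion of rack-locality.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not the MR AM's real scheduling code.
public class LocalityReuseSketch {

    /** A pending map task: an id plus the hosts that hold its input split. */
    public static final class PendingMap {
        public final String id;
        public final Set<String> hosts;

        public PendingMap(String id, String... hosts) {
            this.id = id;
            this.hosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    /**
     * Removes and returns the task to run next in a container freed on
     * containerHost. Prefers the first pending task with input on that host,
     * otherwise falls back to the oldest pending task. Returns null when
     * nothing is pending.
     */
    public static PendingMap pickNext(List<PendingMap> pending, String containerHost) {
        if (pending.isEmpty()) {
            return null;
        }
        for (int i = 0; i < pending.size(); i++) {
            if (pending.get(i).hosts.contains(containerHost)) {
                return pending.remove(i); // node-local match
            }
        }
        return pending.remove(0); // no local task: reuse the container anyway
    }
}
```

Whether falling back to a non-local task is better than releasing the container and requesting a new one is a policy choice that this sketch does not try to answer.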
