Posted to mapreduce-issues@hadoop.apache.org by "Tsuyoshi OZAWA (JIRA)" <ji...@apache.org> on 2012/09/03 16:25:09 UTC

[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447292#comment-13447292 ] 

Tsuyoshi OZAWA commented on MAPREDUCE-3902:
-------------------------------------------

Thanks for enumerating the remaining tasks, Siddharth. I'll support you as far as possible.

I have not yet explained the relationship between the container-reuse work and MAPREDUCE-4502, which may have caused confusion. I'm sorry for the lack of explanation; here it is briefly. I'm planning to implement MAPREDUCE-4502 and MAPREDUCE-4525 on top of the container-reuse implementation, because the MRAppMaster in that implementation already monitors whether the task running on a container is "the last task at a machine", for the purpose of exiting the JVMs on the containers, as you know. This is very similar to monitoring task progress per container in order to decide when to start running the combiner for multi-level aggregation (MAPREDUCE-4502 and MAPREDUCE-4525).
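
For illustration only, the "last task at a machine" check can be pictured as a per-host counter of outstanding tasks. This is a minimal sketch under stated assumptions, not code from the MAPREDUCE-3902 patches or from MRAppMaster: the class name HostTaskTracker and the methods taskAssigned and taskFinished are hypothetical, and the real bookkeeping may be organized differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Hadoop): counts outstanding tasks per host. */
public class HostTaskTracker {
    // host name -> number of tasks assigned to that host and not yet finished
    private final Map<String, Integer> remaining = new HashMap<>();

    /** Record that a task has been assigned to a container on the given host. */
    public void taskAssigned(String host) {
        remaining.merge(host, 1, Integer::sum);
    }

    /**
     * Record that a task on the given host has finished.
     *
     * @return true if it was the last outstanding task on that host, i.e. the
     *         point at which a JVM could exit or a per-host combiner could start
     */
    public boolean taskFinished(String host) {
        Integer left = remaining.get(host);
        if (left == null) {
            throw new IllegalStateException("no outstanding task on " + host);
        }
        if (left == 1) {
            remaining.remove(host);
            return true;
        }
        remaining.put(host, left - 1);
        return false;
    }
}
```

In this sketch the same signal (taskFinished returning true) could serve either purpose described above: shutting down the reused JVM, or triggering a combiner run over all map output on that host.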
                                                                                                                                                                                    
None of this is documented yet, so I'll write up my thoughts as a design note for MAPREDUCE-4502 and MAPREDUCE-4525 within the next week. I would very much appreciate it if you could review it.

Thanks,
Tsuyoshi

> MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3902
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Arun C Murthy
>            Assignee: Siddharth Seth
>         Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks. This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps): MAPREDUCE-4525
