Posted to mapreduce-issues@hadoop.apache.org by "Avner BenHanoch (Updated) (JIRA)" <ji...@apache.org> on 2012/04/05 17:46:32 UTC

[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avner BenHanoch updated MAPREDUCE-4049:
---------------------------------------

    Attachment: test.diff
                src.tgz
                mapred-site.xml
                mapred.diff

Attached is a fix that adds support for shuffle consumer and shuffle provider plugins in Hadoop.

The fix is intended for review.
I am already aware of two issues:
 1. I made several members of ReduceTask public to give ShuffleConsumerPlugin implementations easy access.  I will improve this in future commits.

 2. ShuffleProviderPlugin currently has very few members, so it does not yet support security-aware plugins.  I will improve this in future commits.

Technical info:
============
 1. The fix contains 2 new source files (split out of the large ReduceTask.java file) and changes to a few existing files (one of which is a unit test file).

 2. The fix is based on Hadoop 1.0.0.  After your comments, I'll be glad to provide an enhanced fix for future 1.x.y versions (as well as for 2.x).

 3. The fix is provided in two formats: a) as 2 diff files (src & test), and b) as source files in src.tgz.
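
For readers who have not opened the attachments, the general idea of a configurable shuffle consumer can be sketched as below. This is a minimal illustration under stated assumptions, not the attached code: the interface methods, the DefaultHttpConsumer class, and the example.shuffle.consumer.plugin.class key are all hypothetical names, and java.util.Properties stands in for Hadoop's configuration object so that the sketch compiles without Hadoop on the classpath.

```java
import java.util.Properties;

public class ShufflePluginSketch {

    /** Hypothetical consumer-side contract; the interface in the attached patch may differ. */
    public interface ShuffleConsumerPlugin {
        void init(Properties conf);   // receive configuration before any fetching starts
        boolean fetchOutputs();       // fetch map outputs; true on success
        void close();                 // release resources
    }

    /** Hypothetical stand-in for the built-in HTTP shuffle; it does no real I/O. */
    public static class DefaultHttpConsumer implements ShuffleConsumerPlugin {
        private boolean initialized;

        @Override public void init(Properties conf) { initialized = true; }
        @Override public boolean fetchOutputs() { return initialized; }
        @Override public void close() { initialized = false; }
    }

    /** Hypothetical configuration key, not a real Hadoop property name. */
    static final String PLUGIN_KEY = "example.shuffle.consumer.plugin.class";

    /** Instantiates the configured plugin by reflection, falling back to the default. */
    public static ShuffleConsumerPlugin load(Properties conf) {
        String className = conf.getProperty(PLUGIN_KEY, DefaultHttpConsumer.class.getName());
        try {
            Class<? extends ShuffleConsumerPlugin> cls =
                    Class.forName(className).asSubclass(ShuffleConsumerPlugin.class);
            ShuffleConsumerPlugin plugin = cls.getDeclaredConstructor().newInstance();
            plugin.init(conf);
            return plugin;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load shuffle plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        // With no key set, the default consumer is selected.
        ShuffleConsumerPlugin plugin = load(new Properties());
        System.out.println(plugin.getClass().getSimpleName());
        plugin.close();
    }
}
```

In a sketch like this, an alternative transport (for example an RDMA-based consumer) would be selected by setting the key to another class name, with no change to the calling code.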

                
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task, tasktracker
>    Affects Versions: 0.23.1, 1.0.1
>            Reporter: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>         Attachments: Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, mapred-site.xml, mapred.diff, src.tgz, test.diff
>
>
> Support a generic shuffle service as a set of two plugins: ShuffleProvider & ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10GbE, 40GbE, or InfiniBand) instead of using the current HTTP shuffle. Building on the fast RDMA shuffle, the plugin can also use a suitable merge approach during the intermediate merges, and so achieve much better performance.
> # Satisfy MAPREDUCE-3060 - a generic shuffle service that avoids a hidden dependency of the NodeManager on a specific version of the MapReduce shuffle (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with a suggested top-level design for both plugins (currently based on the 1.0 branch)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira