
Posted to mapreduce-issues@hadoop.apache.org by "Luke Lu (JIRA)" <ji...@apache.org> on 2012/12/11 19:55:21 UTC

[jira] [Commented] (MAPREDUCE-4586) Reduce large output segments directly from remote host

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529217#comment-13529217 ] 

Luke Lu commented on MAPREDUCE-4586:
------------------------------------

A separate namespace for this data (and hence separate namenodes in HDFS2) would help as well.
                
> Reduce large output segments directly from remote host
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-4586
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4586
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: task
>            Reporter: Chris Douglas
>
> For some jobs, copying large output segments to the local host is inefficient. The reduce can construct iterators on remote hosts, provided the stream is restartable. This should reduce task latency by amortizing the cost of the data transfer over the entire reduce, rather than paying it upfront.
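
As a rough illustration of the "restartable stream" idea in the description above, here is a minimal Python sketch. It is not taken from any MAPREDUCE-4586 patch or from Hadoop's shuffle code. The names RestartableStream, open_at, iter_chunks and demo are all hypothetical, and the sketch assumes the remote host can reopen a segment at an arbitrary byte offset.

```python
import io
from typing import BinaryIO, Callable, Iterator


class RestartableStream:
    """Reads a segment, reopening at the current offset after a failure.

    `open_at` is a hypothetical callback that opens the segment starting
    at the given byte offset (for example, a ranged fetch from the host
    that holds the map output).
    """

    def __init__(self, open_at: Callable[[int], BinaryIO], max_retries: int = 3):
        self._open_at = open_at
        self._max_retries = max_retries
        self._offset = 0      # bytes already handed to the caller
        self._stream = None   # opened lazily on the first read

    def read(self, n: int) -> bytes:
        retries = 0
        while True:
            try:
                if self._stream is None:
                    self._stream = self._open_at(self._offset)
                data = self._stream.read(n)
                self._offset += len(data)
                return data
            except OSError:
                # Drop the broken stream; the next attempt resumes at _offset.
                self._stream = None
                retries += 1
                if retries > self._max_retries:
                    raise


def iter_chunks(stream: RestartableStream, chunk_size: int) -> Iterator[bytes]:
    """Yields chunks lazily, so transfer cost is paid as the data is consumed."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def demo() -> bytes:
    """Uses an in-memory buffer as a stand-in for a remote segment."""
    segment = b"key1\tval1\nkey2\tval2\n"
    stream = RestartableStream(lambda offset: io.BytesIO(segment[offset:]))
    return b"".join(iter_chunks(stream, 8))
```

Because the iterator pulls bytes only as the reduce consumes them, nothing is copied to local disk first. A real implementation would also have to deal with record boundaries, checksums and the merge across segments, none of which is shown here.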
