Posted to common-dev@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2007/05/30 01:16:15 UTC

[jira] Commented: (HADOOP-939) No-sort optimization

    [ https://issues.apache.org/jira/browse/HADOOP-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499965 ] 

Owen O'Malley commented on HADOOP-939:
--------------------------------------

Doug Judd,
    Has the recent change to support reduces = 0 addressed your need? If you set the number of reduces to 0, the output collector is fed directly from the Mapper output, so no reduce phase runs. If the map output is already sorted, this saves all of the costs associated with the shuffle and the distributed sort.

> No-sort optimization
> --------------------
>
>                 Key: HADOOP-939
>                 URL: https://issues.apache.org/jira/browse/HADOOP-939
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>         Environment: all
>            Reporter: Doug Judd
>
> There should be a way to tell the mapred framework that the output of the map() phase will already be sorted. The Reduce phase could then simply merge the intermediate files together without sorting them.
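
To illustrate the request above: if every intermediate file is already sorted, the reduce side needs only a k-way merge, not a full sort. The following is a minimal standalone sketch of that idea, not Hadoop code. The class name SortedRunMerger, the Head helper, and the use of in-memory lists of String keys in place of intermediate files are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only: merges pre-sorted runs without re-sorting them.
public class SortedRunMerger {

    // One heap entry per run: the run's current smallest key and the rest of the run.
    private static final class Head implements Comparable<Head> {
        final String key;
        final Iterator<String> rest;

        Head(String key, Iterator<String> rest) {
            this.key = key;
            this.rest = rest;
        }

        @Override
        public int compareTo(Head other) {
            return key.compareTo(other.key);
        }
    }

    // Each input run must already be sorted ascending; the result is globally sorted.
    public static List<String> merge(List<List<String>> sortedRuns) {
        PriorityQueue<Head> heap = new PriorityQueue<>();
        for (List<String> run : sortedRuns) {
            Iterator<String> it = run.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // Take the smallest head across all runs, then advance that run.
            Head smallest = heap.poll();
            merged.add(smallest.key);
            if (smallest.rest.hasNext()) {
                heap.add(new Head(smallest.rest.next(), smallest.rest));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
            Arrays.asList("apple", "kiwi", "pear"),
            Arrays.asList("banana", "cherry"),
            Arrays.asList("fig", "zucchini"));
        // prints [apple, banana, cherry, fig, kiwi, pear, zucchini]
        System.out.println(merge(runs));
    }
}
```

With k runs and n keys in total, the merge costs O(n log k) comparisons and keeps only k entries in the heap at any time, which is why skipping the sort is attractive when the map output is known to be ordered.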
