Posted to mapreduce-issues@hadoop.apache.org by "YangLai (JIRA)" <ji...@apache.org> on 2009/12/05 02:51:21 UTC

[jira] Commented: (MAPREDUCE-956) Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)

    [ https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786303#action_12786303 ] 

YangLai commented on MAPREDUCE-956:
-----------------------------------

I have a scenario in which the output of the shuffle phase is exactly what I want, so the sort and reduce phases are unnecessary for me and add a lot of overhead. I don't know how to get the output of the shuffle phase in Hadoop 0.19.1 or 0.20.1. Maybe the sort phase should be optional for developers.

> Shuffle should be broken down to only two phases (copy/reduce) instead of three (copy/sort/reduce)
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
>
> For progress calculation and display in the UI, shuffle, in its current form, is decomposed into three phases (copy/sort/reduce). Actually, the sort phase is no longer applicable. I think we should just reduce the number of phases to two and assign 50% weightage to each of the copy and reduce phases. Thoughts?
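
To illustrate the arithmetic behind the proposal quoted above, here is a minimal sketch of equally weighted phase progress. It is not Hadoop's code: the function name overall_progress is hypothetical, and the equal one-third weighting of the three-phase scheme is an assumption made for comparison with the proposed 50/50 split.

```python
def overall_progress(phase_fractions):
    """Return overall progress as the equally weighted mean of per-phase
    completion fractions (each in [0.0, 1.0]).

    Hypothetical helper for illustration only; not part of Hadoop.
    """
    if not phase_fractions:
        raise ValueError("at least one phase is required")
    for fraction in phase_fractions:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("phase fractions must lie in [0.0, 1.0]")
    return sum(phase_fractions) / len(phase_fractions)


# Copy has finished and reduce has not started yet.
# Assumed three-phase scheme (copy/sort/reduce), one third each:
three_phase = overall_progress([1.0, 0.0, 0.0])

# Proposed two-phase scheme (copy/reduce), one half each:
two_phase = overall_progress([1.0, 0.0])

# If the sort phase does no real work, it completes immediately, so the
# three-phase figure jumps by a full third without any visible activity:
three_phase_after_sort = overall_progress([1.0, 1.0, 0.0])
```

Under these assumptions, a finished copy phase reports roughly 33% in the three-phase scheme and 50% in the two-phase scheme. The instantaneous sort step then lifts the three-phase figure to about 67% before any reduce work has run.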
