Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/05/25 22:25:30 UTC
[jira] Commented: (HADOOP-254) use http to shuffle data between the
maps and the reduces
[ http://issues.apache.org/jira/browse/HADOOP-254?page=comments#action_12413298 ]
Doug Cutting commented on HADOOP-254:
-------------------------------------
This looks great!
A few suggested improvements:
1. In MapOutputLocation.getFile(), shouldn't the resources it opens be closed in a 'finally' clause?
2. Does MapOutputFile still need to be a Writable? I don't think so. We should remove its write() and readFields() implementations, along with any other methods that are no longer called.
3. Do we have any way to detect when map outputs are lost or corrupted? That was a useful mechanism that I'd hate to lose.
4. Sameer promised that you'd remove RPC.callRaw() in this patch.
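
To illustrate point 1, here is a minimal sketch of the close-in-'finally' pattern. It is not the code from http-shuffle.patch: the class name FinallyCloseSketch and the helper copyAndClose() are hypothetical, and the sketch assumes getFile() boils down to copying bytes from an input stream to an output stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; not taken from the HADOOP-254 patch.
class FinallyCloseSketch {

  /**
   * Copies every byte from in to out and returns the number of bytes copied.
   * Both streams are closed even if the copy throws part way through.
   */
  static long copyAndClose(InputStream in, OutputStream out) throws IOException {
    long total = 0;
    try {
      byte[] buffer = new byte[64 * 1024];
      int n;
      // read() returns -1 at end of stream
      while ((n = in.read(buffer)) != -1) {
        out.write(buffer, 0, n);
        total += n;
      }
    } finally {
      // Nested try/finally so that a failure closing 'in'
      // does not prevent 'out' from being closed.
      try {
        in.close();
      } finally {
        out.close();
      }
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    // Stand-ins for the HTTP response stream and the local file stream.
    InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long copied = copyAndClose(in, out);
    System.out.println("copied " + copied + " bytes");
  }
}
```

Without the 'finally', an IOException in the middle of the copy would skip the close() calls and leak the connection and the file handle. One caveat of this sketch: if both close() calls throw, only the exception from the second one propagates.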
> use http to shuffle data between the maps and the reduces
> ---------------------------------------------------------
>
> Key: HADOOP-254
> URL: http://issues.apache.org/jira/browse/HADOOP-254
> Project: Hadoop
> Type: Improvement
> Components: mapred
> Versions: 0.2.1
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
> Fix For: 0.3
> Attachments: http-shuffle.patch
>
> To speed up the shuffle time, I'll use http (via the task tracker's jetty server) to send the map outputs.
--
This message is automatically generated by JIRA.