Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2008/04/28 15:32:55 UTC

[jira] Commented: (HADOOP-3297) The way in which ReduceTask/TaskTracker gets completion events during shuffle can be improved

    [ https://issues.apache.org/jira/browse/HADOOP-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592806#action_12592806 ] 

Devaraj Das commented on HADOOP-3297:
-------------------------------------

Here is a proposal after a discussion with Sameer:
1) The TaskTracker polls the JobTracker for up to 500 task completion events per RPC. If it gets the full payload, it immediately asks for another batch of 500, and so on. When it gets fewer than 500, it switches back to the current behavior: sleep for a fixed amount of time (the heartbeat interval). Keeping the number of events per RPC small ensures that each RPC takes less time, although the number of RPCs would be higher.
2) The Task asks the TaskTracker for up to 10000 events at a time, once every second.
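
To make point 1 concrete, here is a rough sketch of the "keep fetching while batches come back full" loop. It is written in Python for brevity and is not Hadoop code: fetch(), poll_until_short_batch() and make_fake_jobtracker() are hypothetical names standing in for the TaskTracker-to-JobTracker RPC, and only the batch size of 500 comes from the proposal.

```python
from typing import Callable, List, Tuple

BATCH_SIZE = 500  # events per TaskTracker -> JobTracker RPC, as proposed above


def poll_until_short_batch(
    fetch: Callable[[int, int], List[str]],
    from_index: int,
    batch_size: int = BATCH_SIZE,
) -> Tuple[List[str], int]:
    """Fetch batches back to back until one comes back short.

    fetch(start, max_events) is a hypothetical stand-in for the RPC.
    Returns the events collected and the number of RPCs issued.
    """
    collected: List[str] = []
    rpcs = 0
    while True:
        batch = fetch(from_index + len(collected), batch_size)
        rpcs += 1
        collected.extend(batch)
        if len(batch) < batch_size:
            # Short batch: the caller would now sleep for the
            # heartbeat interval before polling again.
            return collected, rpcs


def make_fake_jobtracker(total_events: int) -> Callable[[int, int], List[str]]:
    """Build an in-memory stub that serves events by index (illustration only)."""
    events = [f"event-{i}" for i in range(total_events)]

    def fetch(start: int, max_events: int) -> List[str]:
        return events[start:start + max_events]

    return fetch
```

With 1200 events pending, this loop issues three RPCs (500, 500, 200) before it would go back to sleeping. Note that when the pending count is an exact multiple of 500, one extra RPC returning an empty batch is needed to detect the end.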

> The way in which ReduceTask/TaskTracker gets completion events during shuffle can be improved
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3297
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3297
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.18.0
>
>
> Certain things like poll frequency, number of events fetched in one go, etc. can probably be tuned to improve shuffle performance. This would affect the shuffle-related task-->tasktracker and tasktracker-->jobtracker RPCs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.