Posted to common-dev@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2009/06/05 09:06:07 UTC

[jira] Commented: (HADOOP-5090) The capacity-scheduler should assign multiple tasks per heartbeat

    [ https://issues.apache.org/jira/browse/HADOOP-5090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716526#action_12716526 ] 

Arun C Murthy commented on HADOOP-5090:
---------------------------------------

I'd strongly urge *against* assigning multiple reduces per heartbeat. When I did it in HADOOP-3136 it caused _bad_ imbalances with reduces... e.g. consider 2 jobs - one with 'small' reduces and the other with 'heavy' reduces. If we assign multiple reduces per heartbeat, then a portion of the cluster (tasktrackers) will run only the 'small' reduces and the rest will run only the 'heavy' reduces, leading to bad imbalances in load across the machines. That is why, with HADOOP-3136, we decided to assign only 1 reduce per heartbeat - to achieve better load balance.
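The imbalance can be illustrated with a toy simulation (this is not Hadoop code - the queue model, costs, and tracker counts below are all hypothetical): two jobs' reduces sit adjacent in the pending queue, and each tasktracker heartbeat pulls tasks from the front. Pulling one reduce per heartbeat interleaves the jobs across trackers; pulling several at once lets some trackers grab only 'heavy' reduces.

```python
# Toy simulation (hypothetical model, not actual scheduler code):
# compare assigning 1 vs. 4 reduces per heartbeat when two jobs
# with very different reduce costs share the cluster.
import itertools

def assign(reduces, trackers, per_heartbeat):
    """Round-robin heartbeats; each tracker takes up to
    `per_heartbeat` reduces from the front of the shared queue.
    Returns the total cost assigned to each tracker."""
    load = [0] * trackers
    queue = list(reduces)
    for t in itertools.cycle(range(trackers)):
        if not queue:
            break
        for _ in range(per_heartbeat):
            if not queue:
                break
            load[t] += queue.pop(0)
    return load

# Job A: 8 'small' reduces (cost 1); Job B: 8 'heavy' reduces (cost 10).
# Reduces from the same job are adjacent in the queue.
reduces = [1] * 8 + [10] * 8
print(assign(reduces, trackers=4, per_heartbeat=1))  # [22, 22, 22, 22]
print(assign(reduces, trackers=4, per_heartbeat=4))  # [4, 4, 40, 40]
```

With one reduce per heartbeat every tracker ends up with the same mix of small and heavy reduces; with four per heartbeat, half the trackers carry 10x the load of the other half.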

> The capacity-scheduler should assign multiple tasks per heartbeat
> -----------------------------------------------------------------
>
>                 Key: HADOOP-5090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5090
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Arun C Murthy
>            Assignee: Vinod K V
>            Priority: Critical
>         Attachments: HADOOP-5090-20090504.txt, HADOOP-5090-20090506.txt, HADOOP-5090-20090604.txt
>
>
> HADOOP-3136 changed the default o.a.h.mapred.JobQueueTaskScheduler to assign multiple tasks per TaskTracker heartbeat, the capacity-scheduler should do the same.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.