Posted to common-dev@hadoop.apache.org by "Vinod K V (JIRA)" <ji...@apache.org> on 2009/05/29 09:02:45 UTC

[jira] Updated: (HADOOP-5884) Capacity scheduler should account high memory jobs as using more capacity of the queue

     [ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5884:
------------------------------

    Attachment: HADOOP-5884-20090529.1.txt

The proposal is to track capacities and user-limits by the number of slots occupied by the tasks of a job instead of the number of running tasks.
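For illustration, a minimal sketch of the proposed accounting (hypothetical names, not the actual CapacityTaskScheduler code): a task's slot footprint is its memory requirement divided by the configured per-slot memory, rounded up.

    // Hypothetical sketch of the proposed slot accounting; not the actual
    // scheduler code. A task needing more memory than one slot provides is
    // counted as occupying multiple slots.
    public final class SlotAccounting {

      // Slots occupied by one task, proportional to the configured
      // per-slot (default) memory.
      public static int slotsPerTask(long taskMemoryMB, long memoryPerSlotMB) {
        if (memoryPerSlotMB <= 0) {
          throw new IllegalArgumentException("memoryPerSlotMB must be positive");
        }
        // Round up: a 3 GB task on 2 GB slots still blocks two slots.
        return (int) ((taskMemoryMB + memoryPerSlotMB - 1) / memoryPerSlotMB);
      }

      public static void main(String[] args) {
        // A job asking for 4096 MB per task on 2048 MB slots is accounted
        // as using 2 slots per running task, not 1 task.
        System.out.println(slotsPerTask(4096, 2048)); // prints 2
      }
    }

With this, capacity and user-limit usage would grow by the task's slot footprint for every scheduled task of a high memory job, rather than by 1.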

Attaching a patch implementing this. It has to be applied over the latest patch for HADOOP-5932. The patch does the following:
 - Modifies all the calculations of capacities and user-limits to be based on the number of slots occupied by the running tasks of a job (a rough sketch follows after this list)
 - Retains the number of running tasks for display on the UI
 - Adds test-cases to verify the number of slots accounted for high memory jobs, by modifying the corresponding tests
 - Adds test-cases to verify the newly added "occupied slots" field in the scheduling information
 - Adds missing @Override tags, removes stale imports and stale occurrences of gc (guaranteed capacity)

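As a rough sketch of how the capacity and user-limit calculations in the first item could look when driven by occupied slots rather than running-task counts (hypothetical class and field names, not taken from the patch):

    // Hypothetical sketch only; not the actual CapacityTaskScheduler code.
    import java.util.HashMap;
    import java.util.Map;

    class QueueSlotUsage {
      private final int capacityInSlots;   // queue capacity, in slots
      private int occupiedSlots;           // slots held by all running tasks
      private final Map<String, Integer> slotsPerUser =
          new HashMap<String, Integer>();

      QueueSlotUsage(int capacityInSlots) {
        this.capacityInSlots = capacityInSlots;
      }

      // Would scheduling a task that needs 'taskSlots' keep this user within limit?
      boolean withinUserLimit(String user, int taskSlots, int userLimitPercent) {
        Integer used = slotsPerUser.get(user);
        int userSlots = (used == null) ? 0 : used.intValue();
        int maxSlotsForUser = (capacityInSlots * userLimitPercent) / 100;
        return userSlots + taskSlots <= maxSlotsForUser;
      }

      // Account a scheduled task by its slot footprint, not as a single task.
      void taskScheduled(String user, int taskSlots) {
        occupiedSlots += taskSlots;
        Integer used = slotsPerUser.get(user);
        slotsPerUser.put(user, (used == null ? 0 : used.intValue()) + taskSlots);
      }
    }

The point is that a single task of a high memory job advances occupiedSlots and the per-user count by more than one, so it consumes queue capacity and user limit in proportion to the memory it locks up.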

> Capacity scheduler should account high memory jobs as using more capacity of the queue
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5884
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: HADOOP-5884-20090529.1.txt
>
>
> Currently, when a high memory job is scheduled by the capacity scheduler, each scheduled task counts only once against the capacity of the queue, even though its higher memory requirement may prevent other jobs from using spare slots on that node. To be fair, the capacity scheduler should account high memory jobs as using a proportionally larger share of the queue's capacity, relative to the default memory per slot.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.