Posted to common-dev@hadoop.apache.org by "Vinod K V (JIRA)" <ji...@apache.org> on 2009/06/02 13:52:07 UTC

[jira] Updated: (HADOOP-5884) Capacity scheduler should account high memory jobs as using more capacity of the queue

     [ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod K V updated HADOOP-5884:
------------------------------

    Attachment: HADOOP-5884-20090602.1.txt

Updated patch incorporating all the above review comments except one:
 - Removed running tasks information from the UI. For now, we are avoiding absolute numbers because the scheduler's information can be inconsistent with the cluster status. Reporting running tasks as a percentage of total cluster capacity no longer makes sense either, since each task may now occupy multiple slots. The correct fix is to print absolute numbers once any possible inconsistency has been removed, so I am pushing this to a follow-up jira issue.

@Arun
bq. Can we also add the number of slots to the UI?
I didn't get this. Do you mean displaying the number of slots per job in the job-scheduling information? We already display the number of slots used by a queue as a percentage.

If you meant the former, I did consider it, but left it for another jira. The job scheduling information is displayed on the first page of the jobtracker UI, and it looked ugly when it spanned multiple lines. I think it would be good to remove job scheduling information from the first page, but as that might trigger discussion, I've decided to leave it for now.

bq. Long term - we really should fix TestCapacityScheduler to not check strings and use relevant apis (even package-private ones).
Agreed. I felt the pain myself while modifying the test cases, but decided to postpone it to another jira as it is slightly tricky.


> Capacity scheduler should account high memory jobs as using more capacity of the queue
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5884
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: HADOOP-5884-20090529.1.txt, HADOOP-5884-20090602.1.txt
>
>
> Currently, when a high memory job is scheduled by the capacity scheduler, each task scheduled counts only once in the capacity of the queue, though it may actually be preventing other jobs from using spare slots on that node because of its higher memory requirements. In order to be fair, the capacity scheduler should proportionally (with respect to default memory) account high memory jobs as using a larger capacity of the queue.
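
The proportional accounting described above can be illustrated with a minimal sketch. This is not the code in the attached patch: the class and method names are hypothetical, and rounding a task's memory up to a whole number of slots is an assumption, since the description only says the accounting should be proportional to the default memory.

```java
// Hypothetical illustration only; none of these names come from the
// HADOOP-5884 patch or from the capacity scheduler's actual API.
public class SlotAccounting {

    // Slots charged to one task, assuming its memory is rounded up to a
    // whole multiple of the default per-slot memory (an assumption).
    static int slotsPerTask(long taskMemoryMB, long defaultSlotMemoryMB) {
        if (taskMemoryMB <= 0 || defaultSlotMemoryMB <= 0) {
            throw new IllegalArgumentException("memory sizes must be positive");
        }
        // Integer ceiling division.
        return (int) ((taskMemoryMB + defaultSlotMemoryMB - 1) / defaultSlotMemoryMB);
    }

    // Total slots a job's running tasks are charged against the queue.
    static int slotsOccupied(int runningTasks, long taskMemoryMB, long defaultSlotMemoryMB) {
        return runningTasks * slotsPerTask(taskMemoryMB, defaultSlotMemoryMB);
    }

    // Share of the queue's capacity in use, as a percentage of its slots.
    static double percentOfQueueCapacity(int slotsOccupied, int queueCapacitySlots) {
        return 100.0 * slotsOccupied / queueCapacitySlots;
    }

    public static void main(String[] args) {
        // 10 running tasks of 2048 MB each, with a 1024 MB default slot.
        int used = slotsOccupied(10, 2048, 1024);
        System.out.println(used + " slots, "
            + percentOfQueueCapacity(used, 100) + "% of a 100-slot queue");
    }
}
```

Under these assumptions, a task needing twice the default memory is charged two slots, so ten such tasks consume as much queue capacity as twenty default-sized tasks rather than ten.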

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.