Posted to yarn-issues@hadoop.apache.org by "Varun Vasudev (JIRA)" <ji...@apache.org> on 2015/04/07 16:56:12 UTC

[jira] [Updated] (YARN-3293) Track and display capacity scheduler health metrics in web UI

     [ https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Vasudev updated YARN-3293:
--------------------------------
    Attachment: apache-yarn-3293.3.patch

{quote}
General - it looks like the counters could overflow and produce negative values. Perhaps this could not happen within the lifetime of a cluster, but for a large, long-running cluster, is it a possibility/concern?
{quote}
The counters in SchedulerHealth are Long (64-bit), so overflow should not be a practical concern. The counters in AssignmentInformation (a new class I added) are reset every allocation cycle.
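
As a rough sanity check on the overflow question, the sketch below estimates how long a signed 64-bit counter takes to reach Long.MAX_VALUE at a constant rate. The class name, method name, and the rate of one million increments per second are assumptions chosen for illustration; none of them come from the patch.

```java
/** Back-of-the-envelope estimate only; not code from the patch. */
class CounterOverflowEstimate {
    // Seconds in a 365.25-day year.
    private static final double SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60;

    /** Years until a counter starting at zero reaches Long.MAX_VALUE at a constant rate. */
    static double yearsToOverflow(double incrementsPerSecond) {
        return Long.MAX_VALUE / incrementsPerSecond / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        // Hypothetical rate: one million counter increments per second.
        System.out.printf("%.0f years%n", yearsToOverflow(1_000_000.0));
    }
}
```

Even at that assumed rate the estimate comes out in the hundreds of thousands of years, far beyond the lifetime of any cluster.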

{quote}
This presently looks to be CapacityScheduler only. I had a suggestion below to make it slightly more general, and Vinod Kumar Vavilapalli also mentioned "not specific to scheduler". Perhaps it's fine to go CapacityScheduler only for the first iteration, but I wanted to verify (perhaps we need a follow-on JIRA for other schedulers).
{quote}
Yes. That's the plan - once it's in for CapacityScheduler, I'll file a ticket to add the information for FairScheduler and point to this one as an example of the stuff we added.

{quote}
on the web page
It's a nit, but I don't like the look of the / between the counter and the resource expression where that occurs. Maybe use - instead of / for those (allocations/reservations/releases)?
{quote}
Fixed.

{quote}
TestSchedulerHealth
Can we import Nodemanager and get rid of the package references in the code?
{quote}
Fixed.

{quote}
CapacitySchedulerHealthInfo
It looks like there is no need to keep a reference to the CapacityScheduler instance after construction. Can we drop it as a member?
{quote}
Fixed.
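
For illustration, the pattern the reviewer asked for can be sketched as below: the info object copies what it needs in the constructor and does not keep the scheduler as a field. HealthSource and HealthInfoSketch are hypothetical stand-ins, not the actual CapacityScheduler or CapacitySchedulerHealthInfo classes, and the single timestamp field is an assumption.

```java
/** Hypothetical stand-in for a scheduler that exposes one health value. */
interface HealthSource {
    long getLastRunTimestamp();
}

/** Hypothetical info object that keeps no reference to its source after construction. */
class HealthInfoSketch {
    private final long lastRunTimestamp;

    HealthInfoSketch(HealthSource source) {
        // Copy the value; the source itself is not stored as a member.
        this.lastRunTimestamp = source.getLastRunTimestamp();
    }

    long getLastRunTimestamp() {
        return lastRunTimestamp;
    }
}
```

Since the info object holds only copied values, it is a snapshot that does not change if the scheduler's state changes later.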

{quote}
It looks like the line changes in the info log are just whitespace. Can you drop them?
{quote}
Fixed.

{quote}
LeafQueue
L884 looks to be just whitespace, can you revert?
{quote}
Fixed.

{quote}
CSAssignment
I think there should be a new class, sharable between schedulers, which incorporates all the new assignment info, and that it should be a member of CSAssignment instead of adding all of the details directly to CSAssignment. You would still pack the info into CSAssignment (as an instance of that type), but it would now take a form that can be shared across schedulers.
{quote}
Fixed. I created a new class called AssignmentInformation which encapsulates everything.
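
To make the shape of that change concrete, here is a minimal sketch of the idea: a scheduler-agnostic holder for per-cycle counts, carried as a member of a scheduler-specific result. The class names, the Operation enum, and the methods are hypothetical and are not the actual AssignmentInformation or CSAssignment code from the patch.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical, scheduler-agnostic holder for per-cycle assignment counts. */
class AssignmentInfoSketch {
    enum Operation { ALLOCATION, RESERVATION, RELEASE }

    private final Map<Operation, Integer> counts = new EnumMap<>(Operation.class);

    AssignmentInfoSketch() {
        reset();
    }

    /** Records one occurrence of the given operation in the current cycle. */
    void increment(Operation op) {
        counts.merge(op, 1, Integer::sum);
    }

    int getCount(Operation op) {
        return counts.get(op);
    }

    /** Zeroes every counter; meant to be called at the start of each allocation cycle. */
    void reset() {
        for (Operation op : Operation.values()) {
            counts.put(op, 0);
        }
    }
}

/** Hypothetical scheduler-specific result that carries the shared info as a member. */
class AssignmentResultSketch {
    private final AssignmentInfoSketch info = new AssignmentInfoSketch();

    AssignmentInfoSketch getInfo() {
        return info;
    }
}
```

Because the counts live in their own type, another scheduler could embed the same holder in its own result object without depending on anything CapacityScheduler-specific.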

> Track and display capacity scheduler health metrics in web UI
> -------------------------------------------------------------
>
>                 Key: YARN-3293
>                 URL: https://issues.apache.org/jira/browse/YARN-3293
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>         Attachments: Screen Shot 2015-03-30 at 4.30.14 PM.png, apache-yarn-3293.0.patch, apache-yarn-3293.1.patch, apache-yarn-3293.2.patch, apache-yarn-3293.3.patch
>
>
> It would be good to display metrics in the web UI that let users know about the health of the capacity scheduler. Today it is hard to tell whether the capacity scheduler is functioning correctly. Useful metrics would include the time of the last allocation, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)