Posted to issues@aurora.apache.org by "David Robinson (JIRA)" <ji...@apache.org> on 2014/06/25 03:03:12 UTC

[jira] [Comment Edited] (AURORA-548) scheduler should always show tasks_lost_rack_XXX metrics

    [ https://issues.apache.org/jira/browse/AURORA-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042896#comment-14042896 ] 

David Robinson edited comment on AURORA-548 at 6/25/14 12:59 AM:
-----------------------------------------------------------------

Sure, but we only care about racks that have tasks, and the scheduler should always know which racks those are. If no tasks are scheduled on a rack, then there are no tasks to lose.
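
To illustrate the idea, here is a rough sketch of seeding a zero-valued counter the first time a task lands on a rack. The class and method names (RackLostTaskStats, task_scheduled, task_lost, export) are hypothetical and are not taken from the Aurora code base; this only shows the bookkeeping, not how the scheduler actually registers stats.

```python
class RackLostTaskStats:
    """Hypothetical per-rack lost-task counters (not Aurora's implementation)."""

    PREFIX = "tasks_lost_rack_"

    def __init__(self):
        # Maps rack name -> number of lost tasks seen on that rack.
        self._lost = {}

    def task_scheduled(self, rack):
        # Create the counter at zero as soon as the rack holds a task,
        # so the metric is exported before anything is lost.
        self._lost.setdefault(rack, 0)

    def task_lost(self, rack):
        # Increment the counter, creating it if the rack was never seen.
        self._lost[rack] = self._lost.get(rack, 0) + 1

    def export(self):
        # Produce metric name -> value pairs, one per known rack.
        return {self.PREFIX + rack: count
                for rack, count in sorted(self._lost.items())}


stats = RackLostTaskStats()
for rack in ("aaa", "aab", "aac"):
    stats.task_scheduled(rack)
for _ in range(3):
    stats.task_lost("aab")
exported = stats.export()
```

With this approach, racks aaa and aac show up with a value of 0 alongside aab, and a rack that never had a task scheduled on it is simply absent.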


was (Author: drobinson):
Sure, but we only need to care about racks that we're scheduler tasks on, which it should always know. If there are no tasks scheduled on a rack then there's no tasks to lose.

> scheduler should always show tasks_lost_rack_XXX metrics
> --------------------------------------------------------
>
>                 Key: AURORA-548
>                 URL: https://issues.apache.org/jira/browse/AURORA-548
>             Project: Aurora
>          Issue Type: Task
>          Components: Scheduler
>            Reporter: David Robinson
>
> The scheduler's /vars endpoint only exposes a tasks_lost_rack_XXX metric once tasks in that rack have been lost (that is, once the tasks_lost_rack_XXX key has a non-zero value). If no tasks in a rack have been lost, no metric for that rack is exposed. This makes the metrics difficult to use for alerting: it's impossible to tell whether a rack does not exist or exists but has had no lost tasks. Each rack should have an entry in /vars regardless of whether any tasks have been lost.
> Sample metrics:
> tasks_lost_rack_aab 3
> tasks_lost_rack_aae 4
> tasks_lost_rack_aah 2
> tasks_lost_rack_aai 3
> Expected metrics:
> tasks_lost_rack_aaa 0
> tasks_lost_rack_aab 3
> tasks_lost_rack_aac 0
> tasks_lost_rack_aad 0
> tasks_lost_rack_aae 4
> tasks_lost_rack_aaf 0
> tasks_lost_rack_aag 0
> tasks_lost_rack_aah 2
> tasks_lost_rack_aai 3
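
The alerting problem described above can be seen from the consumer side with a small sketch. It assumes the /vars output consists of "name value" lines like the samples in this issue; the helper names parse_vars and rack_status are made up for illustration.

```python
PREFIX = "tasks_lost_rack_"


def parse_vars(text):
    """Parse 'name value' lines into a dict of per-rack lost-task counts."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith(PREFIX):
            metrics[parts[0]] = int(parts[1])
    return metrics


def rack_status(metrics, rack):
    """Classify a rack based only on what the metrics expose."""
    key = PREFIX + rack
    if key not in metrics:
        # Missing key: the rack may not exist, or it may have lost nothing.
        return "unknown"
    return "ok" if metrics[key] == 0 else "lost=%d" % metrics[key]


# Metrics as exposed today: only racks with lost tasks appear.
sample = parse_vars("tasks_lost_rack_aab 3\ntasks_lost_rack_aae 4\n")
# Metrics as requested: every rack appears, zero included.
expected = parse_vars("tasks_lost_rack_aaa 0\ntasks_lost_rack_aab 3\n")

status_today = rack_status(sample, "aaa")
status_wanted = rack_status(expected, "aaa")
```

With the sample output, rack aaa can only be reported as "unknown"; with the expected output, the explicit zero lets an alert distinguish a healthy rack from a missing one.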


