Posted to dev@mesos.apache.org by "Timothy St. Clair (JIRA)" <ji...@apache.org> on 2014/03/06 15:18:44 UTC

[jira] [Commented] (MESOS-1036) Implement a library for exposing statistical metrics.

    [ https://issues.apache.org/jira/browse/MESOS-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922550#comment-13922550 ] 

Timothy St. Clair commented on MESOS-1036:
------------------------------------------

Is the idea here to eventually enable reporting internally, or to expose the metrics via a reporting plugin (say, Nagios)?

> Implement a library for exposing statistical metrics.
> -----------------------------------------------------
>
>                 Key: MESOS-1036
>                 URL: https://issues.apache.org/jira/browse/MESOS-1036
>             Project: Mesos
>          Issue Type: Improvement
>          Components: statistics
>            Reporter: Benjamin Mahler
>            Assignee: Dominic Hamon
>
> At the current time, reporting of statistical metrics is dedicated to specific endpoints for each component, primarily the following two:
> {noformat}
> /master/stats.json
> /slave/stats.json
> {noformat}
> Additional endpoints have not been added (for example, containerization statistics, allocator statistics, libprocess statistics) due to the inherent difficulty involved: one must either expose this data up to these higher level endpoints, or add a new endpoint for exposing the component specific statistics.
> This is why the {{Statistics}} class in libprocess was created; however, it is not being used for any statistical reporting at the current time.
> [~benjaminhindman] and I had white-boarded the kinds of abstractions we wanted to build to make statistical reporting trivial from anywhere in the code:
> Create the notion of a {{Statistic}} or {{Metric}} object that can be directly manipulated to store statistics, for example:
> {code}
> // In the Registrar initialization:
> Metric storage_latency = statistics.create("registrar", "storage_latency");
> // Recording an individual storage latency.
> storage_latency.set(latency);
> {code}
> In addition to this, we wanted the notion of a {{Meter}}, which automatically exposes a metered version of a statistic, for example:
> {code}
> Metric storage_latency = statistics.create("registrar", "storage_latency");
> // Adds "storage_latency_average" which computes average over the window.
> statistics.meter(storage_latency, Average());
> // Adds a "storage_latency_p99", percentile is a non-trivial implementation.
> statistics.meter(storage_latency, Percentile(99));
> // Adds a "storage_latency_maximum"
> statistics.meter(storage_latency, Maximum());
> {code}
> Of course, I'm not advocating a particular API in the above examples; I'm just laying out the types of things we wanted to see available.
> As we add these types of abstractions, we will want to avoid storing large time series data in memory as is currently done in {{Statistics}}. There are a number of things to consider with respect to the windowing technique, but I think the notion of a window should transition from "amount of history to be kept" to "a statistical rolling window". For example, when computing an average, you would most likely want a rolling 1 minute average, as opposed to the average for a 2 week window.
> Efficiency of this library will be important to avoid high RSS overhead.
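
To make the "statistical rolling window" idea in the description concrete, here is a minimal sketch of a window that keeps only the samples recorded within the last N seconds, so memory stays bounded by the sample rate times the window length. It is an illustration under stated assumptions, not the Mesos or libprocess API: the RollingWindow class, its method names, and the use of plain doubles for timestamps are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical sketch: RollingWindow and its methods are NOT Mesos or
// libprocess APIs. Timestamps are seconds and must be non-decreasing.
class RollingWindow {
public:
  explicit RollingWindow(double windowSeconds) : window_(windowSeconds) {
    assert(windowSeconds > 0.0);
  }

  // Records one sample taken at time `now`.
  void set(double now, double value) {
    evict(now);
    samples_.emplace_back(now, value);
  }

  // Average over samples in (now - window, now]; 0.0 if none remain.
  double average(double now) {
    evict(now);
    if (samples_.empty()) return 0.0;
    double sum = 0.0;
    for (const auto& sample : samples_) sum += sample.second;
    return sum / static_cast<double>(samples_.size());
  }

  // Maximum over samples in (now - window, now]; 0.0 if none remain.
  double maximum(double now) {
    evict(now);
    if (samples_.empty()) return 0.0;
    double best = samples_.front().second;
    for (const auto& sample : samples_) {
      if (sample.second > best) best = sample.second;
    }
    return best;
  }

  // Number of samples currently held in memory.
  std::size_t size() const { return samples_.size(); }

private:
  // Drops every sample that is at least one full window older than `now`.
  void evict(double now) {
    while (!samples_.empty() && samples_.front().first <= now - window_) {
      samples_.pop_front();
    }
  }

  double window_;
  std::deque<std::pair<double, double>> samples_;  // (timestamp, value)
};
```

With a 60 second window, samples recorded at t=0, t=30 and t=70 leave only the last two in memory once the third arrives. A percentile meter would need more than this (for example a sorted structure or an approximate sketch), which is consistent with the description calling it non-trivial.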


