Posted to commits@samza.apache.org by "Shadi A. Noghabi (JIRA)" <ji...@apache.org> on 2016/03/24 21:40:25 UTC

[jira] [Updated] (SAMZA-912) Getting a cumulative distribution function of the processes latency as a metric

     [ https://issues.apache.org/jira/browse/SAMZA-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shadi A. Noghabi updated SAMZA-912:
-----------------------------------
    Attachment: add-processNS-CDF.patch

> Getting a cumulative distribution function of the processes latency  as a metric
> --------------------------------------------------------------------------------
>
>                 Key: SAMZA-912
>                 URL: https://issues.apache.org/jira/browse/SAMZA-912
>             Project: Samza
>          Issue Type: New Feature
>            Reporter: Shadi A. Noghabi
>            Priority: Minor
>         Attachments: add-processNS-CDF.patch
>
>
> The processNs metric reports the average of the processNs values over a period of time. However, there are some outliers that heavily skew this average: the tail and the 99th percentile are very large and pull the average up. For example, in one of my tests processNs was reporting 300 us; however, when I looked at the data up to the 95th percentile (i.e., removed all the data points above the 95th percentile), the average was 7 us.
> I suggest also reporting the cumulative distribution function (CDF) of the processNs values, for cases where one needs to dig deeper into the detailed distribution of the latencies.
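
To illustrate the effect described in the issue, here is a minimal Python sketch. It is not taken from the attached patch and does not use any Samza API. The helper names (empirical_cdf, mean_up_to_percentile) and the latency values are hypothetical; the data is made up so that the overall mean and the mean up to the 95th percentile come out near the 300 us and 7 us figures mentioned above.

```python
import math
from bisect import bisect_right


def empirical_cdf(samples, thresholds):
    """Return (threshold, fraction of samples <= threshold) pairs."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one sample")
    n = len(ordered)
    return [(t, bisect_right(ordered, t) / n) for t in thresholds]


def mean_up_to_percentile(samples, pct):
    """Mean of the smallest pct percent of samples (nearest-rank cut)."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("need at least one sample")
    keep = max(1, math.ceil(len(ordered) * pct / 100))
    kept = ordered[:keep]
    return sum(kept) / len(kept)


# Made-up latencies in microseconds: 95 fast calls and 5 slow outliers.
latencies_us = [7] * 95 + [5867] * 5

overall_mean = sum(latencies_us) / len(latencies_us)
trimmed_mean = mean_up_to_percentile(latencies_us, 95)
cdf = empirical_cdf(latencies_us, [7, 100, 5867])

print("overall mean (us):", overall_mean)
print("mean up to p95 (us):", trimmed_mean)
for threshold, fraction in cdf:
    print(f"P(latency <= {threshold} us) = {fraction:.2f}")
```

With this made-up data, a single average hides the fact that 95% of the calls finish within 7 us, while the CDF makes the split between the fast majority and the slow tail visible.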



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)