Posted to user@aurora.apache.org by "De, Bipra" <bi...@paypal.com> on 2017/08/22 22:23:19 UTC

How to fetch historical data for Aurora SLA metrics like MTTA, MTTS and MTTR?

Hello Friends,

Greetings!!

We are currently using Aurora 0.17.0 and have a use case in which we want to continuously monitor the following SLA metrics for our clusters to detect anomalies:

  *   Median Time To Assigned (MTTA): http://aurora.apache.org/documentation/latest/features/sla-metrics/#median-time-to-assigned-(mtta)
  *   Median Time To Starting (MTTS): http://aurora.apache.org/documentation/latest/features/sla-metrics/#median-time-to-starting-(mtts)
  *   Median Time To Running (MTTR): http://aurora.apache.org/documentation/latest/features/sla-metrics/#median-time-to-running-(mttr)

Currently, our sla_stat_refresh_interval is set to the default of 1 minute.

Now, when we use the /vars API endpoint to fetch the SLA metrics, Aurora calculates these metrics only from data sampled over the last minute, refreshing them once per minute. It does not give us historical data for these metrics.

Does Aurora expose any API endpoint that provides historical data for these metrics over a configurable period of time? Is there any metric in the /graphview endpoint for this?

Also, it would be great if anyone could suggest ideas for monitoring these metrics. At present, I am planning to poll the /vars endpoint regularly for data collection and use the ELK stack for graphing and alerting.

Thanks in advance for your time!

Regards,
Bipra.
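
For illustration, the polling plan described above could start from a parser like the following. This is only a minimal sketch: it assumes that /vars returns plain text with one "name value" pair per line, and the metric names (sla_cluster_mtta_ms and so on) are assumptions modelled on the SLA documentation linked above, so check them against your own scheduler's output. The sample text is made up.

```python
# Suffixes of the three SLA metrics discussed in this thread (assumed naming).
SLA_SUFFIXES = ("_mtta_ms", "_mtts_ms", "_mttr_ms")


def parse_vars(text):
    """Parse 'name value' lines into a dict, skipping non-numeric values."""
    metrics = {}
    for line in text.splitlines():
        name, _, raw = line.strip().partition(" ")
        if not name or not raw:
            continue  # blank or malformed line
        try:
            metrics[name] = float(raw)
        except ValueError:
            continue  # string-valued stats such as build info
    return metrics


def select_sla(metrics):
    """Keep only metrics that look like MTTA/MTTS/MTTR values."""
    return {
        name: value
        for name, value in metrics.items()
        if name.startswith("sla_") and name.endswith(SLA_SUFFIXES)
    }


# Made-up sample of what a /vars response might contain.
SAMPLE = """\
sla_cluster_mtta_ms 1200
sla_cluster_mtts_ms 3400.5
sla_cluster_mttr_ms 5100
jvm_uptime_secs 86400
build_git_revision abc123
"""

sla = select_sla(parse_vars(SAMPLE))
for name, value in sorted(sla.items()):
    print(name, value)
```

Each poll yields only the current values, so the history has to be built on the collecting side by storing every snapshot with its timestamp.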

Re: How to fetch historical data for Aurora SLA metrics like MTTA, MTTS and MTTR?

Posted by David McLaughlin <dm...@apache.org>.
We do the same thing at Twitter: we have a local agent that polls the
metrics endpoint and sends the metrics to our internal time-series database.

Cheers,
David
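
A polling agent of the kind described in this reply might look roughly like the sketch below. It is not the agent used at Twitter. The function names, the flat name-to-value JSON shape of the response, and the metric name are all assumptions for illustration. The sink is any callable that forwards a (timestamp, name, value) sample to a time-series store; here it just appends to a list, and canned snapshots stand in for a live scheduler.

```python
import json
import time
import urllib.request


def fetch_vars_json(url, timeout=5.0):
    """Fetch one metrics snapshot (assumes a flat JSON object of name -> value)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def poll(fetch, sink, wanted, interval_s=60.0, iterations=None,
         clock=time.time, sleep=time.sleep):
    """Call fetch() repeatedly and pass timestamped samples to sink."""
    done = 0
    while iterations is None or done < iterations:
        try:
            snapshot = fetch()
        except OSError:
            snapshot = {}  # network error: skip this round, keep polling
        now = clock()
        for name in wanted:
            if name in snapshot:
                sink((now, name, snapshot[name]))
        done += 1
        if iterations is None or done < iterations:
            sleep(interval_s)


# Offline demonstration with canned snapshots and a fake clock.
_canned = iter([
    {"sla_cluster_mtta_ms": 1200, "jvm_uptime_secs": 10},
    {"sla_cluster_mtta_ms": 1350, "jvm_uptime_secs": 70},
])
history = []
poll(fetch=lambda: next(_canned), sink=history.append,
     wanted=["sla_cluster_mtta_ms"], iterations=2,
     clock=iter([0.0, 60.0]).__next__, sleep=lambda seconds: None)
print(history)
```

Against a real scheduler, fetch would be something like lambda: fetch_vars_json(url), with url pointing at your own scheduler's /vars.json endpoint, and iterations left as None so that the loop runs until the process is stopped.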

Re: How to fetch historical data for Aurora SLA metrics like MTTA, MTTS and MTTR?

Posted by Derek Slager <de...@gmail.com>.
Our approach is similar. We poll /vars.json on an interval and send a
selection of those metrics to Riemann. We configure alerts there, and also
pass these metrics through to InfluxDB for historical reporting (mostly via
Grafana dashboards). This has worked well for us.

--
Derek Slager
CTO
Amperity
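
As a rough sketch of the "select some metrics and keep them for historical reporting" step, the snippet below picks a few values out of a snapshot and formats them as InfluxDB line protocol. It does not cover the Riemann part of the setup described above and is not Amperity's implementation. The metric names, the cluster tag, the choice of one measurement per metric with a single "value" field, and the timestamp are all assumptions for illustration.

```python
def _escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line(measurement, tags, value, ts_ns):
    """Format one point as: measurement,tag=val value=<float> <ns timestamp>."""
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")
    return f"{head} value={float(value)!r} {int(ts_ns)}"


def select(snapshot, wanted):
    """Keep only the wanted metrics that are present in the snapshot."""
    return {name: snapshot[name] for name in wanted if name in snapshot}


# Made-up snapshot, as if decoded from a /vars.json response.
snapshot = {
    "sla_cluster_mtta_ms": 1200,
    "sla_cluster_mttr_ms": 5100.5,
    "jvm_uptime_secs": 86400,
}
wanted = ["sla_cluster_mtta_ms", "sla_cluster_mttr_ms"]

lines = [
    to_line(name, {"cluster": "dev east"}, value, 1503440599000000000)
    for name, value in select(snapshot, wanted).items()
]
print("\n".join(lines))
```

The resulting lines can then be written to the database in a batch by whichever client you already use. Tagging every point with the cluster name makes it possible to compare clusters on one dashboard.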
