Posted to user@aurora.apache.org by "De, Bipra" <bi...@paypal.com> on 2018/02/01 23:15:27 UTC

Detecting Flapping Tasks in Aurora

Hello Everyone,

I am working on an alert system that will call Aurora APIs to detect jobs that have flapping tasks. It runs every hour.

Any suggestions on how to detect jobs with flapping tasks, where the flapping tasks were submitted to Aurora as part of the same request? The goal is to filter out cases where a user submits a job multiple times and it fails each time.

Also, do we have a telegraf plugin for Aurora?

Regards,
Bipra.

Re: Detecting Flapping Tasks in Aurora

Posted by Meghdoot bhattacharya <me...@yahoo.com>.
Nice.

> On Feb 2, 2018, at 7:12 AM, Mauricio Garavaglia <ma...@gmail.com> wrote:
>
>> On Thu, Feb 1, 2018 at 8:15 PM, De, Bipra <bi...@paypal.com> wrote:
>>
>> Also, do we have a telegraf plugin for Aurora?
>
> You can use https://github.com/medallia/telegraf/blob/master/plugins/inputs/aurora/aurora.go as a base to export the metrics you are interested in.

Re: Detecting Flapping Tasks in Aurora

Posted by Mauricio Garavaglia <ma...@gmail.com>.
On Thu, Feb 1, 2018 at 8:15 PM, De, Bipra <bi...@paypal.com> wrote:

> Hello Everyone,
>
>
>
> I am working on an alert system that will call Aurora APIs to detect jobs
> that have flapping tasks. It runs every hour.
>
>
>
> Any suggestions on how to detect such jobs that have tasks flapping,
> provided those tasks were submitted to aurora as part of the same request.
> This is to filter out the cases where a user tries to submit a job multiple
> times but each time it failed.
>
>
>
> Also, do we have a telegraf plugin for Aurora?
>

You can use
https://github.com/medallia/telegraf/blob/master/plugins/inputs/aurora/aurora.go
as a base to export the metrics you are interested in.


Re: Detecting Flapping Tasks in Aurora

Posted by Bill Farner <wf...@apache.org>.
You could scan for tasks that are in, or have been in, the THROTTLED
<https://github.com/apache/aurora/blob/c746452e5fa1bc49da701b059fe69898b9b8a15c/docs/reference/task-lifecycle.md#natural-termination-finished-failed>
state.  You can adjust the time intervals for throttled tasks with these
scheduler args
<https://github.com/apache/aurora/blob/80139da4624916e406c7e80c4ea2d286d4d859c3/src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java#L48-L60>.
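
As a rough illustration of the scan suggested above, here is a minimal Python
sketch. It does not call Aurora itself. It assumes you have already fetched
task records from the scheduler and converted them into plain dicts; the
field names ("job", "instance", "status", "events", "timestamp_ms"), the
function name, and the sample data are all hypothetical and only stand in
for whatever your client returns.

```python
from collections import defaultdict

# Made-up sample records. The dict shape is an assumption for this sketch,
# not Aurora's actual API schema.
SAMPLE_TASKS = [
    {"job": "www/prod/web", "instance": 0, "status": "RUNNING",
     "events": [{"status": "THROTTLED", "timestamp_ms": 2000},
                {"status": "PENDING", "timestamp_ms": 2500},
                {"status": "RUNNING", "timestamp_ms": 3000}]},
    {"job": "www/prod/web", "instance": 1, "status": "RUNNING",
     "events": [{"status": "PENDING", "timestamp_ms": 1000},
                {"status": "RUNNING", "timestamp_ms": 1200}]},
    {"job": "www/prod/web", "instance": 2, "status": "THROTTLED",
     "events": [{"status": "THROTTLED", "timestamp_ms": 3500}]},
    {"job": "www/prod/batch", "instance": 0, "status": "RUNNING",
     "events": [{"status": "THROTTLED", "timestamp_ms": 500},
                {"status": "PENDING", "timestamp_ms": 600},
                {"status": "RUNNING", "timestamp_ms": 700}]},
]


def jobs_with_throttled_tasks(tasks, since_ms):
    """Group instances by job when the task is THROTTLED right now, or
    recorded a THROTTLED event at or after since_ms."""
    flapping = defaultdict(set)
    for task in tasks:
        throttled_now = task["status"] == "THROTTLED"
        throttled_recently = any(
            event["status"] == "THROTTLED" and event["timestamp_ms"] >= since_ms
            for event in task.get("events", [])
        )
        if throttled_now or throttled_recently:
            flapping[task["job"]].add(task["instance"])
    # Sorted lists make the result stable and easy to put in an alert.
    return {job: sorted(instances) for job, instances in flapping.items()}


if __name__ == "__main__":
    print(jobs_with_throttled_tasks(SAMPLE_TASKS, since_ms=1500))
```

For an hourly alert, since_ms would be the timestamp of the previous run, so
that older THROTTLED events (like the batch job's in the sample) are ignored.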

> Also, do we have a telegraf plugin for Aurora?

Not that I'm aware of.  Let me know if you need any pointers on how stats
are exported from Aurora to do this.
