Posted to user@flink.apache.org by Soumya Simanta <so...@gmail.com> on 2016/04/13 04:29:56 UTC

Monitoring and alerting mechanisms for Flink on YARN

We are about to deploy a Flink job on YARN in production. Given that it is
a long-running process, we want to have alerting and monitoring mechanisms
in place.

Any existing solutions or suggestions for implementing a new one would be
appreciated.

Thanks!

Re: Monitoring and alerting mechanisms for Flink on YARN

Posted by Stephan Ewen <se...@apache.org>.
There is also an ongoing effort to create and expose more metrics via
JMX.

Part of that is covered in the JIRA issue below, and an additional
proposal and design will be published in the next few days.
https://issues.apache.org/jira/browse/FLINK-1502

On Fri, Apr 15, 2016 at 11:04 AM, Flavio Pompermaier <po...@okkam.it>
wrote:

> Very interesting! Could you please provide more details about its usage in
> your deployment?
>
> Thanks,
> Flavio

Re: Monitoring and alerting mechanisms for Flink on YARN

Posted by Flavio Pompermaier <po...@okkam.it>.
Very interesting! Could you please provide more details about its usage in
your deployment?

Thanks,
Flavio

On Thu, Apr 14, 2016 at 11:25 PM, Christian Kreutzfeldt <mn...@gmail.com>
wrote:

> Hi Soumya,
>
> we are using a StatsD / Graphite setup to extract metrics from our running
> Flink applications. At least for alerting and monitoring based on time
> series it works perfectly well. Just take a look at
> https://github.com/tim-group/java-statsd-client which is widely deployed
> in our source code.
>
> Best
>   Christian Kreutzfeldt

Re: Monitoring and alerting mechanisms for Flink on YARN

Posted by Christian Kreutzfeldt <mn...@gmail.com>.
Hi Soumya,

We are using a StatsD / Graphite setup to extract metrics from our running
Flink applications. For alerting and monitoring based on time series, at
least, it works very well. Take a look at
https://github.com/tim-group/java-statsd-client, which we use widely in
our source code.

Best
  Christian Kreutzfeldt
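
For readers unfamiliar with StatsD, here is a minimal sketch of what a
StatsD client puts on the wire. It is an illustration only, not the code
from the deployment described above and not the API of the Java client
linked there. It assumes the plain StatsD line format
(<name>:<value>|<type>) sent over UDP to a daemon on the conventional port
8125; the metric names, host, and helper functions are hypothetical.

```python
import socket


def statsd_line(name: str, value, metric_type: str) -> str:
    """Format one StatsD datagram payload: <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def counter(name: str, delta: int = 1) -> str:
    """Counter: the daemon sums the deltas it receives per flush interval."""
    return statsd_line(name, delta, "c")


def gauge(name: str, value) -> str:
    """Gauge: the daemon keeps the last value it received."""
    return statsd_line(name, value, "g")


def timing(name: str, millis: int) -> str:
    """Timer: a duration in milliseconds."""
    return statsd_line(name, millis, "ms")


def send(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send (host and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, chosen only for this example.
    send(counter("flink.myjob.records_processed", 100))
    send(gauge("flink.myjob.queue_depth", 42))
    send(timing("flink.myjob.window_latency", 250))
```

Because the transport is UDP, a send does not block or fail the job when
the daemon is unreachable, which is one reason this pattern is attractive
inside a long-running streaming job. In a typical setup the StatsD daemon
aggregates these lines and forwards them to Graphite, where alert
thresholds can be defined on the resulting time series.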

2016-04-13 4:29 GMT+02:00 Soumya Simanta <so...@gmail.com>:

> We are about to deploy a Flink job on YARN in production. Given that it is
> a long-running process, we want to have alerting and monitoring mechanisms
> in place.
>
> Any existing solutions or suggestions for implementing a new one would be
> appreciated.
>
> Thanks!
>