Posted to user@flume.apache.org by Suresh V <ve...@gmail.com> on 2017/02/27 03:27:35 UTC

Alerts when Flume agent fails

Hello,

Is there a way to set up an email alert that is sent as soon as a Flume
agent fails for any reason?

At the moment, we have scripts sending the port 41414 JSON metrics by email
every hour, but it would be good to know as soon as an agent fails.

Appreciate any help.

Thank you
Suresh.

Re: Alerts when Flume agent fails

Posted by Denes Arvay <de...@cloudera.com>.
Hi Suresh,

Sink:
- BatchCompleteCount
Number of batches processed in which the number of events reached the
configured batch size (a "complete" batch).

- BatchUnderflowCount
Number of batches processed in which the number of events was less than the
configured maximum batch size. This can happen when the channel becomes
empty: the events taken so far are flushed (and the transaction is
committed) even though the batch size has not been reached.
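
To make the difference between the two sink counters concrete, here is a
small Python model of how each batch would be counted. It is only an
illustration of the definitions above, not Flume's actual sink code: the
function name classify_batches, the list-of-counts input, and the choice to
ignore takes that return no events are all assumptions made for the sketch.

```python
def classify_batches(events_per_take, batch_size):
    """Count complete and underflow batches for a series of channel takes.

    events_per_take: how many events the sink obtained in each transaction.
    batch_size: the sink's configured (maximum) batch size.
    """
    counters = {"BatchCompleteCount": 0, "BatchUnderflowCount": 0}
    for taken in events_per_take:
        if taken >= batch_size:
            # The batch filled up before the channel ran dry.
            counters["BatchCompleteCount"] += 1
        elif taken > 0:
            # The channel became empty first; a partial batch was flushed.
            counters["BatchUnderflowCount"] += 1
        # taken == 0: nothing to flush; this sketch simply ignores it.
    return counters


print(classify_batches([100, 100, 37, 0, 100], batch_size=100))
```

A BatchUnderflowCount that grows steadily usually just means the sink is
keeping up with the channel, since it keeps draining it before a full batch
accumulates.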

Source:
- AppendBatchReceivedCount and AppendBatchAcceptedCount
If the source supports batching, these track the number of batches received
and processed, respectively. ("Processed" means the events were forwarded to
the channel via the ChannelProcessor.)

- AppendReceivedCount and AppendAcceptedCount
The number of single (not batched) events received and processed,
respectively.

Please keep in mind that not all of the components support all the counters.
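
Because a component may not report every counter, a script that reads these
values should tolerate missing keys. The Python sketch below computes the gap
between received and accepted counts per source. The SAMPLE document is made
up for illustration (component names and numbers are hypothetical); it
assumes the usual shape of the JSON output, where each component is keyed as
"TYPE.name" and counter values arrive as strings. The helper name
source_backlog is also an assumption.

```python
import json

# Hypothetical sample of the metrics JSON; values are illustrative only.
SAMPLE = """{
  "SOURCE.src-1": {"Type": "SOURCE",
                   "AppendBatchReceivedCount": "120",
                   "AppendBatchAcceptedCount": "118"},
  "SINK.sink-1": {"Type": "SINK",
                  "BatchCompleteCount": "90",
                  "BatchUnderflowCount": "12"}
}"""


def source_backlog(metrics):
    """Return {source: (batches_not_accepted, events_not_accepted)}."""
    result = {}
    for name, values in metrics.items():
        if values.get("Type") != "SOURCE":
            continue

        def gap(received_key, accepted_key):
            # Counters are strings; a counter the source lacks counts as 0.
            received = int(values.get(received_key, 0))
            accepted = int(values.get(accepted_key, 0))
            return received - accepted

        result[name] = (
            gap("AppendBatchReceivedCount", "AppendBatchAcceptedCount"),
            gap("AppendReceivedCount", "AppendAcceptedCount"),
        )
    return result


print(source_backlog(json.loads(SAMPLE)))
```

A gap that keeps growing between polls suggests the source is receiving data
that it cannot hand over to the channel.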

Regards,
Denes




Re: Alerts when Flume agent fails

Posted by Suresh V <ve...@gmail.com>.
Thank you, Iain. I'm looking for an explanation of what the metrics below mean:

Sink:
 BatchCompleteCount
 BatchUnderflowCount

Source:
 AppendBatchAcceptedCount
 AppendReceivedCount
 AppendAcceptedCount
 AppendBatchReceivedCount

-Suresh.



Re: Alerts when Flume agent fails

Posted by iain wright <ia...@gmail.com>.
Polling the metrics endpoint every 60s is probably the best option: alert
when there has been no data for more than N minutes, or on any non-200 HTTP
response.
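
A minimal sketch of such a poller in Python, using only the standard library.
The URL http://localhost:41414/metrics is an assumption (the port and path
come from this thread, the host is a placeholder), and check_agent,
poll_forever, and the alert callback are hypothetical names. The alert
callback is whatever delivers the notification, for example a function that
sends mail.

```python
import json
import time
import urllib.error
import urllib.request


def check_agent(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return (ok, detail); ok is False on any error or non-200 status."""
    try:
        with opener(url, timeout=timeout) as resp:
            status = getattr(resp, "status", None)
            if status != 200:
                return False, f"unexpected HTTP status {status}"
            metrics = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        # Covers refused connections, timeouts, HTTP errors, and bad JSON.
        return False, f"metrics endpoint unreachable or invalid: {exc}"
    if not metrics:
        return False, "endpoint returned no metrics"
    return True, f"{len(metrics)} components reporting"


def poll_forever(url, alert, interval=60, max_failures=3):
    """Poll every `interval` seconds; alert once after N straight failures."""
    failures = 0
    while True:
        ok, detail = check_agent(url)
        failures = 0 if ok else failures + 1
        if failures == max_failures:
            alert(f"Flume agent check failed {failures} times: {detail}")
        time.sleep(interval)
```

Calling poll_forever("http://localhost:41414/metrics", alert=print) would
print a message after three consecutive failed checks. Requiring several
failures in a row trades a few minutes of delay for fewer false alarms.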

Alternatively, you could use something like monit <https://mmonit.com/monit/>
to check that the process is running, but this won't catch a Flume agent
that has hit an OOM and is still alive. For that case you'd need to add
-XX:OnOutOfMemoryError="kill -9 %p" to the JVM options, to make sure the
monitored process dies when the JVM encounters an OOM.
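
For comparison, the kind of "is the process running" check that a tool like
monit performs can be approximated in Python as below. This is a rough
sketch, not a replacement for monit: it assumes a Unix system and that the
agent's PID is written to a pid file by whatever starts it (the pid file and
the function names are hypothetical). Like any process-level check, it cannot
tell a healthy agent from one that is alive but stuck after an OOM.

```python
import os


def process_alive(pid):
    """Return True if a process with this PID exists (Unix only)."""
    if pid <= 0:
        return False
    try:
        # Signal 0 sends nothing; it only checks that the PID can be signalled.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def agent_running(pid_file):
    """Read a PID from pid_file and report whether that process exists."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        # Missing or unreadable pid file: treat the agent as not running.
        return False
    return process_alive(pid)
```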

With metrics polling you also get the benefit of being able to detect
pressure or problems before they turn into larger ones (e.g. the channel
size increasing over N minutes while the sink's success count is not
changing). I don't remember the exact names of the metrics; it's been a
while.
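
That check could look roughly like the following Python sketch, which takes a
list of already-parsed metrics responses, oldest first. The counter names
ChannelSize and EventDrainSuccessCount are assumptions about what the agent
reports, as are the function name and the component keys; confirm all of
them against the actual /metrics output before relying on this.

```python
def channel_stalled(samples, channel, sink):
    """Return True if the channel only grew while the sink drained nothing.

    samples: parsed metrics dicts from successive polls, oldest first.
    channel, sink: component keys as they appear in the JSON,
                   for example "CHANNEL.ch-1" and "SINK.sink-1".
    """
    if len(samples) < 2:
        return False  # not enough history to see a trend
    sizes = [float(s[channel]["ChannelSize"]) for s in samples]
    drained = [float(s[sink]["EventDrainSuccessCount"]) for s in samples]
    # Strictly increasing channel size across every consecutive pair of polls.
    growing = all(later > earlier for earlier, later in zip(sizes, sizes[1:]))
    # The sink's success counter did not move over the whole window.
    sink_flat = drained[-1] == drained[0]
    return growing and sink_flat
```

With one sample per minute, passing the last N samples gives the "over N
minutes" window described above.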

The metric keys seemed to explain themselves well enough when I was using
this in the past. Are there any specific keys in the response from /metrics
you don't understand?

-- 
Iain Wright

This email message is confidential, intended only for the recipient(s)
named above and may contain information that is privileged, exempt from
disclosure under applicable law. If you are not the intended recipient, do
not disclose or disseminate the message to anyone except the intended
recipient. If you have received this message in error, or are not the named
recipient(s), please immediately notify the sender by return email, and
delete all copies of this message.


Re: Alerts when Flume agent fails

Posted by Suresh V <ve...@gmail.com>.
Thank you.

Additionally, where can I find details about each metric in the JSON output
on port 41414? I could not find a detailed description of each metric, and
what it means, in the user guide.

Thank you
Suresh.



Re: Alerts when Flume agent fails

Posted by Sharninder Khera <sh...@gmail.com>.
Set up scripts to send alerts sooner? There isn't a built-in way in Flume, so you will have to set up monitoring separately.





