flume-user mailing list archives

From iain wright <iainw...@gmail.com>
Subject Re: Alerts when Flume agent fails
Date Mon, 27 Feb 2017 04:40:43 GMT
Polling the metrics endpoint every 60s is probably the best option: alert
on no data for > N minutes, or on any non-200 HTTP response.
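
A rough sketch of that polling check in Python, assuming the agent serves
JSON at something like http://localhost:41414/metrics (the port is the one
mentioned in this thread; the host, the function names and the timeout are
placeholders of mine, not anything Flume ships):

```python
import http.client
import json
import urllib.error
import urllib.request


def poll_metrics(url, timeout=5.0):
    """Return the parsed metrics JSON, or None on any failure or non-200 reply."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return None
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, bad URL or bad JSON:
        # all of these count as "no data" for alerting purposes.
        return None


def should_alert(last_success_ts, now_ts, max_silence_s):
    """True when the last successful poll is more than max_silence_s seconds old."""
    return (now_ts - last_success_ts) > max_silence_s
```

Run poll_metrics from cron or a loop every 60s, remember the time of the
last non-None result, and send the email once should_alert returns True.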

Alternatively, you could use something like monit <https://mmonit.com/monit/>
to check that the process is running, but this won't catch a Flume agent
that has hit an OOM. For that case you'd need to add
-XX:OnOutOfMemoryError="kill -9 %p" to the JVM options, to make sure the
monitored process dies when the JVM encounters an OOM.

With metrics polling you get the added benefit of being able to detect
pressure or problems before they bubble up into larger ones (e.g. the
channel size increasing over N minutes while the successful sink count is
not changing). I don't remember the exact names of the metrics; it's been
a while.
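
To make that concrete, here is a hedged sketch of the "channel growing,
sink stalled" check over a series of polled samples. The component keys
("CHANNEL.c1", "SINK.k1") and the metric names ("ChannelSize",
"EventDrainSuccessCount") are assumptions on my part, so check them against
what your own /metrics response actually returns:

```python
def backlog_suspected(samples, channel_key, sink_key,
                      size_metric="ChannelSize",
                      drain_metric="EventDrainSuccessCount"):
    """samples: parsed /metrics dicts, oldest first.

    Returns True when the channel size rose on every poll while the sink's
    success counter did not move at all. Metric names are assumed, not
    confirmed by this thread.
    """
    if len(samples) < 2:
        return False  # need at least two polls to see a trend
    # Values are converted explicitly in case the JSON reports them as strings.
    sizes = [float(s[channel_key][size_metric]) for s in samples]
    drained = [float(s[sink_key][drain_metric]) for s in samples]
    growing = all(later > earlier for earlier, later in zip(sizes, sizes[1:]))
    stalled = drained[-1] == drained[0]
    return growing and stalled
```

Keep the last N minutes of samples in a list and call this after each poll;
a True result is worth an alert even though the agent is still up.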

The metric keys seemed self-explanatory enough when I was using this in
the past. Are there any specific keys in the response from /metrics you
don't understand?

-- 
Iain Wright

On Sun, Feb 26, 2017 at 7:37 PM, Suresh V <verditer@gmail.com> wrote:

> Thank you.
>
> Additionally, where can I find details about each metric in the JSON
> output on port 41414? I could not find a detailed description of each
> metric and what it means in the user guide.
>
> Thank you
> Suresh.
>
>
> On Sun, Feb 26, 2017 at 9:33 PM, Sharninder Khera <sharninder@gmail.com>
> wrote:
>
>> Set up scripts to send alerts sooner? There isn't a built-in way in
>> Flume, so you will have to set up monitoring separately.
>>
>>
>>
>>
>>
>> On Mon, Feb 27, 2017 at 8:57 AM +0530, "Suresh V" <verditer@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> Is there a way to set up an alert mechanism by email immediately when a
>>> flume agent fails due to any reason?
>>>
>>> At the moment, we have scripts sending the port 41414 JSON metrics by
>>> email every hour, but it would be good to know as soon as an agent fails.
>>>
>>> Appreciate any help.
>>>
>>> Thank you
>>> Suresh.
>>>
>>>
>
