flume-user mailing list archives

From Suresh V <verdi...@gmail.com>
Subject Re: Alerts when Flume agent fails
Date Mon, 27 Feb 2017 12:35:40 GMT
Thank you, Iain. I'm looking for an explanation of what the metrics below mean:

Sink:
 BatchCompleteCount
 BatchUnderflowCount

Source:
 AppendBatchAcceptedCount
 AppendReceivedCount
 AppendAcceptedCount
 AppendBatchReceivedCount

-Suresh.


On Sun, Feb 26, 2017 at 10:40 PM, iain wright <iainwrig@gmail.com> wrote:

> Polling the metrics endpoint every 60s is probably the best option; alert on
> no data for more than N minutes or on any non-200 HTTP response.
>
> Alternatively, you could use something like monit
> <https://mmonit.com/monit/> to check that the process is running, but this
> won't catch a Flume agent that has run out of memory. For that case you'd
> need to add -XX:OnOutOfMemoryError="kill -9 %p" to the JVM options, to make
> sure the process being monitored dies when the JVM encounters an OOM.
>
> With metrics polling you get the added benefit of being able to detect
> pressure or problems before they grow into larger ones (e.g.
> Channelsize increasing over N minutes while successfulsinkcount is not
> changing). I don't remember the exact names of the metrics; it's been a while.
>
> The metric keys seemed self-explanatory enough when I was using this in
> the past. Are there any specific keys in the response from /metrics you
> don't understand?
>
> --
> Iain Wright
>
> On Sun, Feb 26, 2017 at 7:37 PM, Suresh V <verditer@gmail.com> wrote:
>
>> Thank you.
>>
>> Additionally, where can I find details about each metric in the JSON
>> output on port 41414? I could not find a detailed description of each
>> metric, and what it means, in the user guide.
>>
>> Thank you
>> Suresh.
>>
>>
>> On Sun, Feb 26, 2017 at 9:33 PM, Sharninder Khera <sharninder@gmail.com>
>> wrote:
>>
>>> Set up scripts to send alerts sooner? There isn't a built-in way in
>>> Flume, so you will have to set up monitoring separately.
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Feb 27, 2017 at 8:57 AM +0530, "Suresh V" <verditer@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Is there a way to set up an alert mechanism by email immediately when a
>>>> flume agent fails due to any reason?
>>>>
>>>> At the moment, we have scripts sending the port 41414 JSON metrics by
>>>> email every hour, but it would be good to know as soon as an agent fails.
>>>>
>>>> Appreciate any help.
>>>>
>>>> Thank you
>>>> Suresh.
>>>>
>>>>
>>
>
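
The polling approach Iain describes in the thread can be sketched in Python roughly as follows. This is a minimal sketch under several assumptions, not a tested monitoring setup: the host, the component keys `CHANNEL.c1` and `SINK.k1`, the thresholds, and the `send_alert` helper are all placeholders. Iain says he does not recall the exact metric names; `ChannelSize` and `EventDrainSuccessCount` are used here on the assumption that they match what your agent reports, so check them against your own /metrics output.

```python
import http.client
import json
import time
import urllib.request
from typing import Optional

# Assumptions: host, component names, and thresholds are placeholders.
METRICS_URL = "http://localhost:41414/metrics"  # port 41414 is from the thread
CHANNEL_KEY = "CHANNEL.c1"  # hypothetical channel name
SINK_KEY = "SINK.k1"        # hypothetical sink name
POLL_SECONDS = 60
MAX_MISSED_POLLS = 3        # consecutive failed polls before a "no data" alert


def fetch_metrics(url: str = METRICS_URL, timeout: float = 5.0) -> Optional[dict]:
    """Return the parsed JSON metrics, or None on any error or non-200 response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status != 200:
                return None
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors, and bad JSON.
        return None


def channel_is_backing_up(prev: dict, curr: dict,
                          channel: str = CHANNEL_KEY,
                          sink: str = SINK_KEY) -> bool:
    """True if the channel grew between two polls while the sink drained nothing."""
    grew = float(curr[channel]["ChannelSize"]) > float(prev[channel]["ChannelSize"])
    stalled = (float(curr[sink]["EventDrainSuccessCount"])
               == float(prev[sink]["EventDrainSuccessCount"]))
    return grew and stalled


def send_alert(message: str) -> None:
    """Placeholder: replace with email, pager, or chat notification."""
    print("ALERT:", message)


def run_forever() -> None:
    """Poll the endpoint on a fixed interval and alert on the two conditions."""
    prev = None
    missed = 0
    while True:
        curr = fetch_metrics()
        if curr is None:
            missed += 1
            if missed == MAX_MISSED_POLLS:  # alert once per outage
                send_alert("no metrics for %d polls" % missed)
        else:
            missed = 0
            if prev is not None and channel_is_backing_up(prev, curr):
                send_alert("channel growing while sink is not draining")
            prev = curr
        time.sleep(POLL_SECONDS)
```

Calling `run_forever()` starts the loop. The sketch compares only two consecutive polls; to match "increasing over N minutes" more closely, one could require the condition to hold for several polls in a row before alerting.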
