flume-user mailing list archives

From George Blazer <gbla...@gmail.com>
Subject Re: /metrics
Date Thu, 23 Jul 2015 19:15:18 GMT
Is it even the right strategy to poll /metrics as a healthcheck? Are there
better alternative sources?

On Thursday, July 23, 2015, iain wright <iainwrig@gmail.com> wrote:

> GC is a good idea. Was also thinking maybe there is a config management
> tool in your environment changing the modified time of the flume.properties
> file, causing flume to re-initialize, which takes the metrics down for a
> few seconds depending on startup time. That seems like a stretch though. I
> would definitely throw JMX monitoring on it to monitor JVM (or use the GC
> logs), and watch flume logs during the time the problem exists.
>
> Also ssh and try polling localhost:port/metrics at the time your
> monitoring system is unable to poll it.
>
> Anytime I've seen this in our environment it's been OOM or re-initializing.
>
>
> On Jul 23, 2015 9:09 AM, "Ashish" <paliwalashish@gmail.com> wrote:
>
>> I think the Flume Agent is up, since the issue is intermittent.
>> Whenever the issue happens, check that the Flume Agent you are
>> polling is up, running, and processing messages. If you
>> already have GC logs enabled, check if GC could be causing the freeze.
>> Nothing else strikes me as of now, assuming the network is
>> good.
>>
>> On Thu, Jul 23, 2015 at 12:09 AM, George Blazer <gblazer@gmail.com> wrote:
>> > We poll metrics once a minute. It's pretty intermittent
>> >
>> > On Wednesday, July 22, 2015, iain wright <iainwrig@gmail.com> wrote:
>> >>
>> >> How often do you poll the metrics?
>> >> Have you checked flume logs?
>> >> Is flume starting up fine, then at some point not responding on metrics,
>> >> then you do something to bring it back up?
>> >> Or is it intermittently not responsive but fixes itself?
>> >>
>> >> On Jul 22, 2015 5:49 PM, "George Blazer" <gblazer@gmail.com> wrote:
>> >>>
>> >>> I use the :5653/metrics endpoint as my Flume healthcheck, but very often
>> >>> the connection is refused, i.e. the server isn't running.
>> >>>
>> >>> Is there anything I could look at?
>> >>>
>> >>> I'm using Flume 1.5.
>> >>>
>> >>> Thanks.
>>
>>
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>
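
The suggestion quoted above, polling localhost:port/metrics from the agent host at the moment the monitoring system fails, could be sketched roughly as follows. This is only an illustration and not part of Flume: the function name classify_metrics_probe and the status labels are hypothetical, port 5653 is taken from the original message, and the sketch assumes the endpoint returns a JSON object. The aim is to tell a refused connection apart from a timeout, since the two point at different causes.

```python
import json
import socket
import urllib.error
import urllib.request


def classify_metrics_probe(url, timeout=5.0):
    """Poll a metrics URL once and return a (status, detail) tuple.

    Hypothetical helper: the status labels are this sketch's own.
    """
    # Bypass any configured HTTP proxy so the probe talks to the agent directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # The server answered, but with a non-2xx status code.
        return ("http_error", str(exc.code))
    except urllib.error.URLError as exc:
        reason = exc.reason
        if isinstance(reason, ConnectionRefusedError):
            # Nothing is listening on the port.
            return ("refused", str(reason))
        if isinstance(reason, socket.timeout):
            return ("timeout", str(reason))
        return ("unreachable", str(reason))
    except socket.timeout as exc:
        # Connected, but the response stalled while being read.
        return ("timeout", str(exc))
    except OSError as exc:
        # For example, the connection was reset mid-response.
        return ("unreachable", str(exc))

    try:
        metrics = json.loads(body)
    except ValueError:
        return ("bad_payload", body[:80])
    if not isinstance(metrics, dict):
        return ("bad_payload", body[:80])
    # Report the top-level keys so the caller can see what was exposed.
    return ("ok", ",".join(sorted(metrics)))


if __name__ == "__main__":
    # Port 5653 is the one mentioned in the original question.
    print(classify_metrics_probe("http://localhost:5653/metrics"))
```

Read loosely, a "refused" result means nothing was listening at that moment, which would fit the agent being down or re-initializing, while a "timeout" means something accepted the connection but did not answer in time, which would fit a long pause such as the GC freeze discussed above. Running it from the agent host and from the monitoring host at the same time also helps separate agent problems from network problems.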
