flume-user mailing list archives

From Cameron Gandevia <cgande...@gmail.com>
Subject Re: flume dying on InterruptException (nanos)
Date Wed, 19 Oct 2011 20:42:40 GMT
We recently modified the RollSink to hide our problem by giving it a few
seconds to finish writing before rolling. We are going to test it out and if
it fixes our issue we will provide a patch later today.
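
A rough illustration of the "give it a few seconds" idea, as a minimal sketch only. This is not the actual RollSink change or Flume code: the class RollGraceSketch, its methods, and the timings are all hypothetical. It assumes appends hold the read side of a ReentrantReadWriteLock while the backend responds, and that a roll needs the write side. With a grace period shorter than the backend latency the roll should give up; with a longer one it should wait and succeed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a rolling sink; NOT Flume's RollSink.
class RollGraceSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Simulates an append that holds the read lock while a slow backend responds.
    void slowAppend(long backendMillis, CountDownLatch started) throws InterruptedException {
        lock.readLock().lock();
        try {
            started.countDown();          // signal that the append is in flight
            Thread.sleep(backendMillis);  // pretend the backend is overloaded
        } finally {
            lock.readLock().unlock();
        }
    }

    // Tries to take the write lock, as a roll would, waiting at most graceMillis.
    boolean tryRoll(long graceMillis) throws InterruptedException {
        if (!lock.writeLock().tryLock(graceMillis, TimeUnit.MILLISECONDS)) {
            return false; // an append is still in flight; the caller decides what to do
        }
        try {
            return true;  // here a real sink would close the old file and open the next
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Starts one slow append, then reports whether a roll with the given grace period succeeds.
    static boolean rollDuringSlowAppend(long backendMillis, long graceMillis) {
        RollGraceSketch sink = new RollGraceSketch();
        CountDownLatch started = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            try {
                sink.slowAppend(backendMillis, started);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        try {
            started.await();
            boolean rolled = sink.tryRoll(graceMillis);
            writer.join();
            return rolled;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sketch was interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("grace 100 ms:  " + rollDuringSlowAppend(500, 100));
        System.out.println("grace 3000 ms: " + rollDuringSlowAppend(500, 3000));
    }
}
```

Note that a longer grace period only hides the problem, as said above: a backend that stays slow for longer than the grace period would still make the roll fail.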
On Oct 19, 2011 1:27 PM, "AD" <straightflush@gmail.com> wrote:

> Yeah, I am using the HBase sink, so I guess it's possible something is getting
> hung up there and causing the collector to die. The number of file
> descriptors seems to be safely under the limit.
>
> On Wed, Oct 19, 2011 at 3:16 PM, Cameron Gandevia <cgandevia@gmail.com> wrote:
>
>> We were seeing the same issue when our HDFS instance was overloaded and
>> taking over a second to respond. I assume if whatever backend is down the
>> collector will die and need to be restarted when it becomes available again?
>> That doesn't seem very reliable.
>>
>>
>> On Wed, Oct 19, 2011 at 8:13 AM, Ralph Goers <ralph.goers@dslextreme.com> wrote:
>>
>>> We saw this problem when it was taking more than 1 second for a response
>>> from writing to Cassandra (our back end).  A single long response will kill
>>> the collector.  We had to revert to the version of Flume that uses
>>> synchronization instead of read/write locking to get around this.
>>>
>>> Ralph
>>>
>>> On Oct 18, 2011, at 1:55 PM, AD wrote:
>>>
>>> > Hello,
>>> >
>>> >  My collector keeps dying with the following error. Is this a known
>>> > issue? Any idea how to prevent it or find out what is causing it? Is
>>> > format("%{nanos}") an issue?
>>> >
>>> > 2011-10-17 23:16:33,957 INFO com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode flume1-18 exited with error: null
>>> > java.lang.InterruptedException
>>> >       at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1246)
>>> >       at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1009)
>>> >       at com.cloudera.flume.handlers.rolling.RollSink.close(RollSink.java:296)
>>> >       at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>> >       at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>> >
>>> >
>>> > source:  collectorSource("35853")
>>> > sink:  regexAll("^([0-9.]+)\\s\\[([0-9a-zA-z\\/: -]+)\\]\\s([A-Z]+)\\s([a-zA-Z0-9.:]+)\\s\"([^\\s]+)\"\\s([0-9]+)\\s([0-9]+)\\s\"([^\\s]+)\"\\s\"([a-zA-Z0-9\\/()_ -;]+)\"\\s(hit|miss)\\s([0-9.]+)","hbase_remote_host","hbase_request_date","hbase_request_method","hbase_request_host","hbase_request_url","hbase_response_status","hbase_response_bytes","hbase_referrer","hbase_user_agent","hbase_cache_hitmiss","hbase_origin_firstbyte") format("%{nanos}:") split(":", 0, "hbase_") format("%{node}:") split(":",0,"hbase_node") digest("MD5","hbase_md5") collector(10000) { attr2hbase("apache_logs","f1","","hbase_") }
>>>
>>>
>>
>>
>> --
>> Thanks
>>
>> Cameron Gandevia
>>
>
>
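
For reference, the top two frames of the stack trace quoted above (tryAcquireNanos called from WriteLock.tryLock) are what the JDK produces whenever a thread is interrupted while waiting in a timed tryLock. The sketch below reproduces that outside Flume. It is illustrative only: the class InterruptedTryLockSketch and its method are hypothetical, and it does not show what interrupts the thread inside the collector, which the thread above does not establish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; it only exercises java.util.concurrent, not Flume.
class InterruptedTryLockSketch {

    // Returns "interrupted", "acquired", or "timed out" for a closer thread
    // that is interrupted while it waits for the write lock.
    static String closeWhileInterrupted() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        String[] outcome = new String[1];
        CountDownLatch closerStarted = new CountDownLatch(1);

        // The calling thread plays an append that is stuck on a slow backend.
        lock.readLock().lock();

        Thread closer = new Thread(() -> {
            closerStarted.countDown();
            try {
                // Timed tryLock is interruptible: it throws InterruptedException
                // if the thread is interrupted before or during the wait.
                if (lock.writeLock().tryLock(5, TimeUnit.SECONDS)) {
                    lock.writeLock().unlock();
                    outcome[0] = "acquired";
                } else {
                    outcome[0] = "timed out";
                }
            } catch (InterruptedException e) {
                outcome[0] = "interrupted";
            }
        });
        closer.start();

        try {
            closerStarted.await();
            closer.interrupt();   // stands in for whatever interrupts the real driver thread
            closer.join();        // join makes the closer's write to outcome[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sketch was interrupted", e);
        } finally {
            lock.readLock().unlock();
        }
        return outcome[0];
    }

    public static void main(String[] args) {
        System.out.println(closeWhileInterrupted());
    }
}
```

Because the read lock is held until after the closer thread finishes, the closer can never acquire the write lock here, so the only way out of tryLock is the interrupt.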
