flume-user mailing list archives

From Chetan Sarva <csa...@evidon.com>
Subject Re: HDFS Failover sink
Date Mon, 17 Oct 2011 19:52:41 GMT
The best practice for handling this type of failure is to do it on the agent
where the event is being generated, using an agentSink (agentE2ESink or
agentE2EChain) connected to a collectorSource/collectorSink, which then writes
to HDFS. That way events are written durably to disk on the agent node and are
not dropped if the collector or HDFS is unavailable. See section 4.1 in the
user guide for more info:

http://archive.cloudera.com/cdh/3/flume/UserGuide/index.html#_using_default_values
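
For reference, a rough sketch of that layout in the data flow config syntax
(the node names, the tail source path, the port, and the HDFS path below are
only placeholders, not taken from your setup):

agent1 : tail("/var/log/app.log") | agentE2EChain("collector1:35853") ;
collector1 : collectorSource(35853) | collectorSink("hdfs://namenode/user/flume/%Y-%m-%d", "events-") ;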

On Mon, Oct 17, 2011 at 10:37 AM, Michael Luban <michael.luban@gmail.com> wrote:

> Flume-users,
>
> In the event of an HDFS failure, I would like to durably fail events over
> to the local collector disk.  To that end, I've configured a failover sink
> as follows:
>
> config [logicalNodeName, rpcSource(54002), < lazyOpen stubbornAppend
> collector(60000)
> {escapedCustomDfs("hdfs://namenode/user/flume/%Y-%m-%d","send-%{rolltag}")}
> ? diskFailover insistentOpen stubbornAppend collector(60000)
> {escapedCustomDfs("hdfs://namenode/user/flume/%Y-%m-%d","send-%{rolltag}")}
> >]
>
> I mock an HDFS connection failure by setting the directory permissions on /user/flume/%Y-%m-%d
> to read-only while events are streaming.
>
> Examining the log in such a case, however, it appears that although the sink
> keeps retrying HDFS per the backoff policy:
>
> 2011-10-16 23:25:19,375 INFO
> com.cloudera.flume.handlers.debug.InsistentAppendDecorator: append attempt 9
> failed, backoff (60000ms):
> org.apache.hadoop.security.AccessControlException: Permission denied:
> user=flume, access=WRITE
>
> and a sequence failover file is created locally:
>
> 2011-10-16 23:25:20,644 INFO
> com.cloudera.flume.handlers.hdfs.SeqfileEventSink: constructed new seqfile
> event sink:
> file=/tmp/flume-flume/agent/logicalNodeName/dfo_writing/20111016-232520644-0600.9362465244700638.00007977
> 2011-10-16 23:25:20,644 INFO
> com.cloudera.flume.agent.diskfailover.NaiveFileFailoverManager: opening new
> file for 20111016-232510634-0600.9362455234272014.00007977
>
> The sequence file is, in fact, empty, and events seem to be merely queued up
> in memory rather than persisted on disk.
>
> Is this a valid use case?  This might be overly cautious, but I would like
> to persist events durably and prevent the logical node from queuing events
> in memory in the event of an HDFS connection failure.
>
