Flume-users,
In the event of an HDFS failure, I would like to fail events over durably to the
local collector's disk. To that end, I've configured a failover sink as follows:
config [logicalNodeName, rpcSource(54002),
  < lazyOpen stubbornAppend collector(60000)
      {escapedCustomDfs("hdfs://namenode/user/flume/%Y-%m-%d","send-%{rolltag}")}
    ? diskFailover insistentOpen stubbornAppend collector(60000)
      {escapedCustomDfs("hdfs://namenode/user/flume/%Y-%m-%d","send-%{rolltag}")} >]
I simulate an HDFS connection failure by setting the permissions on the
/user/flume/%Y-%m-%d directory to read-only while events are streaming.
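For reference, the permission flip is just a plain hadoop fs -chmod; the dated path
below is only an illustration of what the %Y-%m-%d escape resolves to on the day of
the log excerpts, not the literal sink path:

  hadoop fs -chmod -R 555 /user/flume/2011-10-16   # drop write access so appends fail with access=WRITE

(and chmod the directory back once the test is done)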
Examining the log in this case, however, it appears that although the sink
keeps retrying HDFS per the backoff policy:
2011-10-16 23:25:19,375 INFO
com.cloudera.flume.handlers.debug.InsistentAppendDecorator: append attempt 9
failed, backoff (60000ms):
org.apache.hadoop.security.AccessControlException: Permission denied:
user=flume, access=WRITE
and a failover sequence file is created locally:
2011-10-16 23:25:20,644 INFO
com.cloudera.flume.handlers.hdfs.SeqfileEventSink: constructed new seqfile
event sink:
file=/tmp/flume-flume/agent/logicalNodeName/dfo_writing/20111016-232520644-0600.9362465244700638.00007977
2011-10-16 23:25:20,644 INFO
com.cloudera.flume.agent.diskfailover.NaiveFileFailoverManager: opening new
file for 20111016-232510634-0600.9362455234272014.00007977
The sequence file is, in fact, empty, and the events appear to be queued up in
memory rather than persisted to disk.
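For what it's worth, I am judging "empty" by watching the failover directory from
the log above while the retries are ongoing, roughly like this:

  ls -l /tmp/flume-flume/agent/logicalNodeName/dfo_writing/   # the new seqfile appears but never grows while events stream in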
Is this a valid use case? It may be overly cautious, but I would like events to be
persisted durably so the logical node does not queue them in memory during an HDFS
connection failure.