flume-user mailing list archives

From ed <edor...@gmail.com>
Subject How to improve File Channel performance with multiple sinks
Date Mon, 06 Jan 2014 14:22:23 GMT
I saw in a previous message on this list that FileChannel performance
when writing to HDFS can be improved by using multiple sinks
(http://mail-archives.apache.org/mod_mbox/flume-user/201212.mbox/%3C8A87C252755D4F0AB9F5F17D6A7FA9D6@cloudera.com%3E).
Initially I thought this meant I should use a sink group to have
multiple sinks writing to HDFS. However, I read in another thread that a
sink group still has only one sink active at a time (I can't find
that message at the moment).
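
For reference, the sink group setup I had in mind would look roughly like the sketch below. The group name g1 is only a placeholder, and the choice of the load_balance processor with a round_robin selector is an assumption for illustration, not something taken from the earlier thread:

```
# Hypothetical sink group (g1 is a placeholder name)
log.sinkgroups = g1
log.sinkgroups.g1.sinks = hdfsSink1 hdfsSink2
# Assumed processor settings: alternate between the two sinks
log.sinkgroups.g1.processor.type = load_balance
log.sinkgroups.g1.processor.selector = round_robin
```

As I understand it, the group's processor picks one member sink per turn, which would match the "only one sink active at a time" behaviour described in that other thread.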

If that's the case, how do you set up multiple sinks to improve FileChannel
performance? Is it as simple as assigning both sinks to the same
FileChannel, like this:

log.sinks = hdfsSink1 hdfsSink2
log.channels = fileChannel
log.sinks.hdfsSink1.channel = fileChannel
log.sinks.hdfsSink2.channel = fileChannel
# Assume hdfsSink1 and hdfsSink2 write to different directories

I suspect the above would duplicate the data in HDFS. Or does the channel
deliver each event to only one of the two sinks?

Thank you for your assistance!

Best Regards,

Ed
