flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: Problems performance with FileChannel and HDFS Sink.
Date Tue, 02 Feb 2016 16:42:44 GMT
I don't know the internal details, but I guess all those threads write to a
single file, so at some point adding more threads brings no improvement.
On the other hand, having multiple sinks will create multiple files, which
should scale better, but you need to make sure each sink writes to a
different folder or file name pattern. The inconvenience is that events
for the same period end up spread across multiple files.
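
As a rough sketch of the multiple-sink setup (not tested against your
cluster): the agent name a1, channel name c1, sink names k1 and k2, and all
paths and prefixes below are placeholders I made up, and the source section
is left out. The property keys are the ones I know from the HDFS sink and
file channel sections of the Flume user guide. As far as I understand,
hdfs.threadsPoolSize is the number of threads each sink uses for HDFS I/O
operations, so it is a separate knob from adding sinks.

```properties
# Hypothetical agent "a1": one file channel drained by two HDFS sinks.
# No sink group is declared; both sinks pull from c1 independently.
a1.channels = c1
a1.sinks = k1 k2

# File channel (directories are placeholders).
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data

# First HDFS sink.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events
# Distinct prefix so k1 and k2 never write to the same file name.
a1.sinks.k1.hdfs.filePrefix = events-k1
# Threads used by this sink for HDFS I/O (10 is the documented default).
a1.sinks.k1.hdfs.threadsPoolSize = 10

# Second HDFS sink: same channel, different file prefix.
a1.sinks.k2.type = hdfs
a1.sinks.k2.channel = c1
a1.sinks.k2.hdfs.path = /flume/events
a1.sinks.k2.hdfs.filePrefix = events-k2
a1.sinks.k2.hdfs.threadsPoolSize = 10
```

With this layout both sinks write into the same folder, and the prefix tells
you which sink produced each file. Using a different hdfs.path per sink would
work as well.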


On 2 February 2016 at 08:38, Guillermo Ortiz <konstt2000@gmail.com> wrote:

> Hello,
> I have some performance problems with the HDFS Sink. I only have one
> sink and one file channel.
> I thought about increasing the number of sinks for my channel, but I also
> saw the parameter threadsPoolSize. What's the difference between this
> parameter and creating more sinks?
> I guessed that it should be a group of sinks, but I read this in another
> thread:
> "You can add more sinks to your config.
> Don't put them in a sink group just have multiple sinks pulling from the
> same channel. This should increase your throughput." as an answer to a
> question similar to mine.
> Could someone explain this a little bit?
