flume-user mailing list archives

From 김동경 <style9...@gmail.com>
Subject Message Loss problem and performance requirements
Date Fri, 10 Apr 2015 04:24:45 GMT
Hello.

I would like to discuss message loss and performance limits in
Flume.

First of all, I want to ask: does the file channel meet your
performance requirements?
As far as I know, up to Flume 1.5.0 the file channel is the only way to
avoid message loss in Flume.
I ran a brief benchmark with the file channel and got a throughput of about
1,000 to 2,000 events per second per agent.
If I chain multiple Flume agents in two or three tiers, performance will be
even lower.

I know it depends on the hardware specification.
However, no matter how much I improve my hardware, it does not look easy to
meet my performance requirements.
I need at least 100K to 200K EPS (events per second).
(I am assuming no SSDs, since I do not consider them commodity hardware.)

I think a powerful feature of Flume is dynamic event routing through dynamic
configuration reloading.
Multi-tier Flume agents can make the most of this feature.
But multi-agent Flume deployments using the file channel degrade
performance.

The memory channel is fast enough for my performance requirements,
but it definitely carries the possibility of message loss.

For now I am considering the KafkaChannel, which was introduced in
Flume 1.6.0.
But I would like to know how the trade-off in Flume between performance
and message loss is handled in other systems around the world.
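
For reference, a minimal sketch of what a KafkaChannel definition might look
like in an agent properties file, assuming the property names documented for
Flume 1.6.0 (brokerList, zookeeperConnect, topic, groupId). The agent name a1,
the channel name c1, the host names, the topic, and the group id are all
placeholders for illustration, not values from a real deployment, and the
source and sink that would attach to the channel are omitted.

```
# Hypothetical agent "a1" with one Kafka-backed channel "c1"
a1.channels = c1
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel

# Placeholder Kafka brokers and ZooKeeper quorum; replace with real hosts
a1.channels.c1.brokerList = kafka-1:9092,kafka-2:9092
a1.channels.c1.zookeeperConnect = zk-1:2181

# Placeholder topic and consumer group used to hold the channel's events
a1.channels.c1.topic = flume-channel
a1.channels.c1.groupId = flume
```

With a setup along these lines, durability would depend on the Kafka cluster
(for example its replication settings) rather than on the agent's local disk.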

Do you have any ideas?

Thanks
Best regards
Dongkyoung.
