flume-user mailing list archives

From Asim Zafir <asim.za...@gmail.com>
Subject Re: Performance of Flume in production systems
Date Thu, 25 Sep 2014 09:40:21 GMT
It really depends, but I have a couple of questions before a proper
suggestion can be made:

What kind of agent are you using in the pipeline that sinks to HDFS?
Does your pipeline involve a collector?
What kind of channel are you using across the data pipeline?
How frequently do you want to roll the Flume events?
It would be helpful to see your data pipeline architecture before making a
suggestion.
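
For reference, the questions above map onto settings in the agent's
properties file. Below is a minimal sketch, assuming a single collector
agent (hypothetically named "collector") that receives Avro events from
upstream agents and writes them to HDFS through a memory channel. The agent
and component names, port, HDFS path, and all numeric values are
placeholders for illustration, not tuned recommendations.

```properties
# Hypothetical collector agent: Avro source -> memory channel -> HDFS sink
collector.sources = avroIn
collector.channels = memCh
collector.sinks = hdfsOut

# Source: receives events forwarded by upstream agents
collector.sources.avroIn.type = avro
collector.sources.avroIn.bind = 0.0.0.0
collector.sources.avroIn.port = 4141
collector.sources.avroIn.channels = memCh

# Channel: memory is fast but loses buffered events if the agent dies;
# a file channel (type = file) trades throughput for durability
collector.channels.memCh.type = memory
collector.channels.memCh.capacity = 100000
collector.channels.memCh.transactionCapacity = 1000

# Sink: the roll settings control how often a new HDFS file is started
collector.sinks.hdfsOut.type = hdfs
collector.sinks.hdfsOut.channel = memCh
collector.sinks.hdfsOut.hdfs.path = hdfs://namenode/flume/events/%Y-%m-%d
collector.sinks.hdfsOut.hdfs.fileType = DataStream
collector.sinks.hdfsOut.hdfs.useLocalTimeStamp = true
collector.sinks.hdfsOut.hdfs.batchSize = 1000
# Roll every 300 seconds or at 128 MB, whichever comes first;
# rollCount = 0 disables rolling by event count
collector.sinks.hdfsOut.hdfs.rollInterval = 300
collector.sinks.hdfsOut.hdfs.rollSize = 134217728
collector.sinks.hdfsOut.hdfs.rollCount = 0
```

The channel type and the three roll settings are usually the first things
to revisit when throughput or the number of small files on HDFS becomes a
concern.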

Asim Zafir

On Wed, Sep 24, 2014 at 10:53 PM, Blade Liu <hafzcdcn@gmail.com> wrote:

> Hi,
> I'm going to deploy Flume in production systems, but I'm a little worried
> about its performance in a real-world environment. Could anyone tell me about
> Flume's actual performance in a production environment? Say, can Flume
> deal with 20,000 events per second from a single source (and what about
> 100-200 sources with one final HDFS sink)?
> In addition, to reach good performance of tens of thousands of events per
> second, how many servers (agents) should be used? Do more agents (and more
> tiers) mean better performance?
> Thanks very much for your suggestions.
> Cheers,
> Blade
