flume-user mailing list archives

From Asim Zafir <asim.za...@gmail.com>
Subject flume capacity planning
Date Tue, 14 Jan 2014 08:48:15 GMT

I have 50 web servers that are pushing data at 500 Mbit/s via Flume to

(i) What is the minimum virtual memory required on the web servers and the
NameNode (assuming a direct sink to HDFS with no Collector involved)?

(ii) In the second case, let's assume that a Flume Collector sits between the
web servers and the HDFS cluster. Instead of a direct RPC connection from the
web servers to the HDFS cluster, the Flume Collector receives the packets and
then forwards them to HDFS. What virtual memory and hardware specifications
are required on the Flume Collector, the web servers, and the NameNode?

(iii) Can a web server push traffic across a WAN to a remote HDFS cluster
separated by an RTT of 150 ms, without a Flume Collector?
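
As a rough illustration of why the 150 ms RTT in (iii) matters, the
bandwidth-delay product gives the amount of data that has to be in flight to
sustain a given rate over that path. The sketch below is only arithmetic; the
function name is made up for illustration, and it assumes 500 Mbit/s is the
rate of a single stream, which the figures above do not specify.

```python
def bandwidth_delay_product_bytes(rate_bits_per_sec: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to keep a path of this rate and RTT full.

    Integer arithmetic is used so the result is exact:
    bits/s * ms / 1000 gives bits, and dividing by 8 gives bytes.
    """
    return rate_bits_per_sec * rtt_ms // (8 * 1000)


# Assumption: 500 Mbit/s is the rate of one stream (not stated in the message).
RATE_BITS_PER_SEC = 500_000_000
RTT_MS = 150

bdp = bandwidth_delay_product_bytes(RATE_BITS_PER_SEC, RTT_MS)
print(f"data in flight needed: {bdp} bytes")
```

If the sender cannot keep that much unacknowledged data outstanding, the
achievable rate over the WAN will be lower than the nominal link rate.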

I would appreciate it if you could get me this information as soon as possible.

