Posted to user@flume.apache.org by Asim Zafir <as...@gmail.com> on 2014/01/14 09:48:15 UTC

flume capacity planning

Hi,


I have 50 webservers that are pushing data at 500 Mbit/s via Flume to
HDFS.



(i) What is the minimum virtual memory required on the webservers and
the NameNode (assuming a direct sink to HDFS with no Collector
involved)?

(ii) In the second case, let's assume that a Flume Collector sits
between the webservers and the HDFS cluster. Instead of a direct RPC
connection from the webservers to the HDFS cluster, the Flume Collector
receives the packets and then forwards them to HDFS. What virtual memory
and hardware specification are required on the Flume Collector, the
webservers, and the NameNode?

(iii) Can the webservers push traffic across a WAN to a remote HDFS
cluster separated by an RTT of 150 ms without a Flume Collector?
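
For reference, here is a rough back-of-envelope sketch of the numbers
behind these questions. It assumes the 500 Mbit/s figure is the aggregate
across all 50 webservers (if it is per server, the totals are 50 times
larger). The function and variable names are only illustrative. The
bandwidth-delay product is the amount of data that has to be in flight to
keep a link busy at a given round-trip time, which is relevant to (iii).

```python
# Back-of-envelope figures for the setup described in this post.
# ASSUMPTION: 500 Mbit/s is the aggregate rate across all 50 webservers,
# not the rate of each one. All names here are illustrative only.

WEBSERVERS = 50
AGGREGATE_MBIT_PER_S = 500  # assumed to be the total over all webservers
RTT_MS = 150                # WAN round-trip time from question (iii)


def mbit_to_bytes_per_s(mbit_per_s):
    """Convert megabits per second to bytes per second (1 Mbit = 10**6 bits)."""
    return mbit_per_s * 1_000_000 / 8


def bandwidth_delay_product_bytes(mbit_per_s, rtt_ms):
    """Bytes that must be in flight to keep a link of this rate full at this RTT."""
    return mbit_to_bytes_per_s(mbit_per_s) * rtt_ms / 1000


# Rate each webserver contributes under the aggregate assumption.
per_server_mbit = AGGREGATE_MBIT_PER_S / WEBSERVERS

# Total ingest rate that HDFS (or a collector tier) has to absorb.
aggregate_bytes_per_s = mbit_to_bytes_per_s(AGGREGATE_MBIT_PER_S)

# Unacknowledged data per webserver needed to sustain its rate over the WAN.
per_server_bdp = bandwidth_delay_product_bytes(per_server_mbit, RTT_MS)

print("per-server rate (Mbit/s):", per_server_mbit)
print("aggregate rate (bytes/s):", aggregate_bytes_per_s)
print("per-server bandwidth-delay product (bytes):", per_server_bdp)
```

This only covers the network side. It says nothing about Flume channel
sizing or NameNode memory, which depend on event size and file counts
that are not given here.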



I would appreciate it if you could get me this info as early as possible.


Asim