flume-user mailing list archives

From "Guyle M. Taber" <gu...@gmtech.net>
Subject Where to put the flume agents within a cluster
Date Fri, 23 Jun 2017 18:50:37 GMT
We have a Hadoop cluster with 32 data nodes that receives incoming Flume data via three of those
data nodes, which act as Flume agents. We’re using round-robin DNS entries to spread incoming
Flume data from various external architectures across the three Flume agents on those data nodes.

Historically, the three data nodes that run the Flume agents have always held many more blocks
than the other data nodes, so I’m wondering what the best approach to placing Flume agents
within a cluster would be. Should all data nodes in the cluster run Flume agents, or should the
Flume agent be placed on a name node or some other non-data node?

Thanks for any guidance.