flume-user mailing list archives

From Roshan Naik <ros...@hortonworks.com>
Subject Re: Load Balancer
Date Mon, 23 Mar 2015 22:31:07 GMT
It might be a good idea to note this correlation between sink groups and threads in the
user guide.

From: Hari Shreedharan <hshreedharan@cloudera.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Tuesday, March 10, 2015 9:02 AM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Load Balancer

Most often, for HDFS, you'd need multiple sinks draining the same channel rather than one
sink group with a load balancer. Load balancing is mostly meant for sending data from one set
of Flume agents to another set (tier 1 to tier 2), so that even if one agent goes down, the
data still goes to the others. Adding more sinks or sink groups adds multithreading, so your
drain rates improve. Each sink group gets one thread, and that thread chooses one of the
group's sinks to process data from the channel. So for multithreading plus load balancing,
you'd need several sink groups (possibly identically configured, but each with different
sinks). Think of a sink outside of any sink group as a sink group with just one sink.
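
As a rough illustration of the "several sink groups" layout described above, here is a minimal sketch of a tier-1 agent configuration. The agent name (a1), the channel and sink names, and the tier-2 hostnames and ports are all placeholders invented for this example, and the source definition is omitted. It assumes Avro sinks forwarding to two tier-2 agents. Each of the two groups would be driven by its own thread, and each group load-balances across both tier-2 hosts.

```properties
# Hypothetical tier-1 agent "a1"; all names, hosts, and ports are placeholders.
a1.channels = c1
a1.sinks = k1 k2 k3 k4
a1.sinkgroups = g1 g2

a1.channels.c1.type = memory

# Four Avro sinks draining the same channel.
# k1 and k3 point at one tier-2 agent, k2 and k4 at the other.
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = tier2-a.example.com
a1.sinks.k1.port = 4545

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = tier2-b.example.com
a1.sinks.k2.port = 4545

a1.sinks.k3.type = avro
a1.sinks.k3.channel = c1
a1.sinks.k3.hostname = tier2-a.example.com
a1.sinks.k3.port = 4545

a1.sinks.k4.type = avro
a1.sinks.k4.channel = c1
a1.sinks.k4.hostname = tier2-b.example.com
a1.sinks.k4.port = 4545

# Two identically configured groups with different sinks:
# one thread per group, load balancing within each group.
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.selector = round_robin
a1.sinkgroups.g1.processor.backoff = true

a1.sinkgroups.g2.sinks = k3 k4
a1.sinkgroups.g2.processor.type = load_balance
a1.sinkgroups.g2.processor.selector = round_robin
a1.sinkgroups.g2.processor.backoff = true
```

With a single group containing all four sinks, the same channel would be drained by only one thread, which is the distinction this thread is about.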

j.guilmard@accenture.com wrote:


No, they are not, as far as I know. This is very important to know when designing a
load-balanced Flume architecture: you must count on multiple agents rather than multiple sinks.

Best regards
Jean-François Guilmard
Accenture Digital

Email: j.guilmard@accenture.com
Phone: +33 6 32 27 70 22

-----Original Message-----
From: Guillermo Ortiz [mailto:konstt2000@gmail.com]
Sent: Tuesday, March 10, 2015 12:15
To: user@flume.apache.org
Subject: Load Balancer

I was checking load balancing with a group of HDFS sinks. Are they multithreaded? I mean,
does each sink in the group run in an independent thread, as when I have many separate sinks?
When should I use a load balancer instead of independent sinks?

