flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: performance
Date Wed, 07 Nov 2012 19:18:36 GMT
The channel is a passive component and has no notion of blocking. All the Flume channels
support multiple simultaneous transactions, and none of these transactions block. Multiple
sinks can pull events out of the same channel, and none of them will block. If a channel has
no events to return, the take() method simply returns null. If the sink did not get any
events at all in that transaction, the sink's process method should return BACKOFF, so that
the sink runner waits for a few seconds before calling the process method again.
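
To make the take()/BACKOFF pattern concrete, here is a minimal sketch. It does not use the
Flume API: ToyChannel, Status, and process() below are hypothetical stand-ins that only
mirror the shape described above (a non-blocking take() that returns null when the channel
is empty, and a process pass that reports BACKOFF when it got nothing). Transaction handling
is left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-ins for illustration only; these are NOT Flume classes.
class BackoffSketch {
    enum Status { READY, BACKOFF }

    // Toy channel: take() never blocks and returns null when there is nothing to take.
    static final class ToyChannel {
        private final Queue<String> events = new ConcurrentLinkedQueue<>();

        void put(String event) { events.add(event); }

        String take() { return events.poll(); }
    }

    // One pass of a sink's process method: drain up to batchSize events into out.
    // Returns BACKOFF only if no event at all was taken in this pass.
    static Status process(ToyChannel channel, int batchSize, List<String> out) {
        int taken = 0;
        for (int i = 0; i < batchSize; i++) {
            String event = channel.take();
            if (event == null) {
                break; // channel is empty; stop without waiting
            }
            out.add(event);
            taken++;
        }
        return taken == 0 ? Status.BACKOFF : Status.READY;
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();
        channel.put("a");
        channel.put("b");
        channel.put("c");

        List<String> delivered = new ArrayList<>();
        for (int pass = 1; pass <= 3; pass++) {
            Status status = process(channel, 2, delivered);
            System.out.println("pass " + pass + ": " + status + ", delivered=" + delivered);
        }
    }
}
```

With three events and a batch size of two, the first two passes return READY and the third
returns BACKOFF, which is the signal for the runner to wait before calling process again.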


Hari Shreedharan

On Wednesday, November 7, 2012 at 11:08 AM, Nathaniel Auvil wrote:

> It is my understanding, perhaps incorrect, that when you start a transaction in a sink,
> the channel blocks until that transaction is committed. Are you saying you can have
> multiple sinks pulling simultaneously from a single channel and the transactional
> semantics will not cause blocking?
> On Wed, Nov 7, 2012 at 2:03 PM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:
> > Hi Nathaniel, 
> > 
> > What do you mean by a single-threaded model? Almost all of Flume's components are
> > multithreaded. If you mean that a sink is driven by one thread, you can always add more
> > sinks, and each one will be driven by its own thread. If you want to write the same data
> > to multiple locations, just add more channels to the same source (thus replicating the
> > data) and attach the sinks as required. If you want to write to a higher-latency
> > location, you can either add multiple sinks reading from the same channel (thus creating
> > multiple sink runners), or make your sink multithreaded (spawn multiple threads inside
> > the process method and then wait for all threads to succeed/fail), so that more threads
> > do I/O.
> > 
> > 
> > Hari
> > -- 
> > Hari Shreedharan
> > 
> > 
> > On Wednesday, November 7, 2012 at 10:48 AM, Nathaniel Auvil wrote:
> > 
> > > In addition to HDFS, I need to support sending events to a higher-latency
> > > (network-related) target, which our current implementation mitigates by using more
> > > than one thread. The model for Flume is single-threaded. How do I support this with
> > > Flume? Multiplex over n channels with a sink on each?
> > > 
> > > 
> > 
> > 
