flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: multi-threaded elasticsearch sink
Date Wed, 19 Jun 2013 18:30:09 GMT
Technically, even the HDFS sink uses only one thread to write to HDFS. The Async HBase sink
does write using multiple threads, though they are hidden from the sink itself: the threading
lives in the underlying API.
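
For reference, the workaround Allan mentions (several ElasticSearch sinks draining one
channel) might look roughly like the sketch below. Each sink that is not part of a sink group
is polled by its own runner thread, so two sinks on one channel give two concurrent writers.
The agent name, component names, host name, and capacity values are placeholders rather than
anything taken from this thread, and the source definition is left out.

```
# Hypothetical agent "agent": one channel, two ElasticSearch sinks.
agent.channels = ch1
agent.sinks = es1 es2

# transactionCapacity must be at least as large as the sink batchSize.
agent.channels.ch1.type = memory
agent.channels.ch1.capacity = 100000
agent.channels.ch1.transactionCapacity = 2000

# Both sinks read from the same channel; each event is delivered to only one of them.
agent.sinks.es1.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es1.channel = ch1
agent.sinks.es1.hostNames = es-host.example.com:9300
agent.sinks.es1.batchSize = 2000

agent.sinks.es2.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es2.channel = ch1
agent.sinks.es2.hostNames = es-host.example.com:9300
agent.sinks.es2.batchSize = 2000
```

Whether this actually helps depends on where the bottleneck is; if the Elasticsearch cluster
itself is saturated, extra sinks will not raise throughput.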


On Wednesday, June 19, 2013 at 11:17 AM, Roshan Naik wrote:

> take a look at hdfs sink.
> -roshan
> On Wed, Jun 19, 2013 at 8:00 AM, Allan Feid <allanfeid@gmail.com> wrote:
> > I'm not that great at Java at the moment, but it appears that the single-threaded
> > nature of the elasticsearch sink has trouble keeping up with ~5k events/second at 2k batch
> > size. It looks like the HDFS sink has the ability to run multiple threads that write to
> > HDFS. I can get some performance increase by adding multiple ElasticSearch sinks to simulate
> > parallelism, but it would be great for the sink itself to support multiple threads.
> > 
> > Is there a sink example that should be used as a guide towards getting the same
> > features in the elasticsearch sink?
> > 
> > Thanks,
> > Allan
> > 
> > 
