flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: Multiple agents in high availability
Date Thu, 24 Sep 2015 21:03:05 GMT
Set the same groupId on every Kafka source that consumes the topic.
The sources then form a single Kafka consumer group, so each message
is read by just one of them.
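
As a rough sketch of what that could look like, here is a source
definition using the Kafka source property names from the Flume 1.6
user guide (zookeeperConnect, topic, groupId). The agent, source and
channel names, the ZooKeeper host, the topic and the group name are all
placeholders I made up, and the channel and sink definitions are left
out:

```properties
# Source definition for one agent; every other agent in the cluster
# would use the same source settings.
# a1, r1, c1, zk-host, user-activity and flume-ha are placeholder values.
a1.sources = r1
a1.channels = c1

a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.zookeeperConnect = zk-host:2181
a1.sources.r1.topic = user-activity
# Same value on every agent, so Kafka sees all the sources as one
# consumer group and hands each partition to only one of them.
a1.sources.r1.groupId = flume-ha
a1.sources.r1.channels = c1
```

Kafka spreads a topic's partitions across the members of a group, so the
topic probably needs at least as many partitions as there are sources
for all of them to receive data. Later Flume releases appear to have
renamed these properties, so check the user guide for your version.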

On Sep 24, 2015 9:59 PM, "Carlos Rojas Matas" <cmatas@despegar.com> wrote:

> Hi Guys!
> Thanks for accepting my request. We're using Flume to ingest massive
> amounts of data from a Kafka source and we're not sure how to
> configure a Flume cluster with HA. In brief:
> 1 - we use Kafka to hold intermediate data about our users' activity.
> 2 - we use Flume to ingest all that data and write it to Avro files in HDFS.
> 3 - we want high availability, that is, not a single agent but a
> cluster of agents.
> 4 - the thing is that we cannot have duplicates in the target files. If we
> start several agents consuming from the same topic, each one of them
> could potentially receive the same events, which breaks that
> constraint.
> Is there a way to configure multiple sources such that Kafka sees them as a
> single one?
> Thanks in advance,
> -carlos.
