flume-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Deal with duplicates in Flume with a crash.
Date Wed, 03 Dec 2014 16:44:48 GMT
There's nothing built into Flume to deal with duplicates; it only
provides at-least-once delivery semantics.

You'll have to handle it in your data processing applications or add
an ETL step to deal with duplicates before making data available for
other queries.
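
As a rough illustration of such an ETL step, here is a minimal Python
sketch. It is not part of Flume. It assumes each record is a dict with a
"body" field and, optionally, an "id" field assigned once at the source
before the record enters Flume; both field names are made up for this
example. When no "id" is present it falls back to a hash of the body.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Set


def event_key(event: Dict) -> str:
    """Build a deduplication key for one record.

    Prefers the (hypothetical) "id" field; otherwise hashes the body.
    """
    event_id = event.get("id")
    if event_id is not None:
        return f"id:{event_id}"
    body = event["body"]
    if isinstance(body, str):
        body = body.encode("utf-8")
    return "sha256:" + hashlib.sha256(body).hexdigest()


def deduplicate(events: Iterable[Dict]) -> Iterator[Dict]:
    """Yield each record the first time its key is seen, in input order."""
    seen: Set[str] = set()
    for event in events:
        key = event_key(event)
        if key in seen:
            continue  # a redelivered copy, drop it
        seen.add(key)
        yield event
```

The in-memory set only suits a batch that fits in memory; a larger job
would need to keep the seen keys in an external store or do the same
thing with a group-by in the processing framework. Note also that the
hash fallback treats two legitimately identical records as duplicates,
so a source-assigned ID is the safer key.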


On Wed, Dec 3, 2014 at 5:46 AM, Guillermo Ortiz <konstt2000@gmail.com> wrote:
> Hi,
> I would like to know if there's an easy way to deal with data
> duplication when an agent crashes and it resends the same data again.
> Is there any mechanism to deal with it in Flume,

Joey Echeverria
