flume-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Question Failure Behavior of HDFS Sink
Date Tue, 08 Sep 2015 18:14:57 GMT
Thanks for your answer.

Aljoscha

On Tue, 8 Sep 2015 at 20:04 Johny Rufus <jrufus@cloudera.com> wrote:

> Your assumption is correct: duplicates can occur in a failure scenario.
>
> Thanks,
> Rufus
>
> On Tue, Sep 8, 2015 at 4:10 AM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
>> Hi,
>> as I understand it, the HDFS sink uses the transaction system to ensure
>> that all the elements in a transaction are written. This is what I would
>> call at-least-once semantics.
>>
>> My question is what happens if a write fails partway through the
>> elements of a transaction. When the transaction is retried, some of the
>> elements might be written again, i.e. the output contains duplicates.
>> Is this assumption correct, or is there something in place that
>> prevents this from happening?
>>
>> Thanks for your time,
>> Aljoscha
>>
>
>
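
The failure scenario discussed in this thread can be illustrated with a small, self-contained simulation. This is only a sketch and does not use any Flume API: the names `FlakyFile` and `drain` are hypothetical, the list `channel` stands in for a Flume channel, and the key assumption is that records already appended to the output are not undone when the transaction rolls back.

```python
class FlakyFile:
    """Hypothetical stand-in for an output file: appends are NOT undone on rollback."""

    def __init__(self, fail_after):
        self.records = []
        self.fail_after = fail_after  # fail once, when this many records are present

    def append(self, record):
        if self.fail_after is not None and len(self.records) == self.fail_after:
            self.fail_after = None  # only fail a single time
            raise IOError("simulated write failure")
        self.records.append(record)


def drain(channel, out, batch_size):
    """Write one batch as a 'transaction'; on failure, leave the batch in the channel."""
    batch = channel[:batch_size]
    try:
        for event in batch:
            out.append(event)
    except IOError:
        return False  # rollback: events stay in the channel and will be retried
    del channel[:batch_size]  # commit: events are removed from the channel
    return True


channel = ["e1", "e2", "e3", "e4"]
out = FlakyFile(fail_after=2)

# Retry until the channel is empty, as a sink would on its next attempts.
while channel:
    drain(channel, out, batch_size=4)

print(out.records)  # ['e1', 'e2', 'e1', 'e2', 'e3', 'e4']
```

Every event reaches the output at least once, but `e1` and `e2` appear twice because they were written before the failure and again on the retry.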
