flume-user mailing list archives

From Ferenc Szabo <fsz...@cloudera.com>
Subject Re: Can HDFS sink support exactly once delivery to avoid the duplication of data
Date Fri, 24 Nov 2017 15:02:23 GMT
Dear Wenxing,

The current implementation of the HDFS sink provides at-least-once delivery,
which means duplicates are possible when a write is retried. Exactly-once
delivery is a harder problem to solve, so I would not expect a solution for
it in the near future.
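
To illustrate why at-least-once delivery can produce duplicates, here is a toy
sketch in Python. It is not Flume's code or API: ToyChannel, drain, and the
fail_on_attempts parameter are hypothetical names made up for this example, and
the event ids are an assumption (your events would need some unique key of
their own). It models a write that reaches storage while the sink sees a
timeout, so the transaction is rolled back and the same batch is delivered
again. The last line shows one possible workaround, de-duplicating downstream
by event id.

```python
from collections import deque


class ToyChannel:
    """Hypothetical stand-in for a transactional channel (not Flume's API)."""

    def __init__(self, events):
        self._queue = deque(events)
        self._in_flight = []

    def take_batch(self, size):
        # Move up to `size` events into the open transaction.
        while self._queue and len(self._in_flight) < size:
            self._in_flight.append(self._queue.popleft())
        return list(self._in_flight)

    def commit(self):
        # The batch is considered delivered; forget it.
        self._in_flight.clear()

    def rollback(self):
        # Put uncommitted events back at the head, in their original order.
        self._queue.extendleft(reversed(self._in_flight))
        self._in_flight.clear()

    def empty(self):
        return not self._queue


def drain(channel, storage, batch_size, fail_on_attempts):
    """Write batches to `storage`. On the attempt numbers listed in
    `fail_on_attempts`, the write lands but the sink sees a timeout."""
    attempt = 0
    while not channel.empty():
        attempt += 1
        batch = channel.take_batch(batch_size)
        storage.extend(batch)  # the data reaches storage
        if attempt in fail_on_attempts:
            # Timeout reported: the sink cannot tell that the write landed,
            # so it rolls back and the batch will be taken again.
            channel.rollback()
        else:
            channel.commit()


# Each event is (id, payload); the ids are an assumption of this sketch.
events = [("e1", "a"), ("e2", "b"), ("e3", "c"), ("e4", "d")]
storage = []
drain(ToyChannel(events), storage, batch_size=2, fail_on_attempts={1})

# Downstream de-duplication by event id, keeping the first occurrence.
deduplicated = list(dict.fromkeys(storage))
```

In this run the first batch is written twice, so storage holds six records for
four events, and the de-duplication step recovers the original four. No event
is lost, which is the at-least-once guarantee.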

Regards, Ferenc Szabo

On Thu, Nov 23, 2017 at 4:07 AM, wenxing zheng <wenxing.zheng@gmail.com> wrote:

> Dear experts,
> When using the HDFS sink with the KafkaChannel, we found that data
> might be duplicated when a write times out.
> Can the HDFS sink write the Flume events from Kafka
> exactly once?
> Any advice would be appreciated.
> Regards, Wenxing
