flume-user mailing list archives

From Mike Percy <mpe...@apache.org>
Subject Re: Reliability in Flume
Date Thu, 24 Jan 2013 05:22:27 GMT
Please see inline...

On Wed, Jan 23, 2013 at 7:26 PM, Henry Ma <henry.ma.1986@gmail.com> wrote:

> Dear Flume developers and users,
> I understand that Flume NG uses channel-based transactions to guarantee
> reliable message delivery between agents. But in some extreme failure
> scenarios, will Flume remain fully reliable? I have thought of the
> scenarios below.
> 1. In a transaction between agents, what will happen if the receiving agent
> process goes down just after it commits its put transaction and before it
> sends the success indication to the sending agent? Will the sending agent
> send the same event again when the receiving agent recovers, and cause data
> duplication?

Yes, it will cause duplication in this case: the sending agent never sees
the success indication, so it sends the event again. But it's not that
common if you do proper capacity planning and tuning.
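
To make the failure window concrete, here is a toy simulation of it. This is
only a sketch: Receiver, deliver and send_with_retry are hypothetical names
made up for illustration, not Flume classes or APIs.

```python
class Receiver:
    """Toy stand-in for the receiving agent (hypothetical, not Flume's API)."""

    def __init__(self):
        self.channel = []              # events committed by put transactions
        self.crash_before_ack = False  # set to True to simulate the failure

    def deliver(self, event):
        self.channel.append(event)     # the put transaction commits here
        if self.crash_before_ack:
            self.crash_before_ack = False  # the process "restarts" afterwards
            raise ConnectionError("receiver died before sending the ack")
        return "OK"


def send_with_retry(receiver, event, max_attempts=3):
    """Resend the event until an ack is seen; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            receiver.deliver(event)
            return attempt
        except ConnectionError:
            continue  # no ack seen, so the sender keeps the event and retries
    raise RuntimeError("delivery failed")


receiver = Receiver()
receiver.crash_before_ack = True
attempts = send_with_retry(receiver, "event-1")
print(attempts, receiver.channel)
```

The event is committed on the first attempt and again on the retry, so the
receiving channel ends up holding it twice.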

> 2. In the communication between the client (data source, sending data to
> the first-hop agent) and the first-hop agent, what will happen if the
> agent process goes down just after it receives the event and before it
> saves it to its channel? Will it cause data loss?

It will not cause data loss because it saves to the channel before
acknowledging the transaction.
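
The ordering that matters is "commit to the channel first, acknowledge
second". A rough sketch of why that ordering avoids loss, assuming a client
that retries until it sees an ack (FirstHopAgent and client_send are
hypothetical names, not Flume APIs):

```python
class FirstHopAgent:
    """Toy first-hop agent (hypothetical, not Flume's API)."""

    def __init__(self):
        self.channel = []
        self.crash_before_put = False  # set to True to simulate the failure

    def receive(self, event):
        if self.crash_before_put:
            self.crash_before_put = False  # the process "restarts" afterwards
            raise ConnectionError("agent died before the channel put")
        self.channel.append(event)  # commit to the channel first
        return "ACK"                # acknowledge only after the commit


def client_send(agent, event, max_attempts=3):
    """Return True once the agent has acknowledged the event."""
    for _ in range(max_attempts):
        try:
            if agent.receive(event) == "ACK":
                return True
        except ConnectionError:
            pass  # no ack seen, so the client still holds the event and retries
    return False


agent = FirstHopAgent()
agent.crash_before_put = True
delivered = client_send(agent, "event-1")
print(delivered, agent.channel)
```

Because the crash happens before any ack is sent, the client still has the
event and delivers it on the next attempt.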

> 3. In the communication between the final-hop agent and the storage system
> (such as MySQL, HDFS, file system, etc.), what happens if the agent goes
> down before it commits the saving transaction but has already saved some
> data in the storage? Will this cause data duplication after the agent
> recovers?

Yes, this scenario can also cause duplicates.
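
A small sketch of how the partial write turns into duplicates once the
uncommitted batch is replayed. Everything here is an assumption made for
illustration: write_batch is a hypothetical helper, and the per-event ids
(and the de-duplication step that relies on them) are one possible
downstream mitigation, not something Flume does for you.

```python
def write_batch(storage, batch, crash_after=None):
    """Write events one by one; optionally fail partway, before the commit."""
    for i, event in enumerate(batch):
        if crash_after is not None and i == crash_after:
            raise RuntimeError("agent died before committing the transaction")
        storage.append(event)


# Each event carries a hypothetical unique id as its first element.
batch = [("id-1", "a"), ("id-2", "b"), ("id-3", "c")]
storage = []

try:
    write_batch(storage, batch, crash_after=2)  # two events reach storage
except RuntimeError:
    pass  # the transaction never committed, so the whole batch is retried

write_batch(storage, batch)  # after recovery the full batch is written again

# Possible mitigation (assumption): collapse on the event id downstream.
deduplicated = list(dict(storage).items())
print(len(storage), deduplicated)
```

Storage ends up with five records for three events; keying on an id is what
lets a downstream consumer collapse them again.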

