flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Usage of use-fast-replay for FileChannel
Date Tue, 07 May 2013 04:57:53 GMT
Was there a problem with the checkpoint that caused the entire 6 GB of data
to be replayed? Look for BadCheckpointException in the logs to find out
whether the channel was stopped in the middle of a checkpoint.

With the next version of Flume, you should be able to recover even if the
channel stopped while the checkpoint was being written.

Fast replay will try to maintain order, but it will require a massive
amount of memory to run if you have a large number of events. Also, fast
replay will only run if the checkpoint is corrupt or does not exist, so it
makes no difference on a restart where the checkpoint is valid.
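
For reference, a minimal sketch of how the setting would look in the agent
configuration. The agent name (agent1), channel name (ch1), and directory
paths are placeholders made up for illustration, not values from your
setup:

```
# Hypothetical agent name, channel name, and paths; adjust to your setup.
agent1.channels = ch1
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /var/lib/flume/checkpoint
agent1.channels.ch1.dataDirs = /var/lib/flume/data
# Defaults to false. Only used when the checkpoint is corrupt or missing.
agent1.channels.ch1.use-fast-replay = true
```

If you turn this on with a large backlog in the channel, the agent will
probably need a larger JVM heap as well.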


On Mon, May 6, 2013 at 9:40 PM, Rahul Ravindran <rahulrv@yahoo.com> wrote:

> Hi,
>    For FileChannel, how much of a performance improvement in replay times
> was observed with use-fast-replay? We currently have use-fast-replay set
> to false and were replaying about 6 GB of data. We noticed replay times of
> about one hour. I looked at the code and it appears that fast-replay does
> not guarantee the same ordering of events during replay. Is this accurate?
> Are there any other downsides of using fast-replay? Any stability concerns?
> Thanks,
> ~Rahul.
