You can set checkpointOnClose = true if it's not already the case (the default is true). This setting was added in 1.6. It creates a checkpoint when Flume is shutting down the file channel … consequently, replay on restart/reconfigure should be much quicker.
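For illustration, a minimal sketch of where that setting would go in an agent's properties file. The agent name a1, the channel name c1, and the directory paths are placeholders and not taken from this thread; only the checkpointOnClose property itself is the one described above.

```properties
# Hypothetical agent "a1" with a file channel "c1"; names and paths are placeholders.
a1.channels = c1
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data
# Write a checkpoint when the file channel is closed, so a restart has less to replay.
# Described above as defaulting to true in 1.6; set explicitly here only for clarity.
a1.channels.c1.checkpointOnClose = true
```

Note that a checkpoint on close only helps on a clean shutdown; if the agent is killed, replay starts from the last periodic checkpoint.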
From: Shady Xu <firstname.lastname@example.org>
Reply-To: "email@example.com" <firstname.lastname@example.org>
Date: Thursday, July 23, 2015 12:35 AM
To: "email@example.com" <firstname.lastname@example.org>
Subject: Re: Replay log taking to much time
Yes, I'm using Flume 1.6 now and dualCheckpoints are also enabled. Every time I restart the agent, it takes less time than before, but still dozens of minutes to replay the log. This is not normal, right?
2015-06-25 23:15 GMT+08:00 Johny Rufus <email@example.com>:
If the checkpointing interval is 30 seconds (the default) and dualCheckpoints are enabled (in case the agent was interrupted while writing a checkpoint), then replay should cover only the last 30 seconds (60 seconds in the worst case). Not sure whether this is what is happening in your case, or whether a full replay is happening.
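As a rough sketch, the checkpoint settings discussed here would look something like the following in the agent configuration. The agent name a1, the channel name c1, and the backup path are placeholders; the backup checkpoint directory is assumed to be separate from the checkpoint and data directories.

```properties
# Hypothetical agent "a1" with a file channel "c1"; names and paths are placeholders.
a1.channels.c1.type = file
# Keep a backup copy of the checkpoint in case the agent is interrupted mid-write.
a1.channels.c1.useDualCheckpoints = true
a1.channels.c1.backupCheckpointDir = /var/flume/checkpoint-backup
# Interval between checkpoints, in milliseconds (30000 ms = the 30-second default mentioned above).
a1.channels.c1.checkpointInterval = 30000
```

If replay still takes much longer than the checkpoint interval would suggest, the agent log at startup should show whether the checkpoint was actually used or a full replay was triggered.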
On Wed, Jun 24, 2015 at 10:40 PM, Shady Xu <firstname.lastname@example.org> wrote:
I have tried 1.6. Replaying the log is faster, but not fast enough. We have gigabytes of logs, and replaying them still takes us hours, even days. This is frustrating, and it has been our biggest concern about using it at a larger scale.
2015-06-01 15:32 GMT+08:00 Hari Shreedharan <email@example.com>:
1.6 has been released. We were waiting for Maven Central to sync up. Now that it is on Central, I will post the update on the site tomorrow.
On Sunday, May 31, 2015, Shady Xu <firstname.lastname@example.org> wrote:
I noticed that Flume 1.6 has been released on GitHub but not on the official website. I have compiled some of the modules from source myself (for other reasons), but I'm not sure compiling the whole project is a good idea.
We have tons of data, and every time we change the configuration, replaying the log takes us way too many hours...
2015-04-17 12:38 GMT+08:00 Hari Shreedharan <email@example.com>:
Changes that went into Flume 1.6 should improve replay time. Flume 1.6 will be out in a few days.
On Thu, Apr 16, 2015 at 7:55 PM, Shady Xu <firstname.lastname@example.org> wrote:
Every time I restart Flume NG, it will try to replay the log and the process usually takes hours. During this time, Flume does not take any data from the source.
So how can I make the replay faster?