BTW, some more things…
If I recall correctly, some code changes that went into 1.5 (or maybe 1.4) did seem to slow down FC (file channel) replay on startup.
-roshan

From: Roshan Naik <roshan@hortonworks.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Thursday, July 23, 2015 1:29 AM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Replay log taking to much time

Don't use -9. SIGKILL cannot be caught, so Flume gets no chance to write a checkpoint on shutdown.

From: Shady Xu <shadyxu@gmail.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Thursday, July 23, 2015 1:23 AM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Replay log taking to much time

I didn't set this property, so it has its default value of true. Any other ideas?

BTW, if I use `kill -9` to kill the Flume process, Flume will not be able to create a checkpoint, right?

2015-07-23 15:39 GMT+08:00 Roshan Naik <roshan@hortonworks.com>:
You can set checkpointOnClose = true if it's not already the case (the default is true). This setting was added in 1.6.
It creates a checkpoint when Flume shuts down the file channel, so replay on restart/reconfigure should be much quicker.
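
For reference, a minimal sketch of where this setting would sit in an agent's properties file. The agent name `a1`, the channel name `c1`, and the directory paths are placeholders for illustration, not values from this thread.

```properties
# Hypothetical agent "a1" with a single file channel "c1"; names and paths are placeholders.
a1.channels = c1
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data
# Write a checkpoint when the channel is closed (true is already the default).
a1.channels.c1.checkpointOnClose = true
```

The checkpoint on close is only written on a clean shutdown of the channel.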

-roshan

From: Shady Xu <shadyxu@gmail.com>
Reply-To: "user@flume.apache.org" <user@flume.apache.org>
Date: Thursday, July 23, 2015 12:35 AM
To: "user@flume.apache.org" <user@flume.apache.org>
Subject: Re: Replay log taking to much time

Yes, I'm using Flume 1.6 now, and dualCheckpoints are enabled too. Replay is faster than before, but every time I restart the agent it still takes dozens of minutes to replay the log. This is not normal, right?

2015-06-25 23:15 GMT+08:00 Johny Rufus <jrufus@cloudera.com>:
If the checkpointing interval is 30 seconds (the default) and dualCheckpoints are enabled (to cover the case where the agent was interrupted while writing a checkpoint), then replay should only cover the last 30 seconds (worst case 60 seconds). I'm not sure whether this is what is happening in your case, or whether a full replay is happening.
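
The settings described above map to file channel properties roughly as follows. This is only a sketch: the agent name `a1`, the channel name `c1`, and the paths are placeholders. In the Flume user guide the dual-checkpoint property is spelled `useDualCheckpoints`; it defaults to false and needs `backupCheckpointDir` to be set.

```properties
# Hypothetical agent "a1" with file channel "c1"; names and paths are placeholders.
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data
# Interval between checkpoints, in milliseconds (30000 = 30 seconds, the default).
a1.channels.c1.checkpointInterval = 30000
# Keep a backup copy of the checkpoint in case the primary one is incomplete.
a1.channels.c1.useDualCheckpoints = true
# Must differ from checkpointDir and from every directory in dataDirs.
a1.channels.c1.backupCheckpointDir = /var/flume/checkpoint-backup
```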

Thanks,
Rufus

On Wed, Jun 24, 2015 at 10:40 PM, Shady Xu <shadyxu@gmail.com> wrote:
I have tried 1.6. Log replay is faster, but not fast enough. We have gigabytes of logs, and replaying them still takes us hours, even days. This is frustrating, and it has been our biggest concern about using Flume at a larger scale.

2015-06-01 15:32 GMT+08:00 Hari Shreedharan <hshreedharan@cloudera.com>:
1.6 has been released. We were waiting for Maven Central to sync up. Now that it is on Central, I will post the update on the site tomorrow.


On Sunday, May 31, 2015, Shady Xu <shadyxu@gmail.com> wrote:
I noticed that Flume 1.6 has been released on GitHub but not on the official website. I have compiled some of the modules from source myself (for other reasons), but I'm not sure compiling the whole project is a good idea.

We have tons of data, and every time we change the configuration, replaying the log takes us way too many hours...

2015-04-17 12:38 GMT+08:00 Hari Shreedharan <hshreedharan@cloudera.com>:
Changes that went into Flume 1.6 should improve replay time. Flume 1.6 will be out in a few days.


Thanks,
Hari

On Thu, Apr 16, 2015 at 7:55 PM, Shady Xu <shadyxu@gmail.com> wrote:
Every time I restart Flume NG, it tries to replay the log, and the process usually takes hours. During this time, Flume does not take any data from the source.

So how can I make the replay faster?



