flume-user mailing list archives

From Shady Xu <shad...@gmail.com>
Subject Re: Replay log taking to much time
Date Thu, 23 Jul 2015 08:23:38 GMT
I didn't set this property, so it has its default value of true. Any other ideas?

BTW, if I use `kill -9` to kill the Flume process, Flume will not be able
to create a checkpoint, right?

2015-07-23 15:39 GMT+08:00 Roshan Naik <roshan@hortonworks.com>:

>  You can set 'checkpointOnClose = true' if it's not already the case
> (default is true). This setting was added in 1.6.
> It will create a checkpoint when Flume is trying to shut down the file channel …
> consequently, replay on restart/reconfigure should be much quicker.
>
>  -roshan
>
>   From: Shady Xu <shadyxu@gmail.com>
> Reply-To: "user@flume.apache.org" <user@flume.apache.org>
> Date: Thursday, July 23, 2015 12:35 AM
> To: "user@flume.apache.org" <user@flume.apache.org>
> Subject: Re: Replay log taking to much time
>
>   Yes I'm using Flume 1.6 now and dualCheckpoints are also used, but
> every time I restart the agent, it takes less time but still dozens of
> minutes to replay the log. This is not normal, right?
>
> 2015-06-25 23:15 GMT+08:00 Johny Rufus <jrufus@cloudera.com>:
>
>> If the checkpointing interval is 30 seconds (the default), and
>> dualCheckpoints are enabled (in case the agent was interrupted while
>> writing a checkpoint), then replay should happen only from the last 30 secs
>> (worst case 60 secs). Not sure if this is happening in your case, or whether
>> a full replay is happening.
>>
>>   Thanks,
>> Rufus
>>
>> On Wed, Jun 24, 2015 at 10:40 PM, Shady Xu <shadyxu@gmail.com> wrote:
>>
>>> I have tried 1.6. Replaying the log is faster, but not fast enough. We have
>>> gigabytes of logs, and replaying them still takes us hours or even days. This
>>> is frustrating, and it has been our biggest concern about using Flume at a
>>> larger scale.
>>>
>>> 2015-06-01 15:32 GMT+08:00 Hari Shreedharan <hshreedharan@cloudera.com>:
>>>
>>>> 1.6 has been released. We were waiting for maven central to sync up.
>>>> Now that it is on central, I will post the update on the site tomorrow.
>>>>
>>>>
>>>> On Sunday, May 31, 2015, Shady Xu <shadyxu@gmail.com> wrote:
>>>>
>>>>> I noticed that Flume 1.6 has been released on Github but not the
>>>>> official website. I have compiled some of the modules from source myself
>>>>> (for other reasons), but I'm not sure compiling the whole project is a
>>>>> good idea.
>>>>>
>>>>>  We have tons of data, every time we change the configurations,
>>>>> replaying log takes us way too many hours...
>>>>>
>>>>> 2015-04-17 12:38 GMT+08:00 Hari Shreedharan <hshreedharan@cloudera.com>:
>>>>>
>>>>>> Changes that went into Flume 1.6 should improve replay time. Flume
>>>>>> 1.6 will be out in a few days.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Hari
>>>>>>
>>>>>> On Thu, Apr 16, 2015 at 7:55 PM, Shady Xu <shadyxu@gmail.com> wrote:
>>>>>>
>>>>>>> Every time I restart Flume NG, it will try to replay the log and the
>>>>>>> process usually takes hours. During this time, Flume does not take any
>>>>>>> data from the source.
>>>>>>>
>>>>>>>  So how can I make the replay faster?
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>  --
>>>>
>>>> Thanks,
>>>> Hari
>>>>
>>>>
>>>
>>
>
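
For reference, the file channel settings discussed in this thread could be
written out roughly as below. This is only a sketch: the agent name `a1`, the
channel name `c1`, and all directory paths are hypothetical placeholders, and
the values shown for checkpointInterval and checkpointOnClose are the defaults
mentioned above rather than tuned recommendations. Note that a checkpoint on
close can only be written during an orderly shutdown; a process stopped with
`kill -9` gets no chance to run its shutdown logic.

```properties
# Hypothetical agent (a1) and channel (c1) names; adjust to your own config.
a1.channels = c1
a1.channels.c1.type = file

# Placeholder paths. The backup checkpoint directory must differ from
# checkpointDir and from the data directories.
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.backupCheckpointDir = /var/flume/checkpoint-backup
a1.channels.c1.dataDirs = /var/flume/data

# Keep a second copy of the checkpoint in case the agent is interrupted
# while writing one.
a1.channels.c1.useDualCheckpoints = true

# Checkpoint interval in milliseconds (30 seconds, the default cited above).
a1.channels.c1.checkpointInterval = 30000

# Write a checkpoint when the channel is closed (default true, per the thread).
a1.channels.c1.checkpointOnClose = true
```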
