flume-user mailing list archives

From Madhu Gmail <madhu.munag...@gmail.com>
Subject Re: Data in File-channel data folder
Date Thu, 11 Apr 2013 20:17:46 GMT

The events from the file-channel are consumed by the sink and sent to another flume agent.

I have verified the event counts through jconsole on the agent and the collector.
But the data is still in the data directory: log-1 and log-2 are 1.6 and 1.6G respectively.

Madhu  Munagala

On Apr 11, 2013, at 3:08 PM, Mike Keane <mkeane@dotomi.com> wrote:

> Are you sure all your events were taken off the channel by the sink?  
> Did you verify all the data you sent landed at the final destination?  I
> have had my file channel back up like this when sinking to a slow source
> but eventually the file channel empties to a few MB provided I'm not
> adding data faster than the sink can remove it. 
> I have only seen a similar problem once while evaluating flume but was
> unable to reproduce.  I had 4 parallel flows.  I killed the agents in
> the storage/filter tier (http://blogs.apache.org/flume/) and let logs
> back up in the collector tier.  I watched the file channels on the
> collector tier grow to tens of GB each before restarting the
> storage/filter tier agents.  3 of the 4 file channels backing the 4
> parallel flows drained to a few MB each.  The 4th however did not.  Even
> after I stopped putting data on the flows and verified all data
> successfully landed in the final sink location the 4th channel was still
> 50+ GB.  I stopped and restarted the agent and the agent iterated
> through all the data/checkpoint files.  Ultimately it sent a couple more
> batches of events but the channel emptied.  
> So yes, I have seen your problem, but each time it was either explainable
> or not reproducible: explainable when data is added to the channel
> faster than the sink can remove it, and not reproducible the one time
> it happened otherwise, when Flume fixed itself on a restart.
> Because of the one time I witnessed the channel not clearing I will be
> monitoring the file channel size outside of flume as a precaution when
> we move flume to production. 
> Regards,
> Mike
> On 04/11/2013 02:37 PM, Madhu Gmail wrote:
>> Hello,
>> I have not heard from anyone, so I just want to make sure I have explained the issue.
>> I think this is a common problem for everyone who uses Flume.
>> When the Flume sink consumes a log event from the file channel, what happens to the
>> data that is committed to local disk under the data directory?
>> Will it grow indefinitely, like log-1, log-2, log-3, and so on?
>> Do I have to write a script to remove the data from the data directory?
>> Madhu  Munagala
>> (214)679-2872
>> On Apr 11, 2013, at 11:52 AM, Madhu Gmail <madhu.munagala@gmail.com> wrote:
>>> Hello,
>>> How do I clean up the data in the file channel data folder?  After the log events
>>> are processed by the sink, I still see that log-1 and log-2 are 1.6GB and 1.2GB.
>>> Once the log events are processed by the sink, shouldn't the channel have no
>>> data left in the data directory under file-channel?
>>> Madhu  Munagala
>>> (214)679-2872
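
Mike's precaution of monitoring the file channel size outside of Flume could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the function names dir_size_bytes and over_threshold are hypothetical, and the data directory path and size limit are whatever your own agent uses. It sums the sizes of the log-N files under a data directory so an external job (cron, a monitoring agent) can alert when the total stays large.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file may have been removed between listing and stat
                # (for example, a log file cleaned up by the channel); skip it.
                pass
    return total


def over_threshold(path, limit_bytes):
    """Return True if the files under path use more than limit_bytes."""
    return dir_size_bytes(path) > limit_bytes


# Hypothetical usage; the path below is an assumption, not from the thread:
#   if over_threshold("/var/flume/file-channel/data", 10 * 1024 ** 3):
#       print("file channel data directory exceeds 10 GB")
```

A check like this only reports disk usage; it does not say whether the events in those files have already been taken by the sink, so it is best read alongside the channel counters seen in jconsole.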
