flume-user mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: Flume File Channel Filling Up The Disk With Transaction Log, Any Way To Prevent It
Date Mon, 25 Nov 2013 20:50:43 GMT
Lower the maxFileSize.
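
As far as I understand it, the file channel can only delete a data log file once
no event in that file is still waiting to be taken, so smaller files should be
reclaimed sooner and at a finer granularity. A minimal sketch against the
channel from the original message follows; the 256 MB figure and the 1 GB
free-space floor are illustrative assumptions, not tested recommendations.
minimumRequiredSpace is a documented file channel property, but note that it
makes the channel stop accepting new events when free space runs low rather
than discarding old ones.

```properties
# Unchanged from the original message
agent.channels.metricChannel2.transactionCapacity=1000
agent.channels.metricChannel2.capacity=1000000

# Assumption: 256 MB per data log file instead of 1 GB (illustrative value)
agent.channels.metricChannel2.maxFileSize=268435456

# Assumption: refuse new puts once free disk space drops below 1 GB (illustrative value)
agent.channels.metricChannel2.minimumRequiredSpace=1073741824
```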

On Mon, Nov 25, 2013 at 2:41 PM, Ritesh Adval <riteshadval@gaikai.com> wrote:
> Hi,
>
> We are running two Flume 1.4 agents, each with 2 file channels, on a VM of
> size 15GB.
>
> Is a VM recommended for running Flume, or do we need bare-metal boxes?
>
>
> Every week or so we run into a situation where the sinks on these agents
> are unable to send messages to the upstream agents, and the Flume file
> channels fill up with large transaction logs.
>
> Here is what we see on the 4 channels:
>
> $ du -h /srv/flume/
> 4.9G    /srv/flume/metricChannel1-Cluster/data
> 7.7M    /srv/flume/metricChannel1-Cluster/checkpoint
> 4.9G    /srv/flume/metricChannel1-Cluster
> 4.9G    /srv/flume/metricChannel2-Cluster/data
> 7.7M    /srv/flume/metricChannel2-Cluster/checkpoint
> 4.9G    /srv/flume/metricChannel2-Cluster
> 214M    /srv/flume/eventChannel2-Cluster/data
> 7.7M    /srv/flume/eventChannel2-Cluster/checkpoint
> 222M    /srv/flume/eventChannel2-Cluster
> 215M    /srv/flume/eventChannel1-Cluster/data
> 7.7M    /srv/flume/eventChannel1-Cluster/checkpoint
> 223M    /srv/flume/eventChannel1-Cluster
> 11G     /srv/flume/
>
>
> Here is an example of the tx logs on metricChannel1, where we are seeing 5
> log files. Is there a way to restrict the number of log files kept? I think
> in older versions of Flume the maximum was 2 log files, but we are seeing
> more than 2, as shown below:
>
>
>  $ ls -l /srv/flume/metricChannel1-Cluster/data/
> total 4.5G
> -rw-r--r-- 1 flume flume    0 Nov 23 00:39 in_use.lock
> -rw-r--r-- 1 flume flume 1.1G Nov 23 11:11 log-1
> -rw-r--r-- 1 flume flume   47 Nov 24 21:14 log-1.meta
> -rw-r--r-- 1 flume flume 1.1G Nov 23 21:18 log-2
> -rw-r--r-- 1 flume flume   47 Nov 24 21:14 log-2.meta
> -rw-r--r-- 1 flume flume 1.1G Nov 24 07:13 log-3
> -rw-r--r-- 1 flume flume   47 Nov 24 21:14 log-3.meta
> -rw-r--r-- 1 flume flume 1.1G Nov 24 17:08 log-4
> -rw-r--r-- 1 flume flume   47 Nov 24 21:14 log-4.meta
> -rw-r--r-- 1 flume flume 425M Nov 24 21:15 log-5
> -rw-r--r-- 1 flume flume   47 Nov 24 21:14 log-5.meta
>
>
> We have set maxFileSize to 1GB, and it looks like each tx log is within that
> limit. We have set the capacity of the file channel to 1M messages:
>
> agent.channels.metricChannel2.transactionCapacity=1000
> agent.channels.metricChannel2.capacity=1000000
> agent.channels.metricChannel2.maxFileSize=1073741824
>
>
> What we want to avoid is the transaction log filling up the disk. Is there a
> way to achieve this?
> We are OK with discarding messages.
>
> Thanks
> Ritesh
>
>



-- 
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
