flume-user mailing list archives

From Matt Wise <m...@nextdoor.com>
Subject Re: Flume uses high Virtual memory
Date Sat, 14 Dec 2013 16:49:50 GMT
We ran into an issue just like this when we did not limit the 'threads'
count on our source. The Avro source can spawn potentially thousands of
worker threads if you don't cap it, and each thread reserves its own stack
in virtual address space, which inflates VIRT:

a1.sources.r1.threads = 50

(You can check the agent's thread count with 'htop'.)
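
If htop is not available, a rough alternative on Linux is to count the entries under /proc/<pid>/task, since the kernel exposes one entry per thread there. This is only a sketch: it assumes a Linux host with /proc mounted, and the function name thread_count is purely illustrative.

```shell
#!/bin/sh
# Count the threads of a process by listing /proc/<pid>/task (Linux only).
# Each entry in that directory corresponds to one thread of the process.
thread_count() {
    ls "/proc/$1/task" | wc -l | tr -d ' '
}

# Example call (commented out): 38663 is the agent PID from the quoted 'top' output.
# thread_count 38663
```

Running it before and after setting the 'threads' limit should show whether the source's thread count is what is driving the virtual memory figure.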

Matt Wise
Sr. Systems Architect
Nextdoor.com


On Fri, Dec 13, 2013 at 2:58 PM, shibi S <shibis@hotmail.com> wrote:

>
> The Flume agent that writes to HDFS shows high virtual memory usage
> (15.6g). The agent writes to 3 different directories in HDFS based on the
> type of data received. The configuration is given below. Any idea why VM
> usage is high? I see high VM usage only on the agents that write to HDFS.
> Other agents are low in VM usage.
>
> Flume version : apache-flume-1.4.0 (I tested with 1.5 version as well).
>
> PID    USER    PR  NI  VIRT   RES   SHR  S  %CPU  %MEM  TIME+      COMMAND
> 38663  deploy  20  0   15.6g  576m  15m  S  2.6   0.2   225:19.29  java
>
> *Configuration:*
> a1.sources.r1.selector.type = multiplexing
> a1.sources.r1.selector.header = header1
> a1.sources.r1.selector.mapping.red_cancel = c1
>
>
> *Source Configuration:*
> a1.sources.r1.type = avro
> a1.sources.r1.bind = 0.0.0.0
> a1.sources.r1.port = 60000
>
> *Sink configuration:*
> a1.sinks.k1.type=hdfs
> a1.sinks.k1.hdfs.path=hdfs://<HDFS PATH>/%Y/%m/%d/%H
> a1.sinks.k1.hdfs.fileType = DataStream
> a1.sinks.k1.hdfs.filePrefix = filetype1-
> a1.sinks.k1.hdfs.useLocalTimeStamp = true
> #a1.sinks.k1.hdfs.txnEventMax = 40000
> a1.sinks.k1.hdfs.rollInterval = 10
> a1.sinks.k2.hdfs.roundUnit = minute
> a1.sinks.k1.hdfs.rollSize = 0
> a1.sinks.k1.hdfs.rollCount = 500
> a1.sinks.k1.hdfs.batchSize = 500
> a1.sinks.k1.hdfs.idleTimeout =0
> a1.sinks.k1.hdfs.maxOpenFiles = 1000
>
> *Channel configuration:*
> a1.channels.c2.type=file
> a1.channels.c2.checkpointDir =/x/home/deploy/flume/checkpoint2
> a1.channels.c2.dataDirs = /x/home/deploy/flume/data2
>
>
>
