flume-user mailing list archives

From Ashish <paliwalash...@gmail.com>
Subject Re: Flume 1.4 High CPU
Date Thu, 16 Oct 2014 09:29:10 GMT
Well, I could be wrong :) The whole process takes barely 2 minutes, and from
personal experience I prefer to gather data and work by process of elimination.

I'll leave it to Mike to decide how he wants to proceed.
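
For anyone following along, the process I mean is roughly: run
`top -H -p <pid>` to find the busiest thread ID, convert that ID to hex, and
look for the matching `nid=` entry in a `jstack <pid>` dump. Below is a minimal
sketch of the matching step in Python, assuming the dump has been saved as
text. The function name, the thread ID 8531, and the sample dump are all made
up for illustration; they are not taken from Mike's servers.

```python
import re
from typing import Optional


def find_thread_stack(jstack_output: str, tid: int) -> Optional[str]:
    """Return the jstack entry whose native thread id (nid) matches tid.

    tid is the decimal thread id as shown by `top -H`; jstack prints the
    same id in lowercase hex as `nid=0x...`.
    """
    nid = "nid=" + hex(tid)
    # Thread entries in a jstack dump are separated by blank lines.
    for block in re.split(r"\n\s*\n", jstack_output):
        # \b keeps nid=0x2153 from also matching nid=0x21530.
        if re.search(re.escape(nid) + r"\b", block):
            return block.strip()
    return None


# Hypothetical, shortened dump for illustration only.
SAMPLE_DUMP = '''\
"example-thread-1" prio=10 tid=0x00007f0000000001 nid=0x2153 runnable [0x00007f0000001000]
   java.lang.Thread.State: RUNNABLE
        at com.example.Worker.run(Worker.java:42)

"example-thread-2" prio=10 tid=0x00007f0000000002 nid=0x21530 waiting on condition [0x00007f0000002000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
'''

if __name__ == "__main__":
    # 8531 stands in for the thread id reported by `top -H`.
    print(find_thread_stack(SAMPLE_DUMP, 8531))
```

Taking two or three dumps a few seconds apart and checking whether the same
thread stays on the same frames is usually enough to rule a component in or
out.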

thanks
ashish

On Thu, Oct 16, 2014 at 2:47 PM, Ahmed Vila <avila@devlogic.eu> wrote:

> Hi Ashish,
>
> Sorry, but I disagree.
>
> I would agree with you if Mike had developed a custom implementation for
> Flume, in which case he would need to pinpoint the problem in a stack trace.
> But he hasn't. He's using a quite common setup, so in my opinion it comes
> down to a hardware failure, a kernel-level malfunction, a Flume component
> malfunction, or simply too many incoming events for the given setup.
> Looking at a stack trace would be overkill at this point.
>
> Regards,
> Ahmed
>
> On Thu, Oct 16, 2014 at 10:37 AM, Ashish <paliwalashish@gmail.com> wrote:
>
>> I would start by finding which thread is consuming the most CPU. The
>> stack trace should give you a good hint about which direction to take.
>>
>> I blogged about the process here:
>> http://www.ashishpaliwal.com/blog/2011/08/finding-java-thread-consuming-high-cpu/
>>
>> Hope it helps
>> ashish
>>
>> On Wed, Oct 15, 2014 at 9:02 PM, Mike Zupan <mike.zupan@manage.com>
>> wrote:
>>
>>> I’m seeing issues with our Flume servers using very high amounts of CPU.
>>> Just wondering if this is a common issue with a file channel. I’m pretty
>>> new to Flume, so sorry if this isn’t enough to debug the issue.
>>>
>>> Current top looks like
>>>
>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>  8509 root      20   0 22.0g 8.6g 675m S 1109.4 13.7   1682:45 java
>>>  8251 root      20   0 21.9g 8.3g 647m S 1083.5 13.2   1476:27 java
>>>  7593 root      20   0 12.4g 8.4g  18m S 1007.5 13.4   1866:18 java
>>>
>>> As you can see, 3 of our 4 Flume servers are using over 1000% CPU.
>>>
>>> Details are
>>>
>>> OS: CentOS 6.5
>>> Java: Oracle "1.7.0_45"
>>> Flume: flume-1.4.0.2.1.1.0-385.el6.noarch
>>>
>>> Our config for the server looks like this
>>>
>>> ###############################################
>>> # Agent configuration for transactional data
>>> ###############################################
>>> nontx_host07_agent01.sources = avro
>>> nontx_host07_agent01.channels = fc
>>> nontx_host07_agent01.sinks = hdfs_sink_01 hdfs_sink_02 hdfs_sink_03 hdfs_sink_04
>>>
>>> ##################################################
>>> # info is published to port 9991
>>> ##################################################
>>> nontx_host07_agent01.sources.avro.type = avro
>>> nontx_host07_agent01.sources.avro.bind = 0.0.0.0
>>> nontx_host07_agent01.sources.avro.port = 9991
>>> nontx_host07_agent01.sources.avro.threads = 100
>>> nontx_host07_agent01.sources.avro.compression-type = deflate
>>> nontx_host07_agent01.sources.avro.interceptors = ts id
>>> nontx_host07_agent01.sources.avro.interceptors.ts.type = timestamp
>>> nontx_host07_agent01.sources.avro.interceptors.ts.preserveExisting = false
>>> nontx_host07_agent01.sources.avro.interceptors.id.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
>>> nontx_host07_agent01.sources.avro.interceptors.id.preserveExisting = true
>>>
>>>
>>> ##################################################
>>> # The Channels
>>> ##################################################
>>> nontx_host07_agent01.channels.fc.type = file
>>> nontx_host07_agent01.channels.fc.checkpointDir = /flume/channels/checkpoint/nontx_host07_agent01
>>> nontx_host07_agent01.channels.fc.dataDirs = /flume/channels/data/nontx_host07_agent01
>>> nontx_host07_agent01.channels.fc.capacity = 140000000
>>> nontx_host07_agent01.channels.fc.transactionCapacity = 240000
>>>
>>> ##################################################
>>> # Sinks
>>> ##################################################
>>> nontx_host07_agent01.sinks.hdfs_sink_01.type = hdfs
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.path = hdfs://cluster01:8020/flume/%{log_type}
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.filePrefix = flume_nontx_host07_agent01_sink01_%Y%m%d%H
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUsePrefix=_
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.inUseSuffix=.tmp
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.fileType = CompressedStream
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.codeC = snappy
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollSize = 0
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollCount = 0
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.rollInterval = 300
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.idleTimeout = 30
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.timeZone = America/Los_Angeles
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.callTimeout = 30000
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.batchSize = 50000
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.round = true
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundUnit = minute
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.roundValue = 5
>>> nontx_host07_agent01.sinks.hdfs_sink_01.hdfs.threadsPoolSize = 2
>>> nontx_host07_agent01.sinks.hdfs_sink_01.serializer = com.manage.flume.serialization.HeaderAndBodyJsonEventSerializer$Builder
>>>
>>> --
>>> Mike Zupan
>>>
>>>
>>
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>



-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
