flume-user mailing list archives

From iain wright <iainw...@gmail.com>
Subject Re: GC errors
Date Wed, 22 Mar 2017 19:17:06 GMT
Does the output of jps or ps show that it was started with those parameters?
Maybe the env file is not being sourced correctly and it's starting with
a default heap?
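
One way to check is to pull the heap flags out of the running agent's command line. This is only a rough sketch: the helper name heap_flags is made up for illustration, and the process filter in the usage comment assumes "flume" appears somewhere in the agent's command line.

```shell
# Print every -Xms/-Xmx flag found in a JVM command line, one per line.
# heap_flags is a hypothetical helper name, not part of Flume or the JDK.
heap_flags() {
  printf '%s\n' "$1" | grep -oE -- '-Xm[sx][0-9]+[kKmMgG]?'
}

# Usage against the live agent (the [f] keeps grep from matching itself):
#   heap_flags "$(ps auxww | grep -i '[f]lume')"
```

If nothing from flume-env.sh shows up in that output, the file is probably not being read from the conf directory the agent was started with.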

-- 
Iain Wright

This email message is confidential, intended only for the recipient(s)
named above and may contain information that is privileged, exempt from
disclosure under applicable law. If you are not the intended recipient, do
not disclose or disseminate the message to anyone except the intended
recipient. If you have received this message in error, or are not the named
recipient(s), please immediately notify the sender by return email, and
delete all copies of this message.

On Wed, Mar 22, 2017 at 12:12 PM, Suresh V <verditer@gmail.com> wrote:

> The events are only about 5 to 10 kilobytes.
>
> flume-env.sh has only the line below (I was trying really high
> memory allocations as well):
> export JAVA_OPTS="-Xms8000m -Xmx32000m -Dcom.sun.management.jmxremote"
>
> The box has 120GB memory and nothing else is running on it. It is an AWS
> EC2 instance.
>
> The OOM occurs immediately after the agent starts and creates the first
> .tmp file.
>
> Suresh.
>
>
> On Wed, Mar 22, 2017 at 1:57 PM, iain wright <iainwrig@gmail.com> wrote:
>
>> Config seems sane; I'm not familiar with the rounding values.
>> transactionCapacity seems a bit high, getting a max of 10k events from the
>> source at a time (I've only ever used 100; perhaps 10k is normal).
>>
>> How big is each event?
>>
>> Could you also paste flume-env and the output of sudo jps -v or ps
>> auxww|grep -i flume after starting flume?
>>
>> Are events flowing all the way through and flushed to files every 60s or
>> 20k events? Do you see queueing in the channel? Is the OOM after some time?
>>
>> JAVA_OPTS="-Xms1024m -Xmx3072m" previously worked for me in flume-env.sh
>>
>>
>>
>> --
>> Iain Wright
>>
>> On Wed, Mar 22, 2017 at 11:22 AM, Suresh V <verditer@gmail.com> wrote:
>>
>>> Here it is:
>>>
>>> # Name the components on this agent
>>> myagent.sources = r1
>>> myagent.sinks = k1
>>> myagent.channels = c1
>>>
>>> myagent.sources.r1.type = com.aweber.flume.source.rabbitmq.RabbitMQSource
>>> myagent.sources.r1.host = xxx.yyy.com
>>> myagent.sources.r1.port = 5671
>>> myagent.sources.r1.username = xxx
>>> myagent.sources.r1.password = xxxx
>>> myagent.sources.r1.queue = QUEUENAME
>>> myagent.sources.r1.virtual-host = VH
>>> myagent.sources.r1.prefetchCount = 10
>>> myagent.sources.r1.ssl = true
>>>
>>>
>>> # Describe the sink
>>> myagent.sinks.k1.type = hdfs
>>> myagent.sinks.k1.hdfs.path = /hdfs/path/to/folder/
>>> myagent.sinks.k1.hdfs.filePrefix = filename_%Y%m%d.%H%M%S
>>> myagent.sinks.k1.hdfs.round = true
>>> myagent.sinks.k1.hdfs.roundUnit = second
>>> myagent.sinks.k1.hdfs.roundValue = 30
>>> myagent.sinks.k1.hdfs.useLocalTimeStamp = true
>>> myagent.sinks.k1.hdfs.timeZone = America/Chicago
>>> myagent.sinks.k1.hdfs.writeFormat = Text
>>> myagent.sinks.k1.hdfs.fileType = DataStream
>>> myagent.sinks.k1.hdfs.batchSize = 20000
>>> myagent.sinks.k1.hdfs.fileSuffix = .txt
>>> myagent.sinks.k1.hdfs.rollCount = 0
>>> myagent.sinks.k1.hdfs.rollSize = 0
>>> myagent.sinks.k1.hdfs.rollInterval = 60
>>>
>>> # Use a channel which buffers events on disk
>>>
>>> myagent.channels.c1.type = file
>>> myagent.channels.c1.capacity = 10000
>>> myagent.channels.c1.transactionCapacity = 10000
>>> myagent.channels.c1.dataDirs = /local/path
>>> myagent.channels.c1.checkpointDir = /local/path/checkpoint/
>>>
>>>
>>> # Bind the source and sink to the channel
>>> myagent.sources.r1.channels = c1
>>> myagent.sinks.k1.channel = c1
>>>
>>> Thank you
>>>
>>>
>>> On Wed, Mar 22, 2017 at 12:56 PM, iain wright <iainwrig@gmail.com>
>>> wrote:
>>>
>>>> Can you please drop your config in a reply or pastebin (omitting any
>>>> sensitive info)
>>>>
>>>> --
>>>> Iain Wright
>>>>
>>>> On Wed, Mar 22, 2017 at 10:54 AM, Suresh V <verditer@gmail.com> wrote:
>>>>
>>>>> Hello Flume users,
>>>>>
>>>>> I'm getting this error when starting the agent. The source is a RabbitMQ
>>>>> queue that has millions of messages, the channel is file, and the sink is HDFS.
>>>>>
>>>>> Exception: java.lang.OutOfMemoryError thrown from the
>>>>> UncaughtExceptionHandler in thread "RabbitMQ Consumer #0"
>>>>> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor"
>>>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>>> Exception in thread "pool-8-thread-1" java.lang.OutOfMemoryError: GC
>>>>> overhead limit exceeded
>>>>> ^CException in thread "Thread-0" java.lang.OutOfMemoryError: GC
>>>>> overhead limit exceeded
>>>>> Exception in thread "agent-shutdown-hook" java.lang.OutOfMemoryError:
>>>>> GC overhead limit exceeded
>>>>>
>>>>> I have tried increasing the JAVA_OPTS min and max in flume-env.sh but
>>>>> that has not helped.
>>>>>
>>>>> Any help appreciated.
>>>>>
>>>>> Thank you
>>>>> Suresh.
>>>>>
>>>>>
>>>>
>>>
>>
>
