flume-user mailing list archives

From SaravanaKumar TR <saran0081...@gmail.com>
Subject Re: Flume stops processing event after a while
Date Thu, 17 Jul 2014 06:25:23 GMT
Thanks Ashish. So I will go ahead and update the flume-env.sh file with

JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote
-XX:+HeapDumpOnOutOfMemoryError"
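
For reference, a slightly fuller flume-env.sh sketch. Note the sign on the
heap-dump flag: -XX:+HeapDumpOnOutOfMemoryError enables the dump on OOME,
while -XX:-HeapDumpOnOutOfMemoryError disables it. The HeapDumpPath below is
an illustrative assumption, not something from this thread:

# flume-env.sh (sketch; sourced by the flume-ng startup script)
# '+' enables the heap dump on OutOfMemoryError; '-' would disable it.
# HeapDumpPath is an assumed, writable output location.
JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/flume"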


On 17 July 2014 11:39, Ashish <paliwalashish@gmail.com> wrote:

> Add the -XX:+HeapDumpOnOutOfMemoryError parameter as well; if your process
> hits an OOME, it will generate a heap dump. Allocate heap based on the
> number of events you need to keep in the channel. Try with 1 GB, but
> calculate it according to the channel size: (average event size * number of
> events), plus object overheads.
>
> Please note, this is just a rough calculation; actual memory usage would
> be higher.
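
(As a rough worked example of that formula, with purely illustrative
numbers: an average event size of 2 KB times a channel capacity of 10,000
events is about 20 MB of event data before object overheads, so a 20 MB
heap is already at its limit, while a 1 GB heap leaves comfortable
headroom.)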
>
>
> On Thu, Jul 17, 2014 at 11:21 AM, SaravanaKumar TR <
> saran0081986@gmail.com> wrote:
>
>> Okay, thanks. So for 128 GB of RAM, I will allocate 1 GB as heap memory
>> for the flume agent.
>>
>> But I am surprised why there was no error registered for this memory
>> issue in the log file (flume.log).
>>
>> Do I need to check any other logs?
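
A broader search sketch, in case a failure was logged under a different word
than "Exception" (the patterns are guesses, not known Flume log strings):

grep -iE "exception|error|outofmemory" flume.log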
>>
>>
>> On 16 July 2014 21:55, Jonathan Natkins <natty@streamsets.com> wrote:
>>
>>> That's definitely your problem. 20 MB is way too low for this. Depending
>>> on the other processes you're running on your system, the amount of
>>> memory you'll need will vary, but I'd recommend at least 1 GB. You should
>>> define it exactly where it's defined right now, so instead of the current
>>> command, you can run:
>>>
>>> "/cv/jvendor/bin/java -Xmx1g -Dflume.root.logger=DEBUG,LOGFILE......"
>>>
>>>
>>> On Wed, Jul 16, 2014 at 3:03 AM, SaravanaKumar TR <
>>> saran0081986@gmail.com> wrote:
>>>
>>>> I guess I am using default values. From the running Flume process I
>>>> could see this line: "/cv/jvendor/bin/java -Xmx20m
>>>> -Dflume.root.logger=DEBUG,LOGFILE......"
>>>>
>>>> So I guess it takes 20 MB as the agent's heap memory.
>>>> My RAM is 128 GB, so please suggest how much I can assign as heap memory
>>>> and where to define it.
>>>>
>>>>
>>>> On 16 July 2014 15:05, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>
>>>>> Hey Saravana,
>>>>>
>>>>> I'm attempting to reproduce this, but do you happen to know what the
>>>>> Java heap size is for your Flume agent? This information leads me to
>>>>> believe that you don't have enough memory allocated to the agent, which
>>>>> you may need to do with the -Xmx parameter when you start up your agent.
>>>>> That aside, you can set the byteCapacity parameter on the memory channel
>>>>> to specify how much memory it is allowed to use. It should default to
>>>>> 80% of the Java heap size, but if your heap is too small, this might be
>>>>> a cause of errors.
>>>>>
>>>>> Does anything get written to the log when you try to pass in an event
>>>>> of this size?
>>>>>
>>>>> Thanks,
>>>>> Natty
>>>>>
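
A minimal memory-channel sketch using the byteCapacity knob mentioned above
(the values are illustrative; byteCapacity is in bytes, and
byteCapacityBufferPercentage, default 20, reserves headroom for event
headers):

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
# cap total event bytes in the channel; must fit comfortably inside the heap
a1.channels.c1.byteCapacity = 100000000
# percent of byteCapacity reserved for event headers (default 20)
a1.channels.c1.byteCapacityBufferPercentage = 20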
>>>>>
>>>>> On Wed, Jul 16, 2014 at 1:46 AM, SaravanaKumar TR <
>>>>> saran0081986@gmail.com> wrote:
>>>>>
>>>>>> Hi Natty,
>>>>>>
>>>>>> While looking further, I could see the memory channel stops if a line
>>>>>> greater than 2 MB comes in. Let me know which parameter helps us to
>>>>>> define a max event size of about 3 MB.
>>>>>>
>>>>>>
>>>>>> On 16 July 2014 12:46, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I am asking point 1 because in some cases I could see a line in the
>>>>>>> logfile around 2 MB. So I need to know the maximum event size, and
>>>>>>> how to measure it.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 16 July 2014 10:18, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Natty,
>>>>>>>>
>>>>>>>> Please help me to get the answers for the below queries.
>>>>>>>>
>>>>>>>> 1. In the case of an exec source (tail -F <logfile>), is each line
>>>>>>>> in the file considered to be a single event?
>>>>>>>> If a line is considered to be an event, what is the maximum event
>>>>>>>> size supported by Flume? I mean, what is the maximum number of
>>>>>>>> characters supported in a line?
>>>>>>>>
>>>>>>>> 2. When events stop processing, I am not seeing the "tail -F"
>>>>>>>> command running in the background.
>>>>>>>> I have used options like "a1.sources.r1.restart = true" and
>>>>>>>> "a1.sources.r1.logStdErr = true".
>>>>>>>> Will this config not send any errors to flume.log if there are any
>>>>>>>> issues in tail?
>>>>>>>> And will this config not try to restart the "tail -F" if it is not
>>>>>>>> running in the background?
>>>>>>>>
>>>>>>>> 3. Does Flume support all formats of data in the logfile, or does it
>>>>>>>> have any predefined data formats?
>>>>>>>>
>>>>>>>> Please help me with these to understand better.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 16 July 2014 00:56, Jonathan Natkins <natty@streamsets.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Saravana,
>>>>>>>>>
>>>>>>>>> Everything here looks pretty sane. Do you have a record of the
>>>>>>>>> events that came in leading up to the agent stopping collection? If
>>>>>>>>> you can provide the last file created by the agent, and ideally
>>>>>>>>> whatever events had come in but not been written out to your HDFS
>>>>>>>>> sink, it might be possible for me to reproduce this issue. Would it
>>>>>>>>> be possible to get some sample data from you?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Natty
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Natty ,
>>>>>>>>>>
>>>>>>>>>> Just to understand: at present my setting is
>>>>>>>>>> "flume.root.logger=INFO,LOGFILE"
>>>>>>>>>> in log4j.properties. Do you want me to change it to
>>>>>>>>>> "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>>>>>>>>
>>>>>>>>>> But when I start the agent, I am already starting with the below
>>>>>>>>>> command, so I guess I am using DEBUG already, just set while
>>>>>>>>>> starting the agent rather than in the config file.
>>>>>>>>>>
>>>>>>>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>>>>>>>> /d0/flume/conf/flume-conf.properties -n a1 -Dflume.root.logger=DEBUG,LOGFILE
>>>>>>>>>>
>>>>>>>>>> If I make some changes in the config "flume-conf.properties" or
>>>>>>>>>> restart the agent, it works again and starts collecting the data.
>>>>>>>>>>
>>>>>>>>>> Currently all my logs go to flume.log; I don't see any exception.
>>>>>>>>>>
>>>>>>>>>> cat flume.log | grep "Exception" doesn't show any.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 15 July 2014 22:24, Jonathan Natkins <natty@streamsets.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>
>>>>>>>>>>> Our best bet on figuring out what's going on here may be to turn
>>>>>>>>>>> on the debug logging. What I would recommend is stopping your
>>>>>>>>>>> agents, modifying the log4j properties to turn on DEBUG logging
>>>>>>>>>>> for the root logger, and then restarting the agents. Once the
>>>>>>>>>>> agent stops producing new events, send out the logs and I'll be
>>>>>>>>>>> happy to take a look over them.
>>>>>>>>>>>
>>>>>>>>>>> Does the system begin working again if you restart the agents?
>>>>>>>>>>> Have you noticed any other events correlated with the agent
>>>>>>>>>>> stopping collecting events? Maybe a spike in events or something
>>>>>>>>>>> like that? And for my own peace of mind, if you run
>>>>>>>>>>> `cat /var/log/flume-ng/* | grep "Exception"`, does it bring
>>>>>>>>>>> anything back?
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> Natty
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Natty,
>>>>>>>>>>>>
>>>>>>>>>>>> This is my entire config file.
>>>>>>>>>>>>
>>>>>>>>>>>> # Name the components on this agent
>>>>>>>>>>>> a1.sources = r1
>>>>>>>>>>>> a1.sinks = k1
>>>>>>>>>>>> a1.channels = c1
>>>>>>>>>>>>
>>>>>>>>>>>> # Describe/configure the source
>>>>>>>>>>>> a1.sources.r1.type = exec
>>>>>>>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>>>>>>>> a1.sources.r1.restart = true
>>>>>>>>>>>> a1.sources.r1.logStdErr = true
>>>>>>>>>>>>
>>>>>>>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>>>>>>>
>>>>>>>>>>>> a1.sources.r1.interceptors = i1
>>>>>>>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>>>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>>>>>>>
>>>>>>>>>>>> #a1.sources.r1.interceptors = i2
>>>>>>>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
#a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>>>>>>>
>>>>>>>>>>>> # Describe the sink
>>>>>>>>>>>> a1.sinks.k1.type = hdfs
>>>>>>>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>>>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>>>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>>>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>>>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>>>>>>>> ## need to run hive queries randomly to check the long-running
>>>>>>>>>>>> ## process, so we need to commit events into hdfs files regularly
>>>>>>>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>>>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>>>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>>>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>>>>>>>
>>>>>>>>>>>> # Use a channel which buffers events in memory
>>>>>>>>>>>> a1.channels.c1.type = memory
>>>>>>>>>>>> a1.channels.c1.capacity = 10000
>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>>>>>>>
>>>>>>>>>>>> # Bind the source and sink to the channel
>>>>>>>>>>>> a1.sources.r1.channels = c1
>>>>>>>>>>>> a1.sinks.k1.channel = c1
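
Tying this config back to the heap discussion above: with a 20 MB heap, the
memory channel's default byteCapacity (80% of heap, roughly 16 MB here)
could fill after just a few of the ~2 MB lines mentioned earlier, which
would stall the exec source. A hedged tweak to the channel section, assuming
the agent is also given a larger heap such as -Xmx1g (values illustrative):

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
# explicit cap on total event bytes; must fit inside the enlarged heap
a1.channels.c1.byteCapacity = 536870912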
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 14 July 2014 22:54, Jonathan Natkins <natty@streamsets.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>>>
>>>>>>>>>>>>> What does your sink configuration look
like?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Natty
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Assuming each line in the logfile is considered as an event
>>>>>>>>>>>>>> for Flume:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Do we have any maximum event size defined for the
>>>>>>>>>>>>>> memory/file channel, like a maximum number of characters in a
>>>>>>>>>>>>>> line?
>>>>>>>>>>>>>> 2. Does Flume support all formats of data to be processed as
>>>>>>>>>>>>>> events, or do we have any limitations?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am still trying to understand why Flume stops processing
>>>>>>>>>>>>>> events after some time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can someone please help me out here?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> saravana
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <
>>>>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am new to Flume and using Apache Flume 1.5.0. Quick setup
>>>>>>>>>>>>>>> explanation here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Source: exec, tail -F command on a logfile.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Channel: tried with both memory & file channel
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sink: HDFS
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When Flume starts, event processing happens properly and
>>>>>>>>>>>>>>> events are moved to HDFS without any issues.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But after some time Flume suddenly stops sending events to
>>>>>>>>>>>>>>> HDFS.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am not seeing any errors in the logfile flume.log either.
>>>>>>>>>>>>>>> Please let me know if I am missing any configuration here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Below is the channel configuration defined; I left the
>>>>>>>>>>>>>>> remaining settings at their default values.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> a1.channels.c1.capacity = 10000000
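
For reference, the file channel also keeps a checkpoint and data logs on
local disk; a minimal sketch with explicit directories (the paths are
illustrative assumptions; Flume defaults to directories under ~/.flume if
they are omitted):

a1.channels.c1.type = FILE
# illustrative paths; any writable local directories work
a1.channels.c1.checkpointDir = /d0/flume/file-channel/checkpoint
a1.channels.c1.dataDirs = /d0/flume/file-channel/data
a1.channels.c1.capacity = 10000000
a1.channels.c1.transactionCapacity = 100000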
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Saravana
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>
