flume-user mailing list archives

From Ashish <paliwalash...@gmail.com>
Subject Re: Flume stops processing event after a while
Date Thu, 17 Jul 2014 06:41:50 GMT
Nope, you won't see it in the Flume logs; a heap dump will be generated instead. Please see more options at
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

To specify the dump path, use -XX:HeapDumpPath=./java_pid<pid>.hprof
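
For reference, a minimal sketch of combining both flags in flume-env.sh (the heap sizes and dump path below are illustrative assumptions, not values from this thread):

JAVA_OPTS="-Xms512m -Xmx1g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp/flume-agent.hprof"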


On Thu, Jul 17, 2014 at 12:09 PM, SaravanaKumar TR <saran0081986@gmail.com>
wrote:

> Yes, sorry, I missed updating it to 1 GB.
>
> But for an out of memory error, do we get notified in the flume logs? I
> haven't seen any exception till now.
>
>
> On 17 July 2014 11:55, SaravanaKumar TR <saran0081986@gmail.com> wrote:
>
>> Thanks Ashish. So I will go ahead and update the flume-env.sh file with
>>
>> JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote
>> -XX:+HeapDumpOnOutOfMemoryError"
>>
>>
>> On 17 July 2014 11:39, Ashish <paliwalashish@gmail.com> wrote:
>>
>>> Add the -XX:+HeapDumpOnOutOfMemoryError parameter as well; if your process
>>> hits an OOME, it will generate a heap dump. Allocate heap based on the
>>> number of events you need to keep in the channel. Try with 1 GB, but
>>> calculate it according to the channel size as (average event size * number
>>> of events), plus object overheads.
>>>
>>> Please note, this is just a rough calculation, actual memory usage would
>>> be higher.
>>>
>>>
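(As a rough worked instance of that sizing formula, using figures that appear elsewhere in this thread, a memory channel capacity of 10000 events and log lines approaching 2 MB, with an assumed average event size of 50 KB for illustration: 10000 * 50 KB is roughly 500 MB for the channel alone, and a worst case of 10000 * 2 MB is roughly 20 GB, so a 20 MB heap cannot hold a full channel and even 1 GB only works if the average event stays small.)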
>>> On Thu, Jul 17, 2014 at 11:21 AM, SaravanaKumar TR <
>>> saran0081986@gmail.com> wrote:
>>>
>>>> Okay, thanks. So for 128 GB of RAM, I will allocate 1 GB as heap memory
>>>> for the flume agent.
>>>>
>>>> But I am surprised that no error was registered for this memory issue
>>>> in the log file (flume.log).
>>>>
>>>> Do I need to check any other logs?
>>>>
>>>>
>>>> On 16 July 2014 21:55, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>
>>>>> That's definitely your problem. 20MB is way too low for this.
>>>>> Depending on the other processes you're running with your system, the
>>>>> amount of memory you'll need will vary, but I'd recommend at least 1GB.
>>>>> You should define it exactly where it's defined right now, so instead
>>>>> of the current command, you can run:
>>>>>
>>>>> "/cv/jvendor/bin/java -Xmx1g -Dflume.root.logger=DEBUG,LOGFILE......"
>>>>>
>>>>>
>>>>> On Wed, Jul 16, 2014 at 3:03 AM, SaravanaKumar TR <
>>>>> saran0081986@gmail.com> wrote:
>>>>>
>>>>>> I guess I am using default values; from running flume I could see
>>>>>> these lines: "/cv/jvendor/bin/java -Xmx20m
>>>>>> -Dflume.root.logger=DEBUG,LOGFILE......"
>>>>>>
>>>>>> So I guess it takes 20 MB as the flume agent memory.
>>>>>> My RAM is 128 GB, so please suggest how much I can assign as heap
>>>>>> memory and where to define it.
>>>>>>
>>>>>>
>>>>>> On 16 July 2014 15:05, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>>>
>>>>>>> Hey Saravana,
>>>>>>>
>>>>>>> I'm attempting to reproduce this, but do you happen to know what the
>>>>>>> Java heap size is for your Flume agent? This information leads me to
>>>>>>> believe that you don't have enough memory allocated to the agent,
>>>>>>> which you may need to do with the -Xmx parameter when you start up
>>>>>>> your agent. That aside, you can set the byteCapacity parameter on the
>>>>>>> memory channel to specify how much memory it is allowed to use. It
>>>>>>> should default to 80% of the Java heap size, but if your heap is too
>>>>>>> small, this might be a cause of errors.
>>>>>>>
>>>>>>> Does anything get written to the log when you try to pass in an
>>>>>>> event of this size?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Natty
>>>>>>>
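For reference, a minimal sketch of capping the memory channel by bytes as described above (the byteCapacity value is an illustrative assumption, not a recommendation; it counts event body bytes):

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.byteCapacity = 800000000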
>>>>>>>
>>>>>>> On Wed, Jul 16, 2014 at 1:46 AM, SaravanaKumar TR <
>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Natty,
>>>>>>>>
>>>>>>>> While looking further, I could see the memory channel stops if a
>>>>>>>> line comes in greater than 2 MB. Let me know which parameter helps
>>>>>>>> us define a max event size of about 3 MB.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 16 July 2014 12:46, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I am asking about point 1 because in some cases I could see a line
>>>>>>>>> in the logfile around 2 MB. So I need to know what the maximum
>>>>>>>>> event size is. How do I measure it?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 16 July 2014 10:18, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Natty,
>>>>>>>>>>
>>>>>>>>>> Please help me to get the answers for the below queries.
>>>>>>>>>>
>>>>>>>>>> 1. In the case of an exec source (tail -F <logfile>), is each
>>>>>>>>>> line in the file considered to be a single event?
>>>>>>>>>> If a line is considered to be an event, what is the maximum event
>>>>>>>>>> size supported by flume? I mean, the maximum number of characters
>>>>>>>>>> in a line that is supported?
>>>>>>>>>> 2. When events stop processing, I am not seeing the "tail -F"
>>>>>>>>>> command running in the background.
>>>>>>>>>> I have used options like "a1.sources.r1.restart = true
>>>>>>>>>> a1.sources.r1.logStdErr = true".
>>>>>>>>>> Will these settings not send any errors to flume.log if there are
>>>>>>>>>> any issues in tail?
>>>>>>>>>> Will this config not try to restart the "tail -F" if it is not
>>>>>>>>>> running in the background?
>>>>>>>>>>
>>>>>>>>>> 3. Does flume support all formats of data in the logfile, or does
>>>>>>>>>> it have any predefined data formats?
>>>>>>>>>>
>>>>>>>>>> Please help me with these to understand better.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 16 July 2014 00:56, Jonathan Natkins <natty@streamsets.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Saravana,
>>>>>>>>>>>
>>>>>>>>>>> Everything here looks pretty sane. Do you have a record of the
>>>>>>>>>>> events that came in leading up to the agent stopping collection?
>>>>>>>>>>> If you can provide the last file created by the agent, and
>>>>>>>>>>> ideally whatever events had come in, but not been written out to
>>>>>>>>>>> your HDFS sink, it might be possible for me to reproduce this
>>>>>>>>>>> issue. Would it be possible to get some sample data from you?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Natty
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Natty,
>>>>>>>>>>>>
>>>>>>>>>>>> Just to understand: at present my setting is
>>>>>>>>>>>> "flume.root.logger=INFO,LOGFILE" in log4j.properties. Do you
>>>>>>>>>>>> want me to change it to "flume.root.logger=DEBUG,LOGFILE" and
>>>>>>>>>>>> restart the agent?
>>>>>>>>>>>>
>>>>>>>>>>>> But when I start the agent, I am already starting it with the
>>>>>>>>>>>> command below. I guess I am using DEBUG already, just not in the
>>>>>>>>>>>> config file, when starting the agent.
>>>>>>>>>>>>
>>>>>>>>>>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>>>>>>>>>>> /d0/flume/conf/flume-conf.properties -n a1
>>>>>>>>>>>> -Dflume.root.logger=DEBUG,LOGFILE
>>>>>>>>>>>>
>>>>>>>>>>>> If I make some changes in the config "flume-conf.properties" or
>>>>>>>>>>>> restart the agent, it works again and starts collecting the data.
>>>>>>>>>>>>
>>>>>>>>>>>> Currently all my logs go to flume.log; I don't see any
>>>>>>>>>>>> exception.
>>>>>>>>>>>>
>>>>>>>>>>>> cat flume.log | grep "Exception" doesn't show any.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 15 July 2014 22:24, Jonathan Natkins <natty@streamsets.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Our best bet on figuring out what's going on here may be to
>>>>>>>>>>>>> turn on the debug logging. What I would recommend is stopping
>>>>>>>>>>>>> your agents, and modifying the log4j properties to turn on
>>>>>>>>>>>>> DEBUG logging for the root logger, and then restart the agents.
>>>>>>>>>>>>> Once the agent stops producing new events, send out the logs
>>>>>>>>>>>>> and I'll be happy to take a look over them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does the system begin working again if you restart the agents?
>>>>>>>>>>>>> Have you noticed any other events correlated with the agent
>>>>>>>>>>>>> stopping collecting events? Maybe a spike in events or
>>>>>>>>>>>>> something like that? And for my own peace of mind, if you run
>>>>>>>>>>>>> `cat /var/log/flume-ng/* | grep "Exception"`, does it bring
>>>>>>>>>>>>> anything back?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>> Natty
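
For reference, a minimal sketch of that log4j change, assuming the stock conf/log4j.properties shipped with Flume, where the root logger is driven by this property:

flume.root.logger=DEBUG,LOGFILE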
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Natty,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is my entire config file.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Name the components on this agent
>>>>>>>>>>>>>> a1.sources = r1
>>>>>>>>>>>>>> a1.sinks = k1
>>>>>>>>>>>>>> a1.channels = c1
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Describe/configure the source
>>>>>>>>>>>>>> a1.sources.r1.type = exec
>>>>>>>>>>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>>>>>>>>>>> a1.sources.r1.restart = true
>>>>>>>>>>>>>> a1.sources.r1.logStdErr = true
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #a1.sources.r1.batchSize = 2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> a1.sources.r1.interceptors = i1
>>>>>>>>>>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>>>>>>>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #a1.sources.r1.interceptors = i2
>>>>>>>>>>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>>>>>>>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Describe the sink
>>>>>>>>>>>>>> a1.sinks.k1.type = hdfs
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>>>>>>>>>>> ## need to run hive queries randomly to check the long-running process, so we need to commit events to hdfs files regularly
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>>>>>>>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Use a channel which buffers events in memory
>>>>>>>>>>>>>> a1.channels.c1.type = memory
>>>>>>>>>>>>>> a1.channels.c1.capacity = 10000
>>>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Bind the source and sink to the channel
>>>>>>>>>>>>>> a1.sources.r1.channels = c1
>>>>>>>>>>>>>> a1.sinks.k1.channel = c1
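
For reference, a short reading of how the roll settings above interact (the comments here are added for illustration and are not part of the original configuration):

a1.sinks.k1.hdfs.rollInterval = 600   # close and commit the HDFS file every 600 seconds
a1.sinks.k1.hdfs.rollCount = 0        # 0 disables rolling by event count
a1.sinks.k1.hdfs.rollSize = 0         # 0 disables rolling by file size

With count- and size-based rolling disabled, files are rolled purely on the 10-minute timer, which matches the stated need to commit events to HDFS regularly.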
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 14 July 2014 22:54, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Saravana,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What does your sink configuration look like?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Natty
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Assuming each line in the logfile is considered as an event
>>>>>>>>>>>>>>>> for flume:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Do we have any maximum event size defined for the
>>>>>>>>>>>>>>>> memory/file channel, like a maximum number of characters in
>>>>>>>>>>>>>>>> a line?
>>>>>>>>>>>>>>>> 2. Does flume support all formats of data to be processed
>>>>>>>>>>>>>>>> as events, or do we have any limitations?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am still trying to understand why flume stops processing
>>>>>>>>>>>>>>>> events after some time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can someone please help me out here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> saravana
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <
>>>>>>>>>>>>>>>> saran0081986@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am new to flume and am using Apache Flume 1.5.0. Quick
>>>>>>>>>>>>>>>>> setup explanation here.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Channel: tried with both memory & file channel
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sink: HDFS
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When flume starts, event processing happens properly and
>>>>>>>>>>>>>>>>> events are moved to hdfs without any issues.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> But after some time flume suddenly stops sending events to
>>>>>>>>>>>>>>>>> HDFS.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am not seeing any errors in the logfile flume.log either.
>>>>>>>>>>>>>>>>> Please let me know if I am missing any configuration here.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Below is the channel configuration defined; I left the
>>>>>>>>>>>>>>>>> remaining settings at their default values.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Saravana
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> thanks
>>> ashish
>>>
>>> Blog: http://www.ashishpaliwal.com/blog
>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>
>>
>>
>


-- 
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
