flume-user mailing list archives

From SaravanaKumar TR <saran0081...@gmail.com>
Subject Re: Flume stops processing event after a while
Date Wed, 16 Jul 2014 07:16:41 GMT
I am asking about point 1 because in some cases I see a line of around 2 MB
in the logfile. So I need to know the maximum event size. How can I measure it?
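
One way to measure this, as a rough sketch only: scan the logfile and report the longest line in bytes. Flume event bodies are byte arrays, so the sketch counts bytes rather than characters. The function name longest_line is hypothetical and not part of Flume, and the script says nothing about what limit Flume itself enforces.

```python
def longest_line(path):
    """Return (line_number, size_in_bytes) for the longest line in a file.

    Hypothetical helper, not part of Flume. Line terminators are not counted.
    Returns (0, 0) for an empty file.
    """
    longest_no, longest_size = 0, 0
    # Binary mode so sizes are in bytes and odd encodings cannot raise errors.
    with open(path, "rb") as fh:
        for line_no, raw in enumerate(fh, start=1):
            size = len(raw.rstrip(b"\r\n"))
            if size > longest_size:
                longest_no, longest_size = line_no, size
    return longest_no, longest_size


if __name__ == "__main__":
    import sys

    # Usage: python longest_line.py <logfile>
    number, size = longest_line(sys.argv[1])
    print(f"longest line: #{number}, {size} bytes")
```

Running it against the file that tail -F follows would show whether lines of around 2 MB really occur, and on which line.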




On 16 July 2014 10:18, SaravanaKumar TR <saran0081986@gmail.com> wrote:

> Hi Natty,
>
> Please help me to get the answers for the below queries.
>
> 1. In the case of the exec source (tail -F <logfile>), is each line in the
> file considered a single event?
> If a line is considered an event, what is the maximum event size
> supported by Flume? I mean, the maximum number of characters in a line?
> 2. When events stop being processed, I do not see the "tail -F" command
> running in the background.
> I have used options like "a1.sources.r1.restart = true
> a1.sources.r1.logStdErr = true".
> Will this config not send errors to flume.log if there are any issues with
> tail?
> Will this config not try to restart "tail -F" if it is not running in
> the background?
>
> 3. Does Flume support all data formats in a logfile, or does it have any
> predefined data formats?
>
> Please help me with these to understand better..
>
>
>
> On 16 July 2014 00:56, Jonathan Natkins <natty@streamsets.com> wrote:
>
>> Saravana,
>>
>> Everything here looks pretty sane. Do you have a record of the events
>> that came in leading up to the agent stopping collection? If you can
>> provide the last file created by the agent, and ideally whatever events had
>> come in, but not been written out to your HDFS sink, it might be possible
>> for me to reproduce this issue. Would it be possible to get some sample
>> data from you?
>>
>> Thanks,
>> Natty
>>
>>
>> On Tue, Jul 15, 2014 at 10:26 AM, SaravanaKumar TR <
>> saran0081986@gmail.com> wrote:
>>
>>> Hi Natty ,
>>>
>>> Just to understand: at present my setting is
>>> "flume.root.logger=INFO,LOGFILE"
>>> in log4j.properties. Do you want me to change it to
>>> "flume.root.logger=DEBUG,LOGFILE" and restart the agent?
>>>
>>> But when I start the agent, I already start it with the command below. I
>>> guess I am already using DEBUG, not in the config file but when starting
>>> the agent.
>>>
>>> ../bin/flume-ng agent -c /d0/flume/conf -f
>>> /d0/flume/conf/flume-conf.properties -n a1 -Dflume.root.logger=DEBUG,LOGFILE
>>>
>>> If I make some changes to the config "flume-conf.properties" or restart the
>>> agent, it works again and starts collecting data.
>>>
>>> Currently all my logs go to flume.log, and I don't see any exceptions.
>>>
>>> cat flume.log | grep "Exception" doesn't show any.
>>>
>>>
>>> On 15 July 2014 22:24, Jonathan Natkins <natty@streamsets.com> wrote:
>>>
>>>> Hi Saravana,
>>>>
>>>> Our best bet on figuring out what's going on here may be to turn on the
>>>> debug logging. What I would recommend is stopping your agents, and
>>>> modifying the log4j properties to turn on DEBUG logging for the root
>>>> logger, and then restart the agents. Once the agent stops producing new
>>>> events, send out the logs and I'll be happy to take a look over them.
>>>>
>>>> Does the system begin working again if you restart the agents? Have you
>>>> noticed any other events correlated with the agent stopping collecting
>>>> events? Maybe a spike in events or something like that? And for my own
>>>> peace of mind, if you run `cat /var/log/flume-ng/* | grep "Exception"`,
>>>> does it bring anything back?
>>>>
>>>> Thanks!
>>>> Natty
>>>>
>>>>
>>>> On Tue, Jul 15, 2014 at 2:55 AM, SaravanaKumar TR <
>>>> saran0081986@gmail.com> wrote:
>>>>
>>>>> Hi Natty,
>>>>>
>>>>> This is my entire config file.
>>>>>
>>>>> # Name the components on this agent
>>>>> a1.sources = r1
>>>>> a1.sinks = k1
>>>>> a1.channels = c1
>>>>>
>>>>> # Describe/configure the source
>>>>> a1.sources.r1.type = exec
>>>>> a1.sources.r1.command = tail -F /data/logs/test_log
>>>>> a1.sources.r1.restart = true
>>>>> a1.sources.r1.logStdErr = true
>>>>>
>>>>> #a1.sources.r1.batchSize = 2
>>>>>
>>>>> a1.sources.r1.interceptors = i1
>>>>> a1.sources.r1.interceptors.i1.type = regex_filter
>>>>> a1.sources.r1.interceptors.i1.regex = resuming normal operations|Received|Response
>>>>>
>>>>> #a1.sources.r1.interceptors = i2
>>>>> #a1.sources.r1.interceptors.i2.type = timestamp
>>>>> #a1.sources.r1.interceptors.i2.preserveExisting = true
>>>>>
>>>>> # Describe the sink
>>>>> a1.sinks.k1.type = hdfs
>>>>> a1.sinks.k1.hdfs.path = hdfs://testing.sck.com:9000/running/test.sck/date=%Y-%m-%d
>>>>> a1.sinks.k1.hdfs.writeFormat = Text
>>>>> a1.sinks.k1.hdfs.fileType = DataStream
>>>>> a1.sinks.k1.hdfs.filePrefix = events-
>>>>> a1.sinks.k1.hdfs.rollInterval = 600
>>>>> ## need to run hive queries randomly to check the long-running process,
>>>>> ## so we need to commit events to HDFS files regularly
>>>>> a1.sinks.k1.hdfs.rollCount = 0
>>>>> a1.sinks.k1.hdfs.batchSize = 10
>>>>> a1.sinks.k1.hdfs.rollSize = 0
>>>>> a1.sinks.k1.hdfs.useLocalTimeStamp = true
>>>>>
>>>>> # Use a channel which buffers events in memory
>>>>> a1.channels.c1.type = memory
>>>>> a1.channels.c1.capacity = 10000
>>>>> a1.channels.c1.transactionCapacity = 10000
>>>>>
>>>>> # Bind the source and sink to the channel
>>>>> a1.sources.r1.channels = c1
>>>>> a1.sinks.k1.channel = c1
>>>>>
>>>>>
>>>>> On 14 July 2014 22:54, Jonathan Natkins <natty@streamsets.com> wrote:
>>>>>
>>>>>> Hi Saravana,
>>>>>>
>>>>>> What does your sink configuration look like?
>>>>>>
>>>>>> Thanks,
>>>>>> Natty
>>>>>>
>>>>>>
>>>>>> On Fri, Jul 11, 2014 at 11:05 PM, SaravanaKumar TR <
>>>>>> saran0081986@gmail.com> wrote:
>>>>>>
>>>>>>> Assuming each line in the logfile is considered an event for Flume:
>>>>>>>
>>>>>>> 1. Is there a maximum event size defined for the memory/file
>>>>>>> channel, such as a maximum number of characters in a line?
>>>>>>> 2. Does Flume support all data formats to be processed as events,
>>>>>>> or is there any limitation?
>>>>>>>
>>>>>>> I am still trying to understand why Flume stops
>>>>>>> processing events after some time.
>>>>>>>
>>>>>>> Can someone please help me out here.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> saravana
>>>>>>>
>>>>>>>
>>>>>>> On 11 July 2014 17:49, SaravanaKumar TR <saran0081986@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi ,
>>>>>>>>
>>>>>>>> I am new to Flume and am using Apache Flume 1.5.0. A quick explanation
>>>>>>>> of the setup follows.
>>>>>>>>
>>>>>>>> Source: exec, tail -F command for a logfile.
>>>>>>>>
>>>>>>>> Channel: tried with both memory and file channels
>>>>>>>>
>>>>>>>> Sink: HDFS
>>>>>>>>
>>>>>>>> When Flume starts, events are processed properly and
>>>>>>>> moved to HDFS without any issues.
>>>>>>>>
>>>>>>>> But after some time Flume suddenly stops sending events to HDFS.
>>>>>>>>
>>>>>>>> I am not seeing any errors in the logfile flume.log either. Please let
>>>>>>>> me know if I am missing any configuration here.
>>>>>>>>
>>>>>>>> Below is the channel configuration I defined; I left the remaining
>>>>>>>> settings at their default values.
>>>>>>>>
>>>>>>>>
>>>>>>>> a1.channels.c1.type = FILE
>>>>>>>>
>>>>>>>> a1.channels.c1.transactionCapacity = 100000
>>>>>>>>
>>>>>>>> a1.channels.c1.capacity = 10000000
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Saravana
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
