flume-user mailing list archives

From Ahmed Vila <av...@devlogic.eu>
Subject Re: Need suggestion on reliable source for log processing
Date Mon, 27 Oct 2014 11:21:11 GMT
Hi,

You can use the spillable memory channel, which stores events in memory and
spills to disk once memory fills up.
You can also use the file channel, but it is only as fast as your disk, and
because of its high I/O it is best given a separate disk, preferably an SSD.
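
As a rough sketch, the two channel options might look like this in the agent's
properties file. The agent name a1, the channel name c1, the capacities and
every path are placeholders chosen for illustration, not values from this
thread; check the property names against the user guide for your Flume version.

```properties
# Hypothetical agent "a1" with one channel "c1".
# All names, sizes and paths are illustrative assumptions.
a1.channels = c1

# Option 1: spillable memory channel (memory first, overflow to disk)
a1.channels.c1.type = SPILLABLEMEMORY
# Maximum number of events kept in the in-memory queue
a1.channels.c1.memoryCapacity = 10000
# Maximum number of events kept in the on-disk overflow
a1.channels.c1.overflowCapacity = 1000000
a1.channels.c1.checkpointDir = /mnt/flume/checkpoint
a1.channels.c1.dataDirs = /mnt/flume/data

# Option 2: file channel, ideally on its own (SSD) disk.
# Use these lines instead of the Option 1 block above.
# a1.channels.c1.type = file
# a1.channels.c1.checkpointDir = /mnt/ssd/flume/checkpoint
# a1.channels.c1.dataDirs = /mnt/ssd/flume/data
```

Either way, the channel only protects events that Flume has already accepted.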

But that will not solve the issue you might run into: if Flume fails for
whatever reason, you will not be able to continue from the exact point
where it failed.
Yes, the file channel preserves its state, so it will continue with whatever
it had already received, but what about the data written while it was down?

If you cannot change anything about the application that produces the
logs, then that limitation has to be accepted as a trade-off.


On Mon, Oct 27, 2014 at 12:09 PM, SaravanaKumar TR <saran0081986@gmail.com>
wrote:

> Yes, I understand the concerns with this use case.
>
> If so, we need to configure failover in this scenario. Can we have it at
> the channel level or the sink level?
>
> Does Flume support configuring failover in case the channel fills up?
>
>
>
> On Mon, Oct 27, 2014 at 3:54 PM, Ahmed Vila <avila@devlogic.eu> wrote:
>
>> Hi,
>>
>> In fact, this is not a problem with Flume.
>>
>> No solution will function reliably for your use case, simply because all
>> of them have to do some sort of tail -f or streaming on a file, and if
>> they can't keep up with it (they mostly can't at high-speed entry points),
>> they will drop some entries.
>> Please be kind to yourself and plan for failures: if you need to
>> restart Flume or any other solution, you'll face dropped entries that
>> you won't be able to re-ingest easily, as in most cases you won't know
>> which ones you've dropped.
>>
>>
>> Regards,
>> Ahmed
>>
>> On Mon, Oct 27, 2014 at 11:13 AM, SaravanaKumar TR <
>> saran0081986@gmail.com> wrote:
>>
>>> Thanks for the comments, Ahmed.
>>>
>>> So from your comments, I take it that Flume doesn't have any reliable
>>> source option for the use case I described.
>>>
>>> If Flume can't provide it, can you suggest any other log collector
>>> solutions that I could consider for moving real-time data to HDFS?
>>>
>>>
>>>
>>> On Mon, Oct 27, 2014 at 3:37 PM, Ahmed Vila <avila@devlogic.eu> wrote:
>>>
>>>> Hi,
>>>>
>>>> Then you're out of luck in my opinion, as there is no way other than
>>>> tail -f.
>>>> The problem with tail -f is that tail will not wait for the source/channel
>>>> to keep up with it. If the channel is full, it will back off to the source,
>>>> and then the source will just stop ingesting.
>>>>
>>>> It is possible to hack around this by sending tail -f into another file
>>>> and then custom-rotating that duplicate file.
>>>> But I wouldn't recommend that.
>>>>
>>>> Just a side note: if you're running a Java application (Tomcat or
>>>> similar), you can create multiple output files via log4j.properties
>>>> configuration without the application itself knowing anything about it.
>>>>
>>>> Regards,
>>>> Ahmed
>>>>
>>>>
>>>> On Mon, Oct 27, 2014 at 10:56 AM, SaravanaKumar TR <
>>>> saran0081986@gmail.com> wrote:
>>>>
>>>>> Ahmed,
>>>>>
>>>>> In my case, the application renames the existing file to
>>>>> <logfile>.yesterdaydate and creates a new file named <logfile>
>>>>> at midnight (00:00).
>>>>>
>>>>> I can't change the application's log rotation policy for now, so I
>>>>> guess I should rule out the spooling directory source in my case.
>>>>>
>>>>> Can you suggest any options other than the spooling directory
>>>>> source?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> On Mon, Oct 27, 2014 at 3:10 PM, Ahmed Vila <avila@devlogic.eu> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> It all depends on how log rotation is done and how the application
>>>>>> producing the log file handles it.
>>>>>> Most applications just reopen the log file when they receive a
>>>>>> signal. For example, nginx reopens its log file when it receives the
>>>>>> USR1 signal, without stopping the process. Some applications might
>>>>>> restart as a result.
>>>>>>
>>>>>> If the application just reopens the log file, you can change
>>>>>> your log rotation policy to be per minute.
>>>>>> The logrotate daemon won't cover that case, so you'll have to write
>>>>>> a cron job to do it.
>>>>>> In that setup, keep the location of finished logs separate from the
>>>>>> live log location, so the spooling directory source doesn't freak out
>>>>>> about an active log file still being appended to.
>>>>>>
>>>>>> Anyway, the spooling directory source is the way to go, as it leaves
>>>>>> log files in place, just renamed.
>>>>>>
>>>>>> Regards,
>>>>>> Ahmed
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 27, 2014 at 10:21 AM, SaravanaKumar TR <
>>>>>> saran0081986@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am using Apache Flume 1.5.0. A quick explanation of my setup:
>>>>>>>
>>>>>>> Source: exec, running tail -F on a log file
>>>>>>>
>>>>>>> Channel: file channel
>>>>>>>
>>>>>>> Sink: HDFS
>>>>>>>
>>>>>>> Use case: move real-time data from the log file to HDFS.
>>>>>>>
>>>>>>>
>>>>>>> It appears that exec is not a reliable source, as we may lose data
>>>>>>> if the channel/source is down.
>>>>>>>
>>>>>>>
>>>>>>> So I tried the other option, the "spooling directory source", which is
>>>>>>> described as a reliable source. But I have a single log file that
>>>>>>> data keeps getting appended to, so I don't see a way to move the file
>>>>>>> to the spool directory.
>>>>>>>
>>>>>>>
>>>>>>> Can anyone suggest another reliable source option for the case where
>>>>>>> data is appended to the log file and rotation happens only at the
>>>>>>> end of the day?
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Saravana
>>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> This e-mail and any attachment is for authorised use by the intended
>>>>>> recipient(s) only. This email contains confidential information. It
>>>>>> should not be copied, disclosed to, retained or used by, any party
>>>>>> other than the intended recipient. Any unauthorised distribution,
>>>>>> dissemination or copying of this E-mail or its attachments, and/or
>>>>>> any use of any information contained in them, is strictly prohibited
>>>>>> and may be illegal. If you are not an intended recipient then please
>>>>>> promptly delete this e-mail and any attachment and all copies and
>>>>>> inform the sender directly via email. Any emails that you send to us
>>>>>> may be monitored by systems or persons other than the named
>>>>>> communicant for the purposes of ascertaining whether the
>>>>>> communication complies with the law and company policies.
>>>>>
>>>>>
>>>>
>>>
>>
>
>


-- 

Best regards,
Ahmed Vila | Senior software developer
DevLogic | Sarajevo | Bosnia and Herzegovina

Office : +387 33 942 123
Mobile: +387 62 139 348

Website: www.devlogic.eu
E-mail   : avila@devlogic.eu