flume-user mailing list archives

From Paul Chavez <pcha...@verticalsearchworks.com>
Subject RE: Flume to stream logs live
Date Fri, 14 Dec 2012 16:24:18 GMT
I have set up a Windows Flume flow using LogParser and the AvroClient app bundled with Flume.

It's a Powershell script, scheduled every 5 minutes, that runs a checkpointed query via LogParser
to create incremental files for IIS logs and a couple of our other app logs. The incremental
files are then sent to a Flume node running AvroSource. From there it's a typical Flume setup:
the log types are split based on a header that I append when sending via the AvroClient, and
the events are then sent to collector nodes that sink to HDFS.
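
For illustration only, the checkpointing idea can be sketched in Python. This is not the
Powershell/LogParser script described above; the function name, the JSON checkpoint file and
the byte-offset approach are all assumptions made for the example.

```python
import json
import os


def read_increment(log_path, checkpoint_path):
    """Return the bytes appended to log_path since the last call and save the new offset."""
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    # If the log was rotated or truncated, start again from the beginning.
    if os.path.getsize(log_path) < offset:
        offset = 0

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Keep complete lines only; a partial last line is picked up on the next run.
    end = data.rfind(b"\n") + 1
    data = data[:end]

    with open(checkpoint_path, "w") as f:
        json.dump({"offset": offset + len(data)}, f)
    return data
```

Each scheduled run would write the returned bytes to an incremental file and hand that file to
the sender.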

It's currently a best-effort architecture, as I don't trap any errors from the AvroClient on
the Windows side. I did extend the AvroClient to return exit codes
(see https://issues.apache.org/jira/browse/FLUME-1670), but I'm not using them yet. Even so,
I've been sending about 15GB of IIS logs per day per server without issues.
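
As a rough sketch of what trapping those exit codes could look like, here it is in Python rather
than Powershell. The sender command is a hypothetical placeholder passed in by the caller, and
the sketch assumes it exits with 0 on success and non-zero on failure.

```python
import subprocess


def send_file(command, path):
    """Run the sender command on one file; return True only if it exits with code 0."""
    result = subprocess.run(command + [path])
    return result.returncode == 0


def send_all(command, paths):
    """Try each file once and return the ones that failed so they can be retried later."""
    failed = []
    for path in paths:
        if not send_file(command, path):
            failed.append(path)
    return failed
```

Files returned by `send_all` would be left in place and picked up again on the next scheduled
run instead of being deleted.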

It's not the best solution but it works for now. Longer term we are thinking of a custom app
on our side that leverages the HTTPSource or, if we get ambitious, implementing Avro RPC
in .NET, but that's a backburner project right now.
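
A minimal sketch of such a client, assuming the HTTPSource is running its default JSON handler,
which accepts a JSON array of events with "headers" and "body" fields. The "logtype" header
name and the URL passed to `post_events` are hypothetical.

```python
import json
import urllib.request


def build_payload(lines, log_type):
    """Encode log lines as a JSON array of events, one event per line."""
    events = [{"headers": {"logtype": log_type}, "body": line} for line in lines]
    return json.dumps(events).encode("utf-8")


def post_events(url, lines, log_type):
    """POST one batch to the HTTP Source and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_payload(lines, log_type),
        headers={"Content-Type": "application/json; charset=utf-8"},
    )
    # urlopen raises urllib.error.HTTPError on a 4xx/5xx response.
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Because a rejected batch raises an exception, the caller can keep the batch and retry it, which
is the error handling the current AvroClient flow lacks.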

Also, I'm bucketing the events based on a timestamp interceptor, which has caused post-processing
pain because the event timestamps are off by ~5 minutes from the header timestamp. I'm looking
forward to using the regex capture interceptor to timestamp the events with the event time soon.
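
To show the idea outside of Flume, here is a small Python sketch that pulls the event time out
of a log line and converts it to epoch milliseconds, the unit used by Flume's timestamp header.
It assumes each line starts with "YYYY-MM-DD HH:MM:SS" in UTC, which depends on how the IIS
log fields are configured; the function name is made up for the example.

```python
import re
from datetime import datetime, timezone

# Assumption: the line begins with a date and time in UTC.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def event_time_millis(line):
    """Return the line's own timestamp as epoch milliseconds, or None if it has none."""
    match = TIMESTAMP.match(line)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)
```

Lines without a leading timestamp, such as the "#Fields:" comment lines in IIS logs, return
None and would need a fallback such as the send time.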

Paul Chavez

-----Original Message-----
From: Brock Noland [mailto:brock@cloudera.com] 
Sent: Friday, December 14, 2012 6:52 AM
To: user@flume.apache.org
Subject: Re: Flume to stream logs live


FWIW, if I were sending log data from Windows, I would write a little Windows log agent and send
the data to the HTTP Source.


On Fri, Dec 14, 2012 at 8:47 AM, Kartashov, Andy <Andy.Kartashov@mpac.ca> wrote:
> Flummers,
> Loved working with Flume 1.2 - very easy and simple configuration, it
> was a pleasure to work with. Managed to "tail -F" logs from a Unix
> server into an HDFS cluster. The problem started when I also needed
> to push logs from a Windows application server. Spent three days
> researching how to install Flume on Windows and run a daemon/agent
> that will push the logs to the Avro source I successfully configured
> and ran on Unix. No luck. So I am looking at alternatives. Is there
> another framework available out there to help me with my issue? What about Scribe?
> Andy Kartashov
> IT Architecture, Co-op
> 1340 Pickering Parkway, Pickering, L1V 0C4
> Phone: (905) 837 6269
> Mobile: (416) 722 1787
> andy.kartashov@mpac.ca

Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
