flume-user mailing list archives

From Lin Ma <lin...@gmail.com>
Subject Re: beginner's question -- file source configuration
Date Mon, 09 Mar 2015 01:24:58 GMT
Thanks Ashish,

I followed your guidance and found the instructions below, and I have a
follow-up question to confirm with you. It seems a file must be closed and
never touched again for Flume to process it correctly. So would it be good
practice to (1) let the application keep writing its log files the existing
way, rotating hourly or every 5 minutes, and then (2) close each finished
file and move it to another directory, which the Flume agent reads as its
Spooling Directory source?

"This source will watch the specified directory for new files, and will
parse events out of new files as they appear."


   1. If a file is written to after being placed into the spooling
   directory, Flume will print an error to its log file and stop processing.
   2. If a file name is reused at a later time, Flume will print an error
   to its log file and stop processing.
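
For step (2), here is a minimal Python sketch of the hand-off that tries to
respect both constraints above. It is only an illustration under stated
assumptions: the function name hand_off_to_spool and the timestamp naming
scheme are made up for this example, the application has already closed the
file, the source and spooling directories are on the same filesystem (so the
rename is atomic), and Flume is using its default ".COMPLETED" suffix for
processed files.

```python
import os
import time
from pathlib import Path


def hand_off_to_spool(closed_file, spool_dir):
    """Move a finished log file into the spooling directory under a new name.

    Hypothetical helper, not part of Flume. Returns the destination path.
    """
    src = Path(closed_file)
    spool = Path(spool_dir)
    spool.mkdir(parents=True, exist_ok=True)

    # Constraint 2: never reuse a file name. Append a UTC timestamp, and a
    # counter if that name was already used within the same second.
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime())
    candidate = f"{src.name}.{stamp}"
    counter = 0
    # Also check the name Flume gives a file after processing it
    # (assumes the default fileSuffix of ".COMPLETED").
    while (spool / candidate).exists() or (spool / (candidate + ".COMPLETED")).exists():
        counter += 1
        candidate = f"{src.name}.{stamp}.{counter}"
    dest = spool / candidate

    # Constraint 1: the file must not change once it is in the directory.
    # A rename within one filesystem is atomic, so Flume sees the file only
    # in its final, complete form. The caller must stop writing to it first.
    os.rename(src, dest)
    return dest
```

A cron job or the log rotation hook could call this for each rotated file,
for example hand_off_to_spool("/var/log/myapp/app.log.1", "/var/spool/flume"),
where both paths are placeholders.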



On Sun, Mar 8, 2015 at 12:23 AM, Ashish <paliwalashish@gmail.com> wrote:

> Please look at the following:
> Spooling Directory Source
> (http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source)
> and
> HDFS Sink (http://flume.apache.org/FlumeUserGuide.html#hdfs-sink)
> Spooling Directory Source needs immutable files, meaning files must not
> be written to once Flume starts consuming them. In short, your
> application cannot write to a file that Flume is reading.
> The log format is not an issue, as long as you don't need Flume
> components to interpret it. Since it's a log, I am assuming one log
> entry per line, with a line separator at the end of each line.
> You can also look at Exec source
> (http://flume.apache.org/FlumeUserGuide.html#exec-source) for tailing
> a file that the application is still writing. The documentation at
> these links covers the details.
> HTH !
> On Sun, Mar 8, 2015 at 12:32 PM, Lin Ma <linlma@gmail.com> wrote:
> > Hi Flume masters,
> >
> > I want to install Flume on a box, consume a local log file as the
> > source, and send it to a remote HDFS sink. The log format is private
> > plain text (not Avro or JSON).
> >
> > I am reading the Flume guide, which covers many advanced Source
> > configurations. Are there any reference samples for a plain local log
> > file source? Also, can Flume consume the local file while the
> > application is still writing to it? Thanks.
> >
> > regards,
> > Lin
> --
> thanks
> ashish
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
