flume-user mailing list archives

From Sadananda Hegde <saduhe...@gmail.com>
Subject Re: picking up new files in Flume NG
Date Tue, 16 Oct 2012 17:00:02 GMT
Yes, it is very similar.

The spool directory will keep getting new files. We need to scan the
directory, send the data in the existing files to HDFS, clean up the
files (delete/move/rename, etc.), and then scan for new files again. The
spooldir source is not available yet, right?
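
[Editor's note: the JIRA referenced below, FLUME-1425, tracks the Spooling
Directory Source that shipped with Flume 1.3.0. The workflow described above
maps onto that source directly. A minimal sketch of an agent config, assuming
Flume >= 1.3.0; the agent name, directory, and HDFS path are placeholders:

    # Hypothetical agent named "agent1"; adjust names and paths to your setup.
    agent1.sources = spool-src
    agent1.channels = ch1
    agent1.sinks = hdfs-sink

    # Spooling Directory Source: watches a directory for new, immutable files.
    # Processed files are renamed with a .COMPLETED suffix by default.
    agent1.sources.spool-src.type = spooldir
    agent1.sources.spool-src.spoolDir = /data/incoming
    agent1.sources.spool-src.channels = ch1

    # File channel for durability across agent restarts.
    agent1.channels.ch1.type = file

    # HDFS sink: writes events to the target cluster.
    agent1.sinks.hdfs-sink.type = hdfs
    agent1.sinks.hdfs-sink.hdfs.path = hdfs://namenode/flume/events/%Y-%m-%d
    agent1.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
    agent1.sinks.hdfs-sink.channel = ch1

Note the source expects files to be complete and immutable once they land in
the spool directory; files that are still being written to should be staged
elsewhere and moved in atomically.]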

Thanks,
Sadu

On Tue, Oct 16, 2012 at 10:11 AM, Brock Noland <brock@cloudera.com> wrote:

> Sounds like https://issues.apache.org/jira/browse/FLUME-1425 ?
>
> Brock
>
> On Mon, Oct 15, 2012 at 11:37 PM, Sadananda Hegde <saduhegde@gmail.com>
> wrote:
> > Hello,
> >
> > I have a scenario wherein the client application is continuously pushing
> > XML messages. Actually, the application is writing these messages to
> > files (new files, same directory), so we will keep getting new files
> > throughout the day. I am trying to configure Flume agents on these
> > application servers (4 of them) to pick up the new data and transfer it
> > to HDFS on a Hadoop cluster. How should I configure my source to pick up
> > new files (and exclude the files that have been processed already)? I
> > don't think Exec source with tail -F will work in this scenario, because
> > data is not getting added to existing files; rather, new files get
> > created.
> >
> > Thank you very much for your time and support.
> >
> > Sadu
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce -
> http://incubator.apache.org/mrunit/
>
