flume-user mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: picking up new files in Flume NG
Date Tue, 16 Oct 2012 17:47:42 GMT
Correct, it's only available in that patch. From the RB (review board) it looks
like it's not too far off from being committed.

Brock
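
For reference, once the FLUME-1425 patch is committed, a configuration along
these lines should cover the scenario described below. This is only a sketch:
the agent, channel, and sink names are placeholders, the paths are illustrative,
and the `spooldir` property names are taken from the patch under review, so they
could change before the source is actually committed.

```properties
# Hypothetical agent config sketch -- property names follow the
# FLUME-1425 patch and may change before the source is committed.
agent1.sources = spool-src
agent1.channels = mem-ch
agent1.sinks = hdfs-sink

# Spooling directory source: watches a directory for new files.
# Files must be complete (immutable) when dropped into the directory.
agent1.sources.spool-src.type = spooldir
agent1.sources.spool-src.spoolDir = /var/log/app/spool
agent1.sources.spool-src.channels = mem-ch

# Memory channel buffering events between source and sink
agent1.channels.mem-ch.type = memory
agent1.channels.mem-ch.capacity = 10000

# HDFS sink writing to the Hadoop cluster
agent1.sinks.hdfs-sink.type = hdfs
agent1.sinks.hdfs-sink.hdfs.path = hdfs://namenode/flume/events/%Y-%m-%d
agent1.sinks.hdfs-sink.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink.channel = mem-ch
```

Per the patch, the source marks files it has finished ingesting (e.g. by
renaming them), so already-processed files are not picked up again.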

On Tue, Oct 16, 2012 at 12:00 PM, Sadananda Hegde <saduhegde@gmail.com> wrote:
> Yes, it is very similar.
>
> The spool directory will keep getting new files. We need to scan through the
> directory, send the data in the existing files to HDFS, clean up the files
> (delete/move/rename, etc.), and scan for new files again. The Spooldir
> source is not available yet, right?
>
> Thanks,
> Sadu
>
>
> On Tue, Oct 16, 2012 at 10:11 AM, Brock Noland <brock@cloudera.com> wrote:
>>
>> Sounds like https://issues.apache.org/jira/browse/FLUME-1425 ?
>>
>> Brock
>>
>> On Mon, Oct 15, 2012 at 11:37 PM, Sadananda Hegde <saduhegde@gmail.com>
>> wrote:
>> > Hello,
>> >
>> > I have a scenario wherein the client application is continuously
>> > pushing XML messages. The application writes these messages to new
>> > files in the same directory, so we keep getting new files throughout
>> > the day. I am trying to configure Flume agents on these application
>> > servers (4 of them) to pick up the new data and transfer it to HDFS
>> > on a Hadoop cluster. How should I configure my source to pick up new
>> > files (and exclude the files that have already been processed)? I
>> > don't think an Exec source with tail -F will work in this scenario,
>> > because data is not being appended to existing files; rather, new
>> > files get created.
>> >
>> > Thank you very much for your time and support.
>> >
>> > Sadu
>>
>>
>>
>> --
>> Apache MRUnit - Unit testing MapReduce -
>> http://incubator.apache.org/mrunit/
>
>



-- 
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
