flume-user mailing list archives

From Brock Noland <br...@cloudera.com>
Subject Re: post-processing
Date Fri, 21 Dec 2012 20:46:34 GMT
I wouldn't modify the files while Flume is also writing to them. It
might work, but it might also be a complete mess. If you need to modify
the events before they are written, interceptors are the correct solution.
Once a file is done from Flume's perspective, modify all you wish!
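
For the "after the file is done" case, here is a rough Python sketch of one
way to do the sed-style pass only on files Flume has presumably finished
with. Everything in it is an assumption for illustration, not Flume
behaviour: the "skip the newest file and anything modified in the last few
minutes" heuristic, the TAG value, and the function names are all
hypothetical. Each file is rewritten through a temporary file and swapped in
with os.replace, so the log scanner never sees a half-written file.

```python
import os
import shutil
import time
from pathlib import Path

TAG = "site=HQ"  # hypothetical field to insert; replace with your own data


def finished_files(log_dir, min_idle_seconds=300, now=None):
    """Return files that are probably no longer being written.

    Heuristic (an assumption, not a Flume guarantee): the most recently
    modified file may still be live, so it is always skipped, and the rest
    must have been idle for at least min_idle_seconds.
    """
    now = time.time() if now is None else now
    files = sorted(
        (p for p in Path(log_dir).iterdir()
         if p.is_file() and not p.name.endswith(".tmp")),
        key=lambda p: p.stat().st_mtime,
    )
    candidates = files[:-1]  # drop the newest file
    return [p for p in candidates
            if now - p.stat().st_mtime >= min_idle_seconds]


def add_tag(line, tag=TAG):
    """Prepend tag to a line unless the line is blank or already has it."""
    body = line.rstrip("\n")
    if not body or tag in body:
        return line
    return f"{tag} {body}\n"


def rewrite_file(path, tag=TAG):
    """Rewrite path via a temp file in the same directory, then swap it in."""
    tmp = path.with_name(path.name + ".tmp")
    with path.open("r", encoding="utf-8") as src, \
            tmp.open("w", encoding="utf-8") as dst:
        for line in src:
            dst.write(add_tag(line, tag))
    # Keep the original timestamps so the "newest file" check still works
    # on the next run.
    shutil.copystat(path, tmp)
    os.replace(tmp, path)


def process_directory(log_dir, min_idle_seconds=300):
    for path in finished_files(log_dir, min_idle_seconds):
        rewrite_file(path)
```

Calling process_directory on the file_roll output directory from cron,
shortly before the log scanner runs, would leave the live file alone. Because
add_tag skips lines that already carry the tag, running it twice on the same
file is harmless.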

On Fri, Dec 21, 2012 at 2:26 PM, Cochran, David <david.cochran@bsee.gov> wrote:
> just had a thought... before I turn this script up and make a mess of things
> I figured I'd ask the group...
> I'm running Flume 1.3 using FILE_ROLL at the sink.... the 'live in
> use' files are being periodically scanned for key events while still 'live'
> and being appended to by Flume... no problems there as they are just being
> read....
> now the interesting part, I also need to do a little processing of the
> stored logs (using sed) to insert a couple pieces of data into each line (if
> it doesn't already exist) before my log scanner process does its thing.
> I'm not sure what the odds are of this NOT totally hosing the Flume
> process/data... maybe it recognizes the file is in use and waits? The
> files are processed by sed pretty quickly (~15 secs) as they are rotated
> daily.
> Has anyone else tried this yet or have any insight as to how Flume might
> react before I attempt to make bit soup?
> Thanks,
> -Dave

Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
