flume-user mailing list archives

From Josh Myers <josh.my...@mydrivesolutions.com>
Subject RE: Flume events rolling file too regularly
Date Mon, 17 Jun 2013 17:15:06 GMT
Thanks for that, Paul. We have rollInterval set to 21600 and have
experienced the same problem when setting rollSize 0 and rollCount 0 in
order to disable the size-based and count-based rolling defaults. Any other
ideas? We want one csv file per day that is continuously flushed to.
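
For reference, a minimal sketch of the HDFS sink roll settings described
above. The agent name (agent1), sink name (s3sink) and bucket path are
hypothetical placeholders, not our actual config. rollInterval is in
seconds, so 21600 is 6 hours and 86400 would be 24 hours; a value of 0
disables the corresponding roll trigger.

```properties
# Hypothetical agent and sink names, for illustration only.
agent1.sinks.s3sink.type = hdfs
# Placeholder path; the real bucket and prefix are not shown in this thread.
agent1.sinks.s3sink.hdfs.path = s3n://example-bucket/events

# Time-based roll, in seconds (default 30). 21600 = 6 hours, 86400 = 24 hours.
agent1.sinks.s3sink.hdfs.rollInterval = 21600
# Size-based roll, in bytes (default 1024). 0 disables it.
agent1.sinks.s3sink.hdfs.rollSize = 0
# Count-based roll, in events (default 10). 0 disables it.
agent1.sinks.s3sink.hdfs.rollCount = 0
```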

Thanks
Josh
On 17 Jun 2013 17:16, "Paul Chavez" <pchavez@verticalsearchworks.com> wrote:

> There are three file roll settings on the HDFS sink: rollInterval,
> rollSize and rollCount. Their defaults are 30s, 1024B and 10 events,
> respectively. You need to set each one as desired or disable it.
>
> My mail client chopped up your config, but I did a search and didn't see
> any of those properties set. I would start there.
>
> Hope that helps,
> Paul Chavez
>
> -----Original Message-----
> From: Josh Myers [mailto:josh.myers@mydrivesolutions.com]
> Sent: Monday, June 17, 2013 6:47 AM
> To: user@flume.apache.org
> Subject: Flume events rolling file too regularly
>
> Hi guys,
>
> We are sending JSON events from our pipeline into a flume http source.
> We have written a custom multiplexer and sink serializer. The events are
> being routed into the correct channels and consumed OK by the sinks. The
> custom serializer takes a JSON event and outputs a csv. Files are being
> written to s3 (using s3n as hdfs), but rather than appending to the written
> csv file, each event seems to be generating its own csv. The output is what
> I would expect using rollCount 1, however we do occasionally get several
> events ( maybe 4 ) written per csv. Please see below for config.
>
> Ideally we want to use rollInterval of 24 hours, to generate a new .csv
> file every 24 hours, but have events pretty quickly flushed to the csv file
> after being sent. So one csv per day that is consistently appended with
> whatever events we throw in. We found however that with a rollInterval of
> 24 hours the events weren't being flushed often enough...
>
> Any help would be hugely appreciated!
>
> Thanks.
>
>
> Josh
>

