flume-user mailing list archives

From Christopher Shannon <cshannon...@gmail.com>
Subject Re: flume and hadoop append
Date Wed, 09 Apr 2014 04:00:06 GMT
Not sure exactly what you are trying to do, but the HDFS sink already appends
to its current file; you just have to decide on a roll-over strategy. Instead
of rolling every few minutes, you can set hdfs.rollInterval=0 (which disables
time-based rolling) and set hdfs.rollSize to however large you want a file to
grow before the sink rolls over and starts appending to a new one. You can
also use hdfs.rollCount to roll after a certain number of records. I use
rollSize as my roll-over strategy.
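As an illustration, a sink section like the following would roll purely on file size (the agent/sink names and the 128 MB threshold are just placeholders for your own setup):

```properties
# Hypothetical agent "a1" with an HDFS sink "k1".
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/events

# Disable time-based and count-based rolling...
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollCount = 0

# ...and roll only when the file reaches ~128 MB (value is in bytes).
a1.sinks.k1.hdfs.rollSize = 134217728
```

Note that any roll trigger left at a nonzero value still applies, so if you want size-only rolling you need to explicitly zero out the other two.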


On Tue, Apr 8, 2014 at 8:35 PM, Pritchard, Charles X. -ND <
Charles.X.Pritchard.-ND@disney.com> wrote:

> Exploring the idea of using "append" instead of creating new files with
> HDFS every few minutes.
> Are there particular design decisions / considerations?
>
> There's certainly some history around append in HDFS; in particular,
> earlier versions of Hadoop warned strongly against using file append
> semantics.
>
>
> -Charles
