flume-user mailing list archives

From Gary Malouf <malouf.g...@gmail.com>
Subject Re: Writing to HDFS from multiple HDFS agents (separate machines)
Date Fri, 15 Mar 2013 02:30:34 GMT
Paul, I interpreted the host property as identifying the host an event
originates from, rather than the host of the sink that writes the event to
HDFS. Is my understanding correct?

What happens if I am using the NettyAvroRpcClient to feed events round-robin
from a different server to two HDFS-writing agents? Should I then NOT set the
host property on the client side and instead rely on the interceptor?
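For reference, the agent-side host interceptor can stamp each event with the
hostname of the machine the interceptor runs on, which the HDFS sink can then
use in its path. A minimal sketch, assuming an agent named a1 with a source r1
and sink k1 (all names and the namenode URL are placeholders):

```properties
# Host interceptor: adds the agent host to each event's headers
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
a1.sources.r1.interceptors.i1.useIP = false
a1.sources.r1.interceptors.i1.hostHeader = host

# HDFS sink: segregate output by the interceptor-supplied header
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events/%{host}
```

Note that with preserveExisting left at its default (false), the interceptor
overwrites any host header set earlier in the pipeline, so events would be
tagged with the writing agent's host rather than the originating client's.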

On Thu, Mar 14, 2013 at 6:34 PM, Gary Malouf <malouf.gary@gmail.com> wrote:

> To be clear, I am referring to the segregating of data from different
> flume sinks as opposed to the original source of the event.  Having said
> that, it sounds like your approach is the easiest.
> -Gary
> On Thu, Mar 14, 2013 at 5:54 PM, Gary Malouf <malouf.gary@gmail.com> wrote:
>> Hi guys,
>> I'm new to Flume (and HDFS, for that matter), using the version packaged
>> with CDH4 (1.3.0), and was wondering how others maintain distinct file
>> names per HDFS sink.
>> My initial thought is to create a separate sub-directory in HDFS for each
>> sink, though I suspect the better way is to somehow prefix each file with
>> a unique sink id.  Are there any patterns that others are following for
>> this?
>> -Gary
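Either approach from the quoted message can be expressed directly in the sink
configuration. A hedged sketch, assuming two agents each with one HDFS sink
(agent, sink, and path names are illustrative):

```properties
# Option 1: separate sub-directory per sink
agent1.sinks.k1.type = hdfs
agent1.sinks.k1.hdfs.path = hdfs://namenode/flume/events/agent1

# Option 2: shared directory, unique file prefix per sink
# (hdfs.filePrefix defaults to "FlumeData")
agent2.sinks.k1.type = hdfs
agent2.sinks.k1.hdfs.path = hdfs://namenode/flume/events
agent2.sinks.k1.hdfs.filePrefix = agent2-sink1
```

The unique prefix avoids filename collisions while keeping all events for a
dataset under one directory, which tends to be friendlier to downstream jobs
that take a single input path.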
