flume-user mailing list archives

From zaenal rifai <togatta.f...@gmail.com>
Subject Re: Flume Topology
Date Fri, 27 Nov 2015 03:43:36 GMT
Why not use an Avro channel, Gonzalo?

On 26 November 2015 at 20:12, Gonzalo Herreros <gherreros@gmail.com> wrote:

> You cannot have multiple processes writing concurrently to the same HDFS
> file.
> What you can do is build a topology where many agents forward to a single
> agent that writes to HDFS, but you need a channel that allows that one
> HDFS writer to lag behind without slowing the sources.
> A Kafka channel might be a good choice.
>
> Regards,
> Gonzalo
>
> On 26 November 2015 at 11:57, yogendra reddy <yogendra.60@gmail.com>
> wrote:
>
>> Hi All,
>>
>> Here's my current Flume setup for collecting service logs on a Hadoop cluster:
>>
>> - Run a Flume agent on each of the nodes
>> - Configure the Flume sink to write to HDFS; the files end up like this:
>>
>> ..flume/events/node0logfile
>> ..flume/events/node1logfile
>>
>> ..flume/events/nodeNlogfile
>>
>> But I want to write all the logs from the multiple agents to a
>> single file in HDFS. How can I achieve this, and what would the topology
>> look like?
>> Can this be done via a collector? If yes, where can I run the collector,
>> and how will this scale for a 1000+ node cluster?
>>
>> Thanks,
>> Yogendra
>>
>
>
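The tiered topology Gonzalo describes above could be sketched roughly as the following two agent configurations: each node runs a leaf agent whose Avro sink forwards to an aggregator agent, which buffers events in a Kafka channel and writes a single consolidated stream to HDFS. This is only an illustrative sketch, not a tested config; the hostnames, ports, paths, and agent names are placeholders, and the exact Kafka channel property names vary between Flume versions.

```properties
# --- Leaf agent (one per node, illustrative) -------------------------
# Tails a service log and forwards events over Avro RPC to the
# aggregator tier.
leaf.sources = logsrc
leaf.channels = mem
leaf.sinks = fwd

leaf.sources.logsrc.type = exec
leaf.sources.logsrc.command = tail -F /var/log/service.log
leaf.sources.logsrc.channels = mem

leaf.channels.mem.type = memory
leaf.channels.mem.capacity = 10000

leaf.sinks.fwd.type = avro
leaf.sinks.fwd.hostname = aggregator.example.com
leaf.sinks.fwd.port = 4141
leaf.sinks.fwd.channel = mem

# --- Aggregator agent (illustrative) ---------------------------------
# Receives Avro events from all leaf agents, buffers them in a Kafka
# channel so the single HDFS writer can lag behind without slowing the
# sources, and writes one consolidated stream to HDFS.
agg.sources = avrosrc
agg.channels = kafkach
agg.sinks = hdfssink

agg.sources.avrosrc.type = avro
agg.sources.avrosrc.bind = 0.0.0.0
agg.sources.avrosrc.port = 4141
agg.sources.avrosrc.channels = kafkach

agg.channels.kafkach.type = org.apache.flume.channel.kafka.KafkaChannel
# Property names for the Kafka channel differ across Flume releases;
# check the user guide for your version.
agg.channels.kafkach.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
agg.channels.kafkach.kafka.topic = flume-channel

agg.sinks.hdfssink.type = hdfs
agg.sinks.hdfssink.hdfs.path = /flume/events
agg.sinks.hdfssink.hdfs.fileType = DataStream
agg.sinks.hdfssink.channel = kafkach
```

For 1000+ nodes you would typically run several aggregator agents behind a load-balancing sink processor on the leaf tier rather than a single collector; the Kafka channel then absorbs bursts while the HDFS sinks drain at their own pace. Note this still produces one file per HDFS sink, not one file cluster-wide, since HDFS does not allow concurrent writers to a single file.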
