flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: Flume Topology
Date Thu, 26 Nov 2015 13:12:27 GMT
You cannot have multiple processes writing concurrently to the same HDFS
file.
What you can do is build a topology where many agents forward to a single
agent that writes to HDFS, but you need a channel that lets the lone HDFS
writer lag behind without slowing down the sources.
A Kafka channel might be a good choice.
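For illustration, here is a minimal sketch of what the two agent
configurations could look like. All names (node-agent, collector,
collector-host, the Kafka brokers and topic, ports and paths) are
placeholders, and the Kafka channel property names assume Flume 1.7+:

# --- per-node agent: tails the service log and forwards over Avro ---
node-agent.sources = taillog
node-agent.channels = mem
node-agent.sinks = to-collector

node-agent.sources.taillog.type = exec
node-agent.sources.taillog.command = tail -F /var/log/service.log
node-agent.sources.taillog.channels = mem

node-agent.channels.mem.type = memory
node-agent.channels.mem.capacity = 10000

node-agent.sinks.to-collector.type = avro
node-agent.sinks.to-collector.hostname = collector-host
node-agent.sinks.to-collector.port = 4545
node-agent.sinks.to-collector.channel = mem

# --- collector agent: single HDFS writer behind a Kafka channel ---
collector.sources = from-agents
collector.channels = kafka-ch
collector.sinks = hdfs-out

collector.sources.from-agents.type = avro
collector.sources.from-agents.bind = 0.0.0.0
collector.sources.from-agents.port = 4545
collector.sources.from-agents.channels = kafka-ch

# the Kafka topic buffers events so the HDFS sink can drain at its own pace
collector.channels.kafka-ch.type = org.apache.flume.channel.kafka.KafkaChannel
collector.channels.kafka-ch.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
collector.channels.kafka-ch.kafka.topic = flume-logs

collector.sinks.hdfs-out.type = hdfs
collector.sinks.hdfs-out.hdfs.path = /flume/events
collector.sinks.hdfs-out.hdfs.fileType = DataStream
collector.sinks.hdfs-out.hdfs.rollInterval = 300
collector.sinks.hdfs-out.channel = kafka-ch

Because the Kafka channel persists events to a topic, the per-node agents
can keep pushing at full speed even when the single HDFS writer falls
behind. One caveat: an HDFS sink still rolls files periodically (by
interval, size, or event count), so "a single file" in practice means one
file per roll rather than one file kept open forever.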

Regards,
Gonzalo

On 26 November 2015 at 11:57, yogendra reddy <yogendra.60@gmail.com> wrote:

> Hi All,
>
> Here's my current Flume setup for a Hadoop cluster, used to collect service logs:
>
> - Run a Flume agent on each of the nodes
> - Configure the Flume sink to write to HDFS, so the files end up like this:
>
> ..flume/events/node0logfile
> ..flume/events/node1logfile
>
> ..flume/events/nodeNlogfile
>
> But I want to be able to write all the logs from multiple agents to a
> single file in HDFS. How can I achieve this, and what would the topology
> look like? Can this be done via a collector? If yes, where would I run the
> collector, and how will this scale for a 1000+ node cluster?
>
> Thanks,
> Yogendra
>
