flume-user mailing list archives

From Steve Morin <steve.mo...@gmail.com>
Subject Re: Simple- Just copying plain files into the cluster (hdfs) using flume - possible?
Date Mon, 02 Feb 2015 22:17:52 GMT
So you want 1-to-1 replication of the files into HDFS?

As a footnote, people usually don't do this because the individual files are often too small (think of the HDFS small-files problem), which causes performance problems when the data is later processed with Hadoop.
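If you do want to go this route, the usual pattern is a two-hop setup: a spooling-directory source on server1 feeding an Avro sink, and an Avro source on server2 feeding an HDFS sink. The Avro RPC here is only Flume's wire transport; the event bodies still land as plain text in HDFS. A minimal sketch (agent names a1/a2, host name, port, and paths are all placeholders you would replace):

```properties
# --- server1 (outside the cluster): watch a directory, ship to server2 ---
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/xmldrop        # directory your process writes into
a1.sources.r1.channels = c1

a1.sinks.k1.type = avro
a1.sinks.k1.hostname = server2               # the in-cluster Flume host
a1.sinks.k1.port = 4141
a1.sinks.k1.channel = c1

a1.channels.c1.type = file                   # durable buffering between hops

# --- server2 (inside the cluster): receive and write to HDFS ---
a2.sources = r1
a2.channels = c1
a2.sinks = k1

a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4141
a2.sources.r1.channels = c1

a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = /flume/xml           # target HDFS directory
a2.sinks.k1.hdfs.fileType = DataStream       # write plain text, not SequenceFile
a2.sinks.k1.hdfs.rollCount = 0               # don't roll by event count
a2.sinks.k1.hdfs.rollInterval = 0            # don't roll by time
a2.sinks.k1.hdfs.rollSize = 134217728        # roll at ~128 MB to avoid small files
a2.sinks.k1.channel = c1

a2.channels.c1.type = file
```

Two caveats: the spooldir source's default deserializer is line-oriented, so each XML file is split into one event per line rather than kept as a single blob, and files must be closed (immutable) before they land in the spool directory. The rollSize/rollCount/rollInterval settings above batch many small inputs into fewer, larger HDFS files, which is the standard mitigation for the fragmentation issue mentioned earlier.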

> On Feb 2, 2015, at 13:30, Bob Metelsky <bob.metelsky@gmail.com> wrote:
> Hi, I have a simple requirement.
> On server1 (NOT in the cluster, but with Flume installed)
> I have a process that constantly generates XML files in a known directory.
> I need to transfer them to server2 (IN the Hadoop cluster)
> and into HDFS as XML files.
> From what I'm reading, Avro, Thrift RPC, et al. are designed for other uses.
> Is there a way to have Flume just copy over plain files? txt, xml...
> I'm thinking there should be, but I can't find it.
> The closest I see is the "spooling directory" source, but that seems to assume the files are already inside the cluster.
> Can Flume do this? Is there an example? I've read the Flume documentation and nothing is jumping out.
> Thanks!
