flume-user mailing list archives

From Bob Metelsky <bob.metel...@gmail.com>
Subject Re: Simple- Just copying plain files into the cluster (hdfs) using flume - possible?
Date Tue, 03 Feb 2015 02:10:56 GMT
Jeff - very cool... Installed it, looks great. I'll have to play with it. I'm
afraid this may not be mature enough to use in the enterprise yet. Possibly
it can handle my requirement; maybe I'm wrong. I'll have to play around.

Thanks

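For later readers: the Flume-only approach discussed in this thread is a two-hop pipeline, a spooling-directory source plus Avro sink on the outside server feeding an Avro source plus HDFS sink on a cluster node. A minimal sketch; agent names, hosts, ports, and paths below are illustrative, not taken from the thread:

```properties
# --- server1 (outside the cluster): watch the drop directory, forward over Avro RPC ---
a1.sources = src1
a1.channels = ch1
a1.sinks = snk1

# Spooling-directory source: picks up completed files placed in spoolDir
a1.sources.src1.type = spooldir
a1.sources.src1.spoolDir = /data/xml-out
a1.sources.src1.channels = ch1

a1.channels.ch1.type = file

# Avro sink: ships events to the agent on the cluster side
a1.sinks.snk1.type = avro
a1.sinks.snk1.hostname = server2.example.com
a1.sinks.snk1.port = 4141
a1.sinks.snk1.channel = ch1

# --- server2 (in the cluster): receive Avro events, write into HDFS ---
a2.sources = src2
a2.channels = ch2
a2.sinks = snk2

a2.sources.src2.type = avro
a2.sources.src2.bind = 0.0.0.0
a2.sources.src2.port = 4141
a2.sources.src2.channels = ch2

a2.channels.ch2.type = file

# DataStream fileType writes the raw event bodies, so text/XML lands as plain text
a2.sinks.snk2.type = hdfs
a2.sinks.snk2.hdfs.path = /flume/xml/%Y-%m-%d
a2.sinks.snk2.hdfs.fileType = DataStream
a2.sinks.snk2.hdfs.useLocalTimeStamp = true
a2.sinks.snk2.channel = ch2
```

Note that Avro here is only the wire transport between the two agents; it does not change the file contents. The caveat, as the thread points out, is that Flume is event-oriented: a file is split into line events and re-rolled on the HDFS side, not preserved byte-for-byte as a single file.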

On Mon, Feb 2, 2015 at 7:42 PM, Jeff Lord <jlord@cloudera.com> wrote:

> Bob,
>
> You may want to have a look at Apache Nifi.
>
> http://ingest.tips/2014/12/22/getting-started-with-apache-nifi/
>
> Regards,
>
> Jeff
>
> On Mon, Feb 2, 2015 at 3:49 PM, Bob Metelsky <bob.metelsky@gmail.com>
> wrote:
>
>> Steve - I appreciate your time on this...
>>
>> Yes, I want to use flume to copy .xml or .whatever files from a server
>> outside the cluster to hdfs. That server does have flume installed on it.
>>
>> I'd like the same behavior as "spooling directory" but from a remote
>> machine --> to hdfs.
>>
>> So, from all my reading, flume looks like it's designed entirely for
>> streaming "live" logs and program outputs...
>>
>> It doesn't seem to be known for being a file watcher that grabs files as
>> they show up, then ships and writes them to hdfs.
>>
>> Or can it?
>>
>> OK, I can see fragmentation being an issue with individual "small" files,
>> but doesn't the "spooling directory" behaviour face the same issue?
>>
>> I've done quite a bit of reading but one can easily get into the weeds :)
>> - All I need to do is this simple task.
>>
>> Thanks
>>
>>
>>
>> On Mon, Feb 2, 2015 at 5:17 PM, Steve Morin <steve.morin@gmail.com>
>> wrote:
>>
>>> So you want 1 to 1 replication of the logs to HDFS?
>>>
>>> As a footnote, people usually don't do this because the log files are
>>> often too small (think fragmentation), which causes performance problems
>>> on Hadoop.
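
One common mitigation for the small-files concern raised above is to let the HDFS sink batch many incoming events into fewer, larger output files via its roll settings. A sketch; the sink name and the values are illustrative, not recommendations from the thread:

```properties
# Roll HDFS output files by time/size rather than one file per input file.
# Time-based roll: close and start a new file every 300 seconds (0 disables)
agent.sinks.hdfsSink.hdfs.rollInterval = 300
# Size-based roll: ~128 MB, roughly one HDFS block (0 disables)
agent.sinks.hdfsSink.hdfs.rollSize = 134217728
# Event-count-based roll: 0 disables it
agent.sinks.hdfsSink.hdfs.rollCount = 0
```

With settings like these, many small source files are concatenated into a few block-sized HDFS files, which sidesteps the NameNode and MapReduce overhead of large numbers of tiny files.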
>>>
>>> On Feb 2, 2015, at 13:30, Bob Metelsky <bob.metelsky@gmail.com> wrote:
>>>
>>> Hi, I have a simple requirement:
>>>
>>> on server1 (NOT in the cluster, but has flume installed)
>>> I have a process that constantly generates xml files in a known directory
>>>
>>> I need to transfer them to server2 (IN the hadoop cluster)
>>> and into hdfs as xml files
>>>
>>> From what I'm reading, Avro, Thrift RPC, et al. are designed for other
>>> uses.
>>>
>>> Is there a way to have flume just copy over plain files? txt, xml...
>>> I'm thinking there should be, but I can't find it.
>>>
>>> The closest I see is the "spooling directory" source, but that seems to
>>> assume the files are already inside the cluster.
>>>
>>> Can flume do this? Is there an example? I've read the flume documentation
>>> and nothing is jumping out.
>>>
>>> Thanks!
>>>
>>>
>>
>
