flume-user mailing list archives

From Laurance George <laurance.w.geo...@gmail.com>
Subject Re: Import files from a directory on remote machine
Date Thu, 17 Apr 2014 00:51:05 GMT
Agreed with Jeff.  Rsync + cron (if it needs to be regular) is probably
your best bet to ingest files from a remote machine that you only have read
access to.  But then again you're sorta stepping outside of the use case of
flume at some level here as rsync is now basically a part of your flume
topology.  However, if you just need to back-fill old log data then this is
perfect!  In fact, it's what I do myself.
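
For illustration, here is a rough sketch of the agent side of that setup,
assuming rsync has already copied the files into a local directory the agent
can write to.  The agent and source names (a1, r1) follow the original
question; the channel c1, the sink k1, and every path below are hypothetical
placeholders, not values from this thread.  One caveat from the user guide:
files must not change after they land in spoolDir, so the copy job would
likely need to stage files elsewhere and move them in once they are complete.

```properties
# Minimal sketch, not a tested configuration.
# c1, k1, and all paths are assumptions made for illustration.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Spooling directory source: reads files dropped into a local directory
# and renames each one when ingest completes, so the agent needs write
# access to this directory.
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/flume/spool
a1.sources.r1.channels = c1

# In-memory channel, chosen only to keep the sketch short.
a1.channels.c1.type = memory

# HDFS sink; the target path is a placeholder.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/logs
a1.sinks.k1.channel = c1
```

The cron job that runs rsync sits outside this file, which is the sense in
which rsync becomes part of the topology.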


On Wed, Apr 16, 2014 at 8:46 PM, Jeff Lord <jlord@cloudera.com> wrote:

> The spooling directory source runs as part of the agent.
> The source also needs write access to the files as it renames them upon
> completion of ingest. Perhaps you could use rsync to copy the files
> somewhere that you have write access to?
>
>
> On Wed, Apr 16, 2014 at 5:26 PM, Something Something <
> mailinglists19@gmail.com> wrote:
>
>> Thanks Jeff.  This is useful.  Can the spoolDir be on a different
>> machine?  We may have to set up a different process to copy files into
>> 'spoolDir', right?  Note:  We have 'read only' access to these files.  Any
>> recommendations about this?
>>
>>
>> On Wed, Apr 16, 2014 at 5:16 PM, Jeff Lord <jlord@cloudera.com> wrote:
>>
>>> http://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
>>>
>>>
>>> On Wed, Apr 16, 2014 at 5:14 PM, Something Something <
>>> mailinglists19@gmail.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> Needless to say I am a newbie to Flume, but I've got a basic flow working
>>>> in which I am importing a log file from my linux box to hdfs.  I am using
>>>>
>>>> a1.sources.r1.command = tail -F /var/log/xyz.log
>>>>
>>>> which is working like a stream of messages.  This is good!
>>>>
>>>> Now what I want to do is copy log files from a directory on a remote
>>>> machine on a regular basis.  For example:
>>>>
>>>> username@machinename:/var/log/logdir/<multiple files>
>>>>
>>>> One way to do it is to simply 'scp' files from the remote directory
>>>> into my box on a regular basis, but what's the best way to do this in
>>>> Flume?  Please let me know.
>>>>
>>>> Thanks for the help.
>>>>
>>>>
>>>>
>>>
>>
>


-- 
Laurance George
