flume-user mailing list archives

From Patrick Wendell <pwend...@gmail.com>
Subject Re: A customer use case / using spoolDir
Date Fri, 07 Dec 2012 00:50:35 GMT
To answer your other questions: The spooling source will pick up files
in the directory, send them with Flume, and rename them to indicate
that they have been transferred. Files that were already in the
directory before you started will be read and sent through Flume. It
treats these like any other files.
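
For reference, the rename is driven by the source's completion suffix. A minimal sketch of the relevant property (the fileSuffix name and its .COMPLETED default are assumptions based on the 1.3.0 spooling directory source, so check the docs):

  # assumed default: once a file is fully transferred it is renamed with this suffix
  agent1.sources.spooldir-1.fileSuffix = .COMPLETED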

- Patrick

On Wed, Dec 5, 2012 at 4:34 AM, Alexander Alten-Lorenz
<wget.null@gmail.com> wrote:
> Hi,
>
> as the error message says:
>> No Channels configured for spooldir-1
>
> add:
> agent1.sources.spooldir-1.channels = MemoryChannel-2
>
> When a file is dropped into the directory, the source should pick it up. Files that
> are already there will be processed too (if I'm not totally wrong).
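>
> For completeness, the source block with the channel wired in would look roughly like
> this (a sketch reusing the names from your config, untested):
>
> agent1.sources = spooldir-1
> agent1.sources.spooldir-1.type = spooldir
> agent1.sources.spooldir-1.spoolDir = /opt/apache2/logs/flumeSpool
> agent1.sources.spooldir-1.channels = MemoryChannel-2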
>
> - Alex
>
>
> On Dec 5, 2012, at 1:00 PM, Emile Kao <emilekao@gmx.net> wrote:
>
>> Hello,
>> Thank you for the hint to use the new spoolDir feature in the freshly released 1.3.0
>> version of Flume.
>>
>> Unfortunately I am not getting the expected result.
>> Here is my configuration:
>>
>> agent1.channels = MemoryChannel-2
>> agent1.channels.MemoryChannel-2.type = memory
>>
>> agent1.sources = spooldir-1
>> agent1.sources.spooldir-1.type = spooldir
>> agent1.sources.spooldir-1.spoolDir = /opt/apache2/logs/flumeSpool
>> agent1.sources.spooldir-1.fileHeader = true
>>
>> agent1.sinks = HDFS
>> agent1.sinks.HDFS.channel = MemoryChannel-2
>> agent1.sinks.HDFS.type = hdfs
>> agent1.sinks.HDFS.hdfs.fileType = DataStream
>> agent1.sinks.HDFS.hdfs.path = hdfs://localhost:9000
>> agent1.sinks.HDFS.hdfs.writeFormat = Text
>>
>>
>> Upon start I am getting the following warning:
>> 2012-12-05 11:05:19,216 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:571)]
>> Removed spooldir-1 due to No Channels configured for spooldir-1
>>
>> Question:
>>
>> 1) Is something wrong with the above config?
>>
>> 2) How are the files gathered from the spool directory? Every time I drop (copy,
>> etc...) a file into it?
>>
>> 3) What happens to the files that were already in the spool directory before I start
>> the Flume agent?
>>
>> I would appreciate any Help!
>>
>> Cheers,
>> Emile
>>
>>
>> -------- Original Message --------
>>> Date: Tue, 4 Dec 2012 06:48:46 -0800
>>> From: Mike Percy <mpercy@apache.org>
>>> To: user@flume.apache.org
>>> Subject: Re: A customer use case
>>
>>> Hi Emile,
>>>
>>> On Tue, Dec 4, 2012 at 2:04 AM, Emile Kao <emilekao@gmx.net> wrote:
>>>>
>>>> 1. Which is the best way to implement such a scenario using Flume/Hadoop?
>>>>
>>>
>>> You could use the file spooling client / source, available in the latest trunk and
>>> upcoming Flume 1.3.0 builds, to stream these files in, along with the HDFS sink.
>>>
>>>> 2. The customer would like to keep the log files in their original state
>>>> (file name, size, etc..). Is it practicable using Flume?
>>>>
>>>
>>> Not recommended. Flume is an event streaming system, not a file copying
>>> mechanism. If you want to do that, just use some scripts with hadoop fs
>>> -put instead of Flume. Flume provides a bunch of stream-oriented features
>>> on top of its event streaming architecture, such as data enrichment
>>> capabilities, event routing, and configurable file rolling on HDFS, to name a few.
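>>>
>>> If the goal really is byte-for-byte copies under the original file names, a plain
>>> copy is enough; for example (file and target paths are just placeholders):
>>>
>>> hadoop fs -put /opt/apache2/logs/access.log /logs/apache/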
>>>
>>> Regards,
>>> Mike
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>
