flume-user mailing list archives

From "Bhaskar V. Karambelkar" <bhaska...@gmail.com>
Subject Re: How to transfer csv files, one-to-one
Date Tue, 28 Jul 2015 16:41:04 GMT
If you need to copy files as-is, Flume is not the best option for you. You
are better off using other means, such as Hadoop commands, or looking into
mounting HDFS as an NFS share.
Flume is best suited to use cases where you want to aggregate events; it is
not really a file transfer utility.
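
For illustration, here is a minimal sketch of the "copy the file as-is"
approach, assuming the destination is reachable as an ordinary directory
(for example, HDFS mounted over NFS). The function name
copy_files_one_to_one and the "*.csv" pattern are hypothetical, chosen only
for this example. With plain Hadoop commands, the rough equivalent would be
an `hdfs dfs -put` of each file.

```python
import shutil
from pathlib import Path


def copy_files_one_to_one(source_dir, dest_dir, pattern="*.csv"):
    """Copy each matching file from source_dir into dest_dir, keeping its name.

    Hypothetical helper: dest_dir is assumed to be a normal (or mounted)
    directory, not an hdfs:// URI.
    """
    src = Path(source_dir)
    dst = Path(dest_dir)
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            # copy2 copies the bytes unchanged and keeps timestamps where it can,
            # so each source file maps to exactly one destination file.
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

Calling copy_files_one_to_one("source", "sink") would mirror the directory
names used in the quoted configuration. Unlike the spooldir/file_roll
pipeline, nothing is split into events, so file names and contents arrive
unchanged.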

thanks
B.

On Tue, Jul 28, 2015 at 6:55 AM, Goran Simic <gorsimic@gmail.com> wrote:

> The closest I have got is that every line of data from the source (which
> consists of multiple files) is written to just one file on the sink side.
> I use "spooldir" on the source side and "file_roll" on the sink side. How
> can I improve this to copy files one-to-one, with the same file name?
>
> This is my setup:
>
> COMP1:
> # Name the components on this agent
> a1.sources = r1
> a1.sinks = k1
> a1.channels = c1
>
> # Describe/configure the source
> a1.sources.r1.type = spooldir
> a1.sources.r1.spoolDir = source
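> # BTW this should add the filename to the destination side, but it is not
> # working. (Comment kept on its own line: in a properties file, a trailing
> # "#..." becomes part of the value rather than a comment.)
> a1.sources.r1.fileHeader = true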
>
> # Describe the sink
> a1.sinks.k1.type = avro
> a1.sinks.k1.hostname = [ip_address]
> a1.sinks.k1.port = 9000
>
> # Use a channel which buffers events in memory
> a1.channels.c1.type = memory
> a1.channels.c1.capacity = 1000
> a1.channels.c1.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a1.sources.r1.channels = c1
> a1.sinks.k1.channel = c1
>
>
> COMP2:
> # Name the components on this agent
> a2.sources = r2
> a2.sinks = k2
> a2.channels = c2
>
> # Describe/configure the source
> a2.sources.r2.type = avro
> a2.sources.r2.bind = 0.0.0.0
> a2.sources.r2.port = 9000
>
> # Describe the sink
> a2.sinks.k2.type = file_roll
> a2.sinks.k2.sink.directory = sink
>
> # Use a channel which buffers events in memory
> a2.channels.c2.type = memory
> a2.channels.c2.capacity = 1000
> a2.channels.c2.transactionCapacity = 100
>
> # Bind the source and sink to the channel
> a2.sources.r2.channels = c2
> a2.sinks.k2.channel = c2
>
> If I understand correctly, there is no out-of-the-box solution for my
> case, so should this be done with interceptors, or should I implement
> another sink? Or something else?
>
> Thanks for your help
>
