flume-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Programmatically write files into HDFS with Flume
Date Tue, 30 Apr 2013 18:00:44 GMT
I am looking for options for Java programs that can write files into HDFS,
with the following requirements.

1) Transaction support: each file is either written fully and successfully,
or fails completely, leaving no partial file blocks behind.

2) Compression support/file formats: the compression type or file format
can be specified when writing content.
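For requirement 1, one common approach is to write to a temporary file and then rename it into place, since a rename is atomic (HDFS's FileSystem.rename is atomic at the namenode). A minimal sketch under that assumption, using a hypothetical AtomicGzipWriter helper; it runs against the local filesystem with java.util.zip for illustration, but the same pattern maps onto Hadoop's FileSystem.create plus FileSystem.rename:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper illustrating the write-then-rename pattern.
public class AtomicGzipWriter {
    public static void write(Path target, byte[] content) throws IOException {
        // Write to a temp file in the same directory, so the final
        // rename stays within one filesystem and remains atomic.
        Path tmp = Files.createTempFile(
                target.getParent(), target.getFileName().toString(), ".tmp");
        try (OutputStream out =
                new GZIPOutputStream(Files.newOutputStream(tmp))) {
            out.write(content); // compressed as it is written
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // leave no partial file behind
            throw e;
        }
        // Atomic move: readers see either no file or the complete file.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

Swapping in another OutputStream wrapper (or a Hadoop CompressionCodec) at the same point covers requirement 2 without changing the transactional structure.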

I know how to write data into a file on HDFS by opening an FSDataOutputStream,
as shown here<http://stackoverflow.com/questions/13457934/writing-to-a-file-in-hdfs-in-hadoop>.
I am just wondering whether there are libraries or out-of-the-box solutions
that provide the support mentioned above.

I stumbled upon Flume, which provides an HDFS sink that supports
transactions, compression, file rotation, etc. But it doesn't seem to
expose an API that can be used as a library. The features Flume provides
are tightly coupled to its architectural components (sources, channels,
and sinks) and don't seem to be usable independently. All I need is the
HDFS loading part.

Does anyone have some good suggestions?

Chen Song
