A HeaderAndText serializer was recently added to Flume Core and is available in the latest Git snapshot (git clone). You will need to follow the directions in the developer documentation to compile that download. That documentation is a little incomplete: following it as written leads to memory issues when compiling Flume, so the command below is the best way to build it (run from the Flume directory):

export JAVA_HOME=<your java home> ; export MAVEN_OPTS="-Xmx512M -XX:MaxPermSize=512M" ; mvn package -DskipTests

This will produce two .tar.gz files in flume/target; unpack and use the binary one. You can then run the latest Flume, which includes the HeaderAndText serializer. It writes out (you guessed it) the headers and text of a log message to HDFS when you set the serializer property like so:

...serializer = HEADER_AND_TEXT

Each event is then written to HDFS as a line of the following form:

{header1=value1,header2=value2,...headerN=valueN} <log message>
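
To make that line format concrete, here is a minimal standalone sketch that builds the same kind of line from a header map and a message. It deliberately does not use any Flume classes; the class name HeaderAndTextLine, the format method, and the sample header values are made up for illustration, and the real serializer may differ in details such as spacing between headers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustration only: not Flume code, just the line layout described above.
public class HeaderAndTextLine {

    // Renders headers as {k1=v1,k2=v2}, then a space, then the message text.
    public static String format(Map<String, String> headers, String body) {
        StringJoiner joined = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, String> header : headers.entrySet()) {
            joined.add(header.getKey() + "=" + header.getValue());
        }
        return joined + " " + body;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the output is predictable.
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("timestamp", "1357896780000"); // sample value, assumed
        headers.put("host", "web01");              // sample value, assumed
        System.out.println(format(headers, "GET /index.html 200"));
    }
}
```

With the sample values above, this prints {timestamp=1357896780000,host=web01} GET /index.html 200.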

If you want the output in a different format, you will need to create your own serializer. This basically involves copying the existing serializer source from flume-ng-core/src/main/java/org/apache/flume/serialization/, changing the package name, and modifying the process(Event e) method. Then compile it into a jar, add that jar to Flume's classpath (either through conf/ or by placing the jar in flume/lib), and set your serializer property like so:

...serializer = <FQCN (fully qualified class name: package+ClassName) of your new class>$Builder
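
As a rough idea of the kind of formatting logic you might put into the modified method, here is a standalone sketch that writes a readable timestamp in front of the message. It is not a Flume serializer and uses no Flume classes: the class name TimestampedLine, the tab-separated layout, and the assumption that the event carries a "timestamp" header holding epoch milliseconds are all hypothetical, so adapt them to whatever headers your events actually have.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Illustration only: formatting logic you could adapt inside your own serializer.
public class TimestampedLine {

    // Hypothetical layout: ISO-8601 timestamp, a tab, then the message body.
    // Assumes a "timestamp" header in epoch milliseconds; writes "-" if it is absent.
    public static String format(Map<String, String> headers, byte[] body) {
        String millis = headers.get("timestamp");
        String when = (millis == null)
                ? "-"
                : Instant.ofEpochMilli(Long.parseLong(millis)).toString();
        return when + "\t" + new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put("timestamp", "1357896780000"); // sample value, assumed
        byte[] body = "GET /index.html 200".getBytes(StandardCharsets.UTF_8);
        System.out.println(format(headers, body));
    }
}
```

In a real serializer the headers and body would come from the event passed to the method, and the resulting bytes would go to the output stream the serializer was given.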

I think you can implement your own custom sink, in which you write the event body and headers (if any) to HDFS...

On Fri, Jan 11, 2013 at 2:53 PM, Chhaya Vishwakarma <> wrote:



How can I write a custom serializer that writes both the event body and the headers to HDFS? Right now only the log messages are written to HDFS; the timestamp and other information are missing.




