I'm using Flume 1.6 to collect nginx logs and sink them to Kafka. I want to attach the hostname of the nginx server to each log event when it is written to Kafka, so that I can analyse the web traffic of different hosts.
Here is my Flume configuration file:
a1.sources = r1
a1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /data/tmp/cs.log
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
a1.sources.r1.interceptors.i1.hostHeader = hostname
a1.sinks = s1
a1.sinks.s1.channel = c1
a1.sinks.s1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.s1.batchSize = 10
a1.sinks.s1.topic = testflume
a1.sinks.s1.key = test
a1.sinks.s1.requiredAcks = -1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
But it did not work: the key of the Kafka message was still null, and I could not find the hostname anywhere.
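One variant I may try next, as an untested sketch: as far as I understand the Flume 1.6 user guide, the Kafka sink takes the message key from an event header named key rather than from a sink property, and the host interceptor writes the IP address unless useIP is set to false. Under those assumptions, only the interceptor lines would change:

```properties
# Untested sketch; assumes the Kafka sink uses the event header "key" as the message key
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
# Write the hostname instead of the IP address (useIP defaults to true)
a1.sources.r1.interceptors.i1.useIP = false
# Store the value under the header name the Kafka sink is assumed to read
a1.sources.r1.interceptors.i1.hostHeader = key
```

If that assumption holds, the hostname would travel as the Kafka message key while the message body stays the raw log line, and the a1.sinks.s1.key line would presumably have no effect.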
If you know how to solve it, let me know.
Shen Zhun (Allen)
Data Mining at LightnInTheBox.com