flume-user mailing list archives

From Ryan Suarez <ryan.sua...@sheridancollege.ca>
Subject preserve syslog header in hdfs sink
Date Wed, 26 Mar 2014 19:02:44 GMT
Greetings,

I'm running the Flume that ships with Hortonworks HDP2 to feed syslog 
events into HDFS.  The problem is that the timestamp and hostname of 
each event are not logged to HDFS.

---
flume@hadoop-t1:~$ hadoop fs -cat 
/opt/logs/hadoop-t1/2014-03-26/FlumeData.1395859766307
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable??Ak?i<??G??`D??$hTsu[22209]:

pam_unix(su:session): session opened for user root by someuser(uid=11111)
---

How do I configure the sink to add the hostname and timestamp info to 
the event?

Here's my flume-conf.properties:

---
flume@hadoop-t1:/etc/flume/conf$ cat flume-conf.properties
# Name the components on this agent
hadoop-t1.sources = syslog1
hadoop-t1.sinks = hdfs1
hadoop-t1.channels = mem1

# Describe/configure the source
hadoop-t1.sources.syslog1.type = syslogtcp
hadoop-t1.sources.syslog1.host = localhost
hadoop-t1.sources.syslog1.port = 10005
hadoop-t1.sources.syslog1.portHeader = port

##HDFS Sink
hadoop-t1.sinks.hdfs1.type = hdfs
hadoop-t1.sinks.hdfs1.hdfs.path = hdfs://hadoop-t1.mydomain.org:8020/opt/logs/%{host}/%Y-%m-%d
hadoop-t1.sinks.hdfs1.hdfs.batchSize = 1

# Use a channel which buffers events in memory
hadoop-t1.channels.mem1.type = memory
hadoop-t1.channels.mem1.capacity = 1000
hadoop-t1.channels.mem1.transactionCapacity = 100

# Bind the source and sink to the channel
hadoop-t1.sources.syslog1.channels = mem1
hadoop-t1.sinks.hdfs1.channel = mem1
---
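For what it's worth, two knobs from the Flume user guide look relevant, 
though I have not verified either against this HDP build (keepFields on 
the syslog sources may only exist in releases newer than 1.4, and the 
serializer names are taken from the upstream docs, not tested here):

```properties
# Option 1 (unverified): tell the syslog source to keep the original
# timestamp/hostname prefix in the event body instead of stripping it
hadoop-t1.sources.syslog1.keepFields = true

# Option 2 (unverified): write plain text with the Flume event headers
# prepended, instead of the default body-only SequenceFile output
hadoop-t1.sinks.hdfs1.hdfs.fileType = DataStream
hadoop-t1.sinks.hdfs1.serializer = header_and_text
```

Has anyone used either of these with the HDP2 build of Flume?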

---
flume@hadoop-t1:~$ flume-ng version
Flume 1.4.0.2.0.11.0-1
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: fcdc3d29a1f249bef653b10b149aea2bc5df892e
Compiled by jenkins on Wed Mar 12 05:11:30 PDT 2014
From source with checksum dea9ae30ce2c27486ae7c76ab7aba020
---
