flume-user mailing list archives

From Jeff Lord <jl...@cloudera.com>
Subject Re: preserve syslog header in hdfs sink
Date Fri, 28 Mar 2014 19:37:28 GMT
Do you have the appropriate interceptors configured?
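For the headers you want, something like the following should work (a minimal sketch using the agent/source/sink names from your config below; `timestamp` and `host` are Flume's built-in interceptor aliases, and they set `timestamp` and `host` headers on each event — exact behavior on the HDP2 build of Flume 1.4 not verified):

---
# Attach timestamp and host headers to every event from the syslog source
hadoop-t1.sources.syslog1.interceptors = ts host
hadoop-t1.sources.syslog1.interceptors.ts.type = timestamp
hadoop-t1.sources.syslog1.interceptors.host.type = host
# Use the hostname rather than the IP in the host header
hadoop-t1.sources.syslog1.interceptors.host.useIP = false

# Optional: write plain text files instead of the default SequenceFile
# (the SEQ!org.apache.hadoop.io... prefix in your output is SequenceFile framing)
hadoop-t1.sinks.hdfs1.hdfs.fileType = DataStream
---

Without the interceptors, the serializer has no `timestamp` or `host` headers to write, which would explain what you're seeing.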


On Fri, Mar 28, 2014 at 12:28 PM, Ryan Suarez <ryan.suarez@sheridancollege.ca> wrote:

> RTFM indicates I need the following sink properties:
>
> ---
> hadoop-t1.sinks.hdfs1.serializer = org.apache.flume.serialization.HeaderAndBodyTextEventSerializer
> hadoop-t1.sinks.hdfs1.serializer.columns = timestamp hostname msg
> hadoop-t1.sinks.hdfs1.serializer.format = CSV
> hadoop-t1.sinks.hdfs1.serializer.appendNewline = true
> ---
>
> But I'm still not getting timestamp information.  How would I get hostname
> and timestamp information in the logs?
>
>
> On 14-03-26 3:02 PM, Ryan Suarez wrote:
>
>> Greetings,
>>
>> I'm running the Flume that ships with Hortonworks HDP2 to feed syslog to
>> HDFS.  The problem is that the timestamp and hostname of the event are not
>> logged to HDFS.
>>
>> ---
>> flume@hadoop-t1:~$ hadoop fs -cat /opt/logs/hadoop-t1/2014-03-26/FlumeData.1395859766307
>> SEQ!org.apache.hadoop.io.LongWritable"org.apache.
>> hadoop.io.BytesWritable??Ak?i<??G??`D??$hTsu[22209]:
>> pam_unix(su:session): session opened for user root by someuser(uid=11111)
>> ---
>>
>> How do I configure the sink to add hostname and timestamp info to the
>> event?
>>
>> Here's my flume-conf.properties:
>>
>> ---
>> flume@hadoop-t1:/etc/flume/conf$ cat flume-conf.properties
>> # Name the components on this agent
>> hadoop-t1.sources = syslog1
>> hadoop-t1.sinks = hdfs1
>> hadoop-t1.channels = mem1
>>
>> # Describe/configure the source
>> hadoop-t1.sources.syslog1.type = syslogtcp
>> hadoop-t1.sources.syslog1.host = localhost
>> hadoop-t1.sources.syslog1.port = 10005
>> hadoop-t1.sources.syslog1.portHeader = port
>>
>> ##HDFS Sink
>> hadoop-t1.sinks.hdfs1.type = hdfs
>> hadoop-t1.sinks.hdfs1.hdfs.path = hdfs://hadoop-t1.mydomain.org:8020/opt/logs/%{host}/%Y-%m-%d
>> hadoop-t1.sinks.hdfs1.hdfs.batchSize = 1
>>
>> # Use a channel which buffers events in memory
>> hadoop-t1.channels.mem1.type = memory
>> hadoop-t1.channels.mem1.capacity = 1000
>> hadoop-t1.channels.mem1.transactionCapacity = 100
>>
>> # Bind the source and sink to the channel
>> hadoop-t1.sources.syslog1.channels = mem1
>> hadoop-t1.sinks.hdfs1.channel = mem1
>> ---
>>
>> ---
>> flume@hadoop-t1:~$ flume-ng version
>> Flume 1.4.0.2.0.11.0-1
>> Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
>> Revision: fcdc3d29a1f249bef653b10b149aea2bc5df892e
>> Compiled by jenkins on Wed Mar 12 05:11:30 PDT 2014
>> From source with checksum dea9ae30ce2c27486ae7c76ab7aba020
>> ---
>>
>
>
