flume-user mailing list archives

From "Brian Hart" <bbh...@bbhart.com>
Subject RE: Preserving syslog information
Date Wed, 04 Jul 2012 19:06:48 GMT
Thanks for the response. When I used logger to generate a 'This is a test.'
message (logger -p daemon.info This is a test), in my local syslog I see
"Jul  4 15:42:42 serverA hart_b: This is a test." but on my Central server
the entire message is "hart_b: This is a test.".  The date and host are
dropped for some reason.  This is the case on 1.1 and 1.3 (rev 1357365).
Recall this is a syslog source and avro sink on one server, feeding an avro
source and file_roll sink on the other.

If I send BIND messages through Flume 1.3, those do appear complete on both
sides, including the program name:  "named[1361]: 04-Jul-2012 15:41:39.083
queries: client query: www.cloudera.com IN AAAA +
(".  I think we're good there.

Now that I have some full messages flowing courtesy of BIND, I'm hitting
buffer size problems.  I'll start a separate thread for that.

Thanks again,

-----Original Message-----
From: Brent Halsey [mailto:mrbrent@gmail.com] 
Sent: Tuesday, July 03, 2012 1:44 PM
To: flume-user@incubator.apache.org
Subject: Re: Preserving syslog information

It's possible that you've run into FLUME-1277 "Error parsing Syslog rfc 3164
messages with null values".  Basically, the date is skipped, and null values
(hyphens) are interpreted as a null date.  Potential fixes are to use
FLUME-1277's patch, make sure you don't have hyphens in your syslog message,
or change the date format to rfc 5424 style.
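
To make the difference between the two timestamp styles concrete, here is a minimal sketch in Python (not Flume's Java, and not how Flume's parser is implemented). The function names are hypothetical. RFC 3164 timestamps look like "Jul  4 15:42:42" with no year, while RFC 5424 uses an ISO-style timestamp, with a lone hyphen standing for a nil value.

```python
from datetime import datetime, timezone


def parse_rfc3164_timestamp(text, year):
    """Parse an RFC 3164 timestamp such as 'Jul  4 15:42:42'.

    The format carries no year or time zone, so the caller supplies
    the year (an assumption of this sketch).
    """
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def parse_rfc5424_timestamp(text):
    """Parse an RFC 5424 timestamp such as '2012-07-04T15:42:42Z'.

    A lone hyphen is the RFC 5424 nil value and is returned as None.
    Fractional seconds are not handled in this sketch.
    """
    if text == "-":
        return None
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S%z")


if __name__ == "__main__":
    print(parse_rfc3164_timestamp("Jul  4 15:42:42", 2012))
    print(parse_rfc5424_timestamp("2012-07-04T15:42:42Z"))
    print(parse_rfc5424_timestamp("-"))
```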

The flume syslog parser doesn't extract syslog tags (program name), either.
We've just started patching SyslogUtils.java to pull this out.
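
As a rough illustration of what pulling out the tag involves, here is a minimal Python sketch that splits an RFC 3164-style line into timestamp, host, tag, optional pid, and message. The regex and the name parse_rfc3164 are hypothetical and are not taken from SyslogUtils.java or from our patch.

```python
import re

# Hypothetical pattern: optional <PRI>, timestamp, host, tag[pid]:, message.
RFC3164_LINE = re.compile(
    r"^(?:<(?P<pri>\d{1,3})>)?"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: ?"
    r"(?P<message>.*)$"
)


def parse_rfc3164(line):
    """Return a dict of syslog fields, or None if the line does not match."""
    match = RFC3164_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    if fields["pri"] is not None:
        # PRI = facility * 8 + severity
        fields["facility"], fields["severity"] = divmod(int(fields["pri"]), 8)
    return fields


if __name__ == "__main__":
    print(parse_rfc3164("<30>Jul  4 15:42:42 serverA hart_b: This is a test."))
```

With a line like the logger test above, the tag comes back as "hart_b" and the host as "serverA". A line with no timestamp or host, such as the bare "hart_b: This is a test." seen on the central server, does not match and returns None.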


On Mon, Jul 2, 2012 at 9:54 PM, Brian Hart <bbhart@bbhart.com> wrote:
> I'm working on a project where DNS & DHCP log data need to be 
> aggregated from 180+ servers spread around the WAN down to one (maybe 
> two) centralized servers.  From the central server(s), I'll need to 
> scp them to another company periodically throughout the day.  It's not 
> critical for each message to reach the central servers, but it'd be
> really nice if they did.
> I have some architecture questions, but my blocker right now is that
> my syslog messages are only coming across to the central server as
> "<sending user>: <log text>" (e.g. "hart_b: This is test 1") and I'm
> losing the other syslog info like date, hostname, and facility.
> I searched the mailing list and wiki, but I can't figure out how to 
> do this in 1.1.0-incubating.  Syslog on my test DHCP server points to 
> the IP for 'remote1', and you can see the rest in my conf file 
> (below).  I think I'm supposed to use the syslog serializer, but I'm
> not clear on how to do that.
> central.channels.ch1.type = memory
> central.sources.avro-source1.channels = ch1 
> central.sources.avro-source1.type = avro 
> central.sources.avro-source1.bind = 
> central.sources.avro-source1.port = 41414
> central.sinks.fileroll_sink1.channel = ch1 
> central.sinks.fileroll_sink1.type = file_roll 
> central.sinks.fileroll_sink1.sink.directory = /opt/logs_from_flume/ 
> central.sinks.fileroll_sink1.sink.rollInterval = 30
> central.channels = ch1
> central.sources = avro-source1
> central.sinks = fileroll_sink1
> # REMOTE NODE 1 - North America
> remote1.channels.ch1.type = memory
> remote1.sources.syslog-source1.channels = ch1 
> remote1.sources.syslog-source1.type = syslogudp 
> remote1.sources.syslog-source1.host = 
> remote1.sources.syslog-source1.port = 514
> remote1.sinks.avro-sink1.channel = ch1
> remote1.sinks.avro-sink1.type = avro
> remote1.sinks.avro-sink1.hostname = 
> remote1.sinks.avro-sink1.port = 41414
> remote1.sinks.avro-sink1.batch-size = 100
> remote1.channels = ch1
> remote1.sources = syslog-source1
> remote1.sinks = avro-sink1
> -=-=-
> Apologies for asking what might be a basic question, but how can I 
> preserve the syslog info so that it makes it into the rolling files on
> Thanks,
> Brian
