flume-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: High level technical overview of output bucketing for flume (old-gen) ?
Date Fri, 04 Jan 2013 16:37:08 GMT
Thank you but I am afraid I wasn't clear enough.
I have no issue with the configuration and I understand output bucketing.

However, as far as I understand from the source, the flume old-gen syslog
source does not use the syslog timestamp. (It only cares about the
priority, which is not really a bad decision in itself, because that way
the implementation is 'compatible' with both the BSD and IETF syslog
standards.) I wrote a sink decorator in order to change that. It reads the
syslog header, uses the syslog timestamp (which is really the time when the
log was generated) and adds some metadata.
But I do not have a full understanding of the flume source.
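
To make the parsing step of such a decorator concrete, here is a minimal sketch of extracting the timestamp from a syslog header in both the BSD (RFC 3164) and IETF (RFC 5424) layouts. It is not the decorator mentioned above and uses no Flume classes: `SyslogTimestamp`, `toEpochMillis`, `assumedYear` and `assumedZone` are hypothetical names, and it relies on `java.time` (Java 8+), which is an assumption about the runtime. BSD headers carry neither a year nor a time zone, so the caller has to supply both.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Flume): reads the timestamp from a syslog header. */
final class SyslogTimestamp {
    // RFC 5424: "<PRI>VERSION TIMESTAMP ..." where TIMESTAMP looks like 2003-10-11T22:14:15.003Z.
    // Requiring a leading digit skips the "-" placeholder that RFC 5424 allows.
    private static final Pattern IETF = Pattern.compile("^<\\d{1,3}>\\d{1,2} (\\d\\S+) ");
    // RFC 3164: "<PRI>Mmm dd hh:mm:ss ..." with a space-padded day, no year and no zone.
    private static final Pattern BSD = Pattern.compile(
            "^<\\d{1,3}>([A-Z][a-z]{2}) ([ \\d]\\d) (\\d{2}):(\\d{2}):(\\d{2}) ");
    private static final String MONTHS = "JanFebMarAprMayJunJulAugSepOctNovDec";

    /** Returns the header timestamp as milliseconds since the epoch. */
    static long toEpochMillis(String line, int assumedYear, ZoneId assumedZone) {
        Matcher ietf = IETF.matcher(line);
        if (ietf.find()) {
            // The IETF timestamp already carries its own offset.
            return OffsetDateTime.parse(ietf.group(1)).toInstant().toEpochMilli();
        }
        Matcher bsd = BSD.matcher(line);
        if (bsd.find()) {
            int monthIndex = MONTHS.indexOf(bsd.group(1));
            if (monthIndex < 0) {
                throw new IllegalArgumentException("unknown month: " + bsd.group(1));
            }
            LocalDateTime local = LocalDateTime.of(
                    assumedYear,
                    monthIndex / 3 + 1,
                    Integer.parseInt(bsd.group(2).trim()),
                    Integer.parseInt(bsd.group(3)),
                    Integer.parseInt(bsd.group(4)),
                    Integer.parseInt(bsd.group(5)));
            // The caller decides which zone the sending host is assumed to be in.
            return local.atZone(assumedZone).toInstant().toEpochMilli();
        }
        throw new IllegalArgumentException("no syslog timestamp found in header");
    }
}
```

The resulting value could then be used as the event timestamp in place of the time of reception. How it is attached to the event depends on the Flume version in use and is left out here.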

Could anyone point me to where the date and time escape sequences are
interpreted in the flume source (i.e. which classes)?
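
For readers unfamiliar with the feature, the sketch below illustrates what expanding such sequences against an event timestamp amounts to. It is not Flume's implementation and does not answer which Flume classes do this: `BucketPath` and `expand` are hypothetical names, only a small subset of sequences (`%Y`, `%m`, `%d`, `%H`, `%M`, `%S`, `%%`) is handled, and the time zone is passed in explicitly as an assumption.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

/** Hypothetical illustration (not Flume code) of expanding date escapes in a path. */
final class BucketPath {
    /** Replaces a few %-sequences in the template with fields of the given timestamp. */
    static String expand(String template, long epochMillis, ZoneId zone) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(zone);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < template.length(); i++) {
            char c = template.charAt(i);
            if (c != '%' || i + 1 == template.length()) {
                out.append(c); // ordinary character, or a trailing '%'
                continue;
            }
            char code = template.charAt(++i);
            switch (code) {
                case 'Y': out.append(t.getYear()); break;
                case 'm': out.append(twoDigits(t.getMonthValue())); break;
                case 'd': out.append(twoDigits(t.getDayOfMonth())); break;
                case 'H': out.append(twoDigits(t.getHour())); break;
                case 'M': out.append(twoDigits(t.getMinute())); break;
                case 'S': out.append(twoDigits(t.getSecond())); break;
                case '%': out.append('%'); break;
                default: out.append('%').append(code); // unknown sequence: keep as written
            }
        }
        return out.toString();
    }

    private static String twoDigits(int n) {
        return (n < 10 ? "0" : "") + n;
    }
}
```

With a template such as `/logs/%Y/%m/%d/%H/`, the directory an event lands in is determined entirely by the timestamp passed in, which is why it matters whether that timestamp is the time of reception or the time taken from the syslog header.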

Thanks in advance


On Fri, Jan 4, 2013 at 4:06 PM, Alexander Alten-Lorenz

> Hi Bertrand,
> I wrote a blog post about this in 2011; it shows what bucketing can be
> used for:
> http://mapredit.blogspot.de/2011/10/centralized-logfile-management-across.html
> You can use the escape sequences to create directories based on the
> timestamp delivered with a syslog event. That way you can automatically
> create directories for year, month, day, hour or something like that.
> Best,
>  Alex
> On Jan 4, 2013, at 3:22 PM, Bertrand Dechoux <dechouxb@gmail.com> wrote:
> > Hi,
> >
> > I am using flume (old gen) as an extension to an existing syslog system
> > and would like to use the timestamp of the syslog message as the
> > timestamp of the flume event.
> > I guess the timestamp is used for the '*Fine grained escape sequences
> > date and times*' but I don't have a clear understanding of it.
> > http://archive.cloudera.com/cdh/3/flume/UserGuide/#_output_bucketing
> >
> > Could someone point me to where those sequences (like %d) are
> > interpreted?
> > I would like to be sure I am not missing anything obvious.
> >
> > Thanks in advance
> >
> > Bertrand
> >
> > PS: I know an unrelated recommendation would be to use flume-ng but this
> > is not the topic of this email.
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF

Bertrand Dechoux
