flume-user mailing list archives

From Andrew Otto <o...@wikimedia.org>
Subject Re: Need for UDP / Multicast Source
Date Mon, 14 Jan 2013 18:01:36 GMT
Thanks guys!  I've opened up a JIRA here:


On Jan 14, 2013, at 12:43 PM, Alexander Alten-Lorenz <wget.null@gmail.com> wrote:

> Hey Andrew,
> for your reference, we have a lot of developer information in our wiki:
> https://cwiki.apache.org/confluence/display/FLUME/Developer+Section
> https://cwiki.apache.org/confluence/display/FLUME/Developers+Quick+Hack+Sheet
> cheers,
> Alex
> On Jan 14, 2013, at 6:37 PM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:
>> Hi Andrew, 
>> Really happy to hear Wikimedia Foundation is considering Flume. I am
>> fairly sure that if you find such a source useful, there would
>> definitely be others who find it useful too. I'd recommend filing a
>> jira and starting a discussion, and then submitting the patch. We would
>> be happy to review and commit it.
>> Thanks,
>> Hari
>> -- 
>> Hari Shreedharan
>> On Monday, January 14, 2013 at 9:29 AM, Andrew Otto wrote:
>>> Hi all,
>>> I'm a Systems Engineer at the Wikimedia Foundation, and we're
>>> investigating using Flume for our web request log HDFS imports. We've
>>> previously been using Kafka, but have had to change short-term
>>> architecture plans in order to get data into HDFS reliably and
>>> regularly.
>>> Our current web request logs are available for consumption over a
>>> multicast UDP stream. I could hack something together to try and pipe
>>> this into Flume using the existing sources (SyslogUDPSource, or maybe
>>> some combination of socat + NetcatSource), but I'd rather reduce the
>>> number of moving parts. I'd like to consume directly from the
>>> multicast UDP stream as a Flume source.
>>> I coded up a proof of concept based on the SyslogUDPSource, mainly
>>> just stripping out the syslog event header extraction, and adding in
>>> multicast Datagram connection code. I plan on cleaning this up, and
>>> making this a generic raw UDP source, with multicast being a
>>> configuration option.
>>> My question to you guys is, is this something the Flume community
>>> would find useful? If so, should I open up a JIRA to track this? I've
>>> got a fork of the Flume git repo over on github and will be doing my
>>> work there. I'd love to share it upstream if it would be useful.
>>> Thanks!
>>> -Andrew Otto
>>> Systems Engineer
>>> Wikimedia Foundation
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
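
A minimal sketch of the raw UDP receive loop described in the original message, with multicast as an optional setting, might look like the following Java. It is not Andrew's proof of concept and does not use the Flume source API; the class name `RawUdpReceiver`, the `extractBody` helper, and the command-line arguments are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of Flume.
final class RawUdpReceiver {
    // Largest payload a single IPv4 UDP datagram can carry.
    static final int MAX_DATAGRAM_SIZE = 65507;

    /** Copies only the bytes that were actually received, with no header parsing. */
    static byte[] extractBody(DatagramPacket packet) {
        int start = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), start, start + packet.getLength());
    }

    /**
     * Opens a UDP socket on the given port. If a group address is supplied,
     * the socket also joins that multicast group; otherwise it behaves as a
     * plain unicast UDP listener.
     */
    static MulticastSocket open(int port, String group, String ifaceName) throws IOException {
        MulticastSocket socket = new MulticastSocket(port);
        if (group != null) {
            // A null interface defers to the socket's default interface.
            NetworkInterface iface =
                ifaceName == null ? null : NetworkInterface.getByName(ifaceName);
            socket.joinGroup(new InetSocketAddress(InetAddress.getByName(group), port), iface);
        }
        return socket;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: RawUdpReceiver <port> [multicast-group] [interface]");
            return;
        }
        int port = Integer.parseInt(args[0]);
        String group = args.length > 1 ? args[1] : null;
        String ifaceName = args.length > 2 ? args[2] : null;

        try (MulticastSocket socket = open(port, group, ifaceName)) {
            byte[] buffer = new byte[MAX_DATAGRAM_SIZE];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            while (true) {
                // receive() shrinks the packet length, so reset it each time.
                packet.setLength(buffer.length);
                socket.receive(packet);
                byte[] body = extractBody(packet);
                // A real source would hand the body to a channel instead of printing it.
                System.out.println(new String(body, StandardCharsets.UTF_8));
            }
        }
    }
}
```

Run with only a port for plain unicast UDP, or add a group address (and optionally an interface name) to join a multicast group. In a real Flume source, each body would presumably be wrapped in an event and passed to the channel rather than printed.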
