flume-user mailing list archives

From Alexander Alten-Lorenz <wget.n...@gmail.com>
Subject Re: Need for UDP / Multicast Source
Date Mon, 14 Jan 2013 17:43:39 GMT
Hey Andrew,

for your reference, we have a lot of developer information in our wiki:

https://cwiki.apache.org/confluence/display/FLUME/Developer+Section
https://cwiki.apache.org/confluence/display/FLUME/Developers+Quick+Hack+Sheet

cheers,
 Alex

On Jan 14, 2013, at 6:37 PM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:

> Hi Andrew, 
> 
> Really happy to hear Wikimedia Foundation is considering Flume. I am fairly sure that
> if you find such a source useful, there would definitely be others who find it useful too.
> I'd recommend filing a jira and starting a discussion, and then submitting the patch. We would
> be happy to review and commit it.
> 
> 
> Thanks,
> Hari
> 
> -- 
> Hari Shreedharan
> 
> 
> On Monday, January 14, 2013 at 9:29 AM, Andrew Otto wrote:
> 
>> Hi all,
>> 
>> I'm a Systems Engineer at the Wikimedia Foundation, and we're investigating using
>> Flume for our web request log HDFS imports. We've previously been using Kafka, but have had
>> to change short-term architecture plans in order to get data into HDFS reliably and regularly
>> soon.
>> 
>> Our current web request logs are available for consumption over a multicast UDP stream.
>> I could hack something together to try and pipe this into Flume using the existing sources
>> (SyslogUDPSource, or maybe some combination of socat + NetcatSource), but I'd rather reduce
>> the number of moving parts. I'd like to consume directly from the multicast UDP stream as
>> a Flume source.
>> 
>> I coded up a proof of concept based on the SyslogUDPSource, mainly just stripping out
>> the syslog event header extraction, and adding in multicast Datagram connection code. I plan
>> on cleaning this up, and making this a generic raw UDP source, with multicast being a configuration
>> option.
>> 
>> My question to you guys is, is this something the Flume community would find useful?
>> If so, should I open up a JIRA to track this? I've got a fork of the Flume git repo over on
>> github and will be doing my work there. I'd love to share it upstream if it would be useful.
>> 
>> Thanks!
>> -Andrew Otto
>> Systems Engineer
>> Wikimedia Foundation
>> 
>> 
> 
> 

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
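
For readers following the thread, here is a rough sketch of the "raw UDP, optionally multicast" idea Andrew describes. It is not his patch and it does not touch any Flume API. It only uses the standard java.net classes, and the class name RawUdpReader, the method names, and the parameters are all hypothetical. Each datagram payload is passed along untouched (no syslog header parsing), and the group is joined only when the configured address is a multicast address.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical helper, not part of Flume: reads raw UDP datagrams and
// hands each payload to a caller-supplied sink.
class RawUdpReader {

    // Copy exactly the received bytes out of the packet's buffer.
    // No header is parsed or stripped; the payload is the whole event body.
    static byte[] toBody(DatagramPacket packet) {
        int start = packet.getOffset();
        return Arrays.copyOfRange(packet.getData(), start, start + packet.getLength());
    }

    // Receive maxPackets datagrams on the given port. If host is a multicast
    // address, join that group first; otherwise act as a plain UDP listener.
    // ifaceName may be null to use the default interface.
    static void receive(String host, int port, String ifaceName,
                        int maxPackets, Consumer<byte[]> sink) throws IOException {
        InetAddress address = InetAddress.getByName(host);
        try (MulticastSocket socket = new MulticastSocket(port)) {
            if (address.isMulticastAddress()) {
                NetworkInterface nif =
                        ifaceName == null ? null : NetworkInterface.getByName(ifaceName);
                socket.joinGroup(new InetSocketAddress(address, port), nif);
            }
            byte[] buffer = new byte[65535]; // upper bound on a UDP datagram
            for (int i = 0; i < maxPackets; i++) {
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);       // blocks until a datagram arrives
                sink.accept(toBody(packet));  // copy, since buffer is reused
            }
        }
    }
}
```

A real source would loop until stopped and hand each body to a channel instead of a Consumer; the bounded loop here just keeps the sketch easy to try out.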

