flume-user mailing list archives

From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject Re: Syslog TCP performances issue with filechannel
Date Thu, 05 Mar 2015 21:42:00 GMT
If you use the Multiport Syslog Source, you can specify a batch size, which is the size
of a transaction; there is one fsync at the end of each transaction.
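
As a rough sketch of the setup described above (not taken from this thread), a Multiport Syslog TCP source with an explicit batch size feeding a File Channel could be configured along these lines. The agent and component names (a1, r1, c1), the directory paths, and the batch size of 1000 are placeholder assumptions; port 20515 comes from the loggen command quoted below. The sink is omitted.

```properties
# Hypothetical agent "a1": names, paths and values are illustrative only.
a1.sources = r1
a1.channels = c1

# Multiport Syslog TCP source; batchSize sets how many events go into one
# channel transaction (one fsync per transaction on the File Channel).
a1.sources.r1.type = multiport_syslogtcp
a1.sources.r1.host = 0.0.0.0
a1.sources.r1.ports = 20515
a1.sources.r1.batchSize = 1000
a1.sources.r1.channels = c1

# File Channel; transactionCapacity should be at least as large as the batch.
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /data1/flume/checkpoint
a1.channels.c1.dataDirs = /data1/flume/data,/data2/flume/data
a1.channels.c1.transactionCapacity = 10000
```

A suitable batch size would have to be found by benchmarking; larger batches mean fewer fsync calls but more events held in a single transaction.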

Regarding the tests: those were done over 2 years ago, using the Memory Channel, which does not fsync to disk.

Thanks, Hari

On Thu, Mar 5, 2015 at 1:11 AM, Smaine Kahlouch
<smaine.kahlouch@smartjog.com> wrote:

> Actually, the batchSize is configured at the sink level.
> I didn't find this option for the file channel.
> Furthermore, the source batchSize can't be configured, because the sender is a
> syslog-ng tool which doesn't have this capability.
> I tried with the "netcat" source and I see the same behaviour.
> I guess you're right: for each event there's an fsync, which causes the
> heavy load on the disks.
> However, I've read this page:
> https://cwiki.apache.org/confluence/display/FLUME/Flume+NG+Performance+Measurements
> And they obviously didn't have the same problem.
> Regards,
> -- 
> Smaine Kahlouch - Engineer, Research & Engineering
> Arkena | T: +33 1 5868 6196
> 27 Blvd Hippolyte Marquès, 94200 Ivry-sur-Seine, France
> arkena.com
> On 03/04/15 20:08, Hari Shreedharan wrote:
>> You should probably increase the batch size, since each batch causes 
>> an fsync which slows things down.
>>
>> Thanks,
>> Hari
>>
>>
>> On Wed, Mar 4, 2015 at 6:28 AM, Smaine Kahlouch
>> <smaine.kahlouch@smartjog.com> wrote:
>>
>>     Hi all,
>>
>>     I'm currently benchmarking Flume.
>>     We're planning to use Flume with syslogtcp as the source and
>>     filechannel in order to avoid data loss.
>>
>>     Performance is quite good when a memorychannel is used:
>>     ~*100 000 events/sec* (event size = 600 bytes)
>>
>>     But as soon as I switch to filechannel, performance drops
>>     dramatically:
>>     ~*300 events/sec*
>>
>>     Beyond this poor result, the behaviour is really strange because
>>     I see heavy disk usage (on all the disks), near 100%.
>>
>>     I use a tool provided by syslog-ng to generate syslog
>>     logs: loggen
>>     <http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-latest-guides/en/syslog-ng-ose-guide-admin/html/loggen.1.html>
>>
>>     e.g.: loggen -i -I 3000000 --size 600 --active-connections 200
>>     myflumehost 20515
>>
>>
>>     Flume version: 1.5.2
>>     Operating system: CentOS 6
>>
>>     Please find my flume configuration enclosed. The filechannel is
>>     spread over 5 disks in order to improve performance.
>>
>>
>>     Could you please help me properly configure the syslogtcp source
>>     with filechannel?
>>
>>     Regards,
>>
>>     -- 
>>     Smaine Kahlouch - Engineer, Research & Engineering
>>     Arkena | T: +33 1 5868 6196
>>     27 Blvd Hippolyte Marquès, 94200 Ivry-sur-Seine, France
>>     arkena.com
>>
>>     <flume.conf>
>>
>>