flume-user mailing list archives

From Hemanth Abbina <Heman...@eiqnetworks.com>
Subject Flume benchmarking with HTTP source & File channel
Date Sat, 14 Nov 2015 07:39:44 GMT
Hi,

We have been trying to validate and benchmark Flume's performance before using it in production.

We have configured Flume with an HTTP source, a File channel and a Kafka sink.
Hardware: 8 cores, 32 GB RAM, CentOS 6.5, 500 GB HDD.
Flume configuration:
svcagent.sources = http-source
svcagent.sinks = kafka-sink1
svcagent.channels = file-channel1

# HTTP source to receive events on port 5005
svcagent.sources.http-source.type = http
svcagent.sources.http-source.channels = file-channel1
svcagent.sources.http-source.port = 5005
svcagent.sources.http-source.bind = 10.15.1.31

svcagent.sources.http-source.selector.type = multiplexing
svcagent.sources.http-source.selector.header = archival
svcagent.sources.http-source.selector.mapping.true = file-channel1
svcagent.sources.http-source.selector.default = file-channel1
#svcagent.sources.http-source.handler = org.eiq.flume.JSONHandler.HTTPSourceJSONHandler

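# Kafka sink (the type line is assumed; it was missing from the pasted config)
svcagent.sinks.kafka-sink1.type = org.apache.flume.sink.kafka.KafkaSink
svcagent.sinks.kafka-sink1.topic = flume-sink1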
svcagent.sinks.kafka-sink1.brokerList = 10.15.1.32:9092
svcagent.sinks.kafka-sink1.channel = file-channel1
svcagent.sinks.kafka-sink1.batchSize = 5000

svcagent.channels.file-channel1.type = file
svcagent.channels.file-channel1.checkpointDir = /etc/flume-kafka/checkpoint
svcagent.channels.file-channel1.dataDirs = /etc/flume-kafka/data
svcagent.channels.file-channel1.transactionCapacity = 10000
svcagent.channels.file-channel1.capacity = 50000
svcagent.channels.file-channel1.checkpointInterval = 120000
svcagent.channels.file-channel1.checkpointOnClose = true
svcagent.channels.file-channel1.maxFileSize = 536870912
svcagent.channels.file-channel1.use-fast-replay = false

When we streamed HTTP data from multiple clients (around 40 HTTP clients), throughput peaked
at about 600 requests/sec and would not go beyond that. We have also increased Flume's JVM
maximum heap (-Xmx) setting to 4096 MB.

We also tried a Null sink in place of the Kafka sink, but throughput did not improve much.
So we assume the bottleneck is the HTTP source and the File channel.
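
For reference, a Null sink variant of the configuration above would look roughly like this.
The sink name null-sink1 is only illustrative (an assumption, not copied from the actual
file); the source and channel sections stay unchanged:

# Null sink in place of the Kafka sink; it discards every event it takes from the channel
# (the name null-sink1 is hypothetical)
svcagent.sinks = null-sink1
svcagent.sinks.null-sink1.type = null
svcagent.sinks.null-sink1.channel = file-channel1

Since the Null sink does no downstream I/O, a throughput ceiling that persists with it in
place points at the source and channel rather than at Kafka.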

Could you please suggest any tuning that would improve the performance of this setup?

--regards
Hemanth
