flume-user mailing list archives

From "Anatharaman, Srinatha (Contractor)" <Srinatha_Ananthara...@comcast.com>
Subject Ingestion to Solr is very slow
Date Thu, 16 Feb 2017 23:00:01 GMT
Hi,

I have a large set of small files, each around 7-10 KB in size.
In total I have 350K files, amounting to around 6 GB.

I have tried many changes to my Flume configuration, but whatever I change, Solr
takes about 2 seconds to ingest each file.
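
To put that rate in perspective, here is a rough back-of-the-envelope estimate. It is only a sketch: it assumes the 2 seconds per file holds for all 350K files and that files are ingested strictly one after another. The function and variable names are illustrative, not part of Flume or Solr.

```python
# Rough estimate only: assumes every file takes the reported 2 seconds
# and that files are ingested strictly one after another.

def estimate_total_seconds(file_count: int, seconds_per_file: float) -> float:
    """Total sequential ingest time in seconds."""
    return file_count * seconds_per_file

FILE_COUNT = 350_000      # from the message: about 350K files
SECONDS_PER_FILE = 2      # from the message: about 2 seconds per file

total_seconds = estimate_total_seconds(FILE_COUNT, SECONDS_PER_FILE)
total_hours = total_seconds / 3600
total_days = total_seconds / 86_400

print(f"{total_seconds:.0f} s = {total_hours:.1f} h = {total_days:.1f} days")
```

At that rate the full set would take roughly 8 days to ingest.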


agent.sources = SpoolDirSrc
agent.channels = FileChannel
agent.sinks = SolrSink

# Configure Source

agent.sources.SpoolDirSrc.channels = FileChannel
agent.sources.SpoolDirSrc.type = spooldir
agent.sources.SpoolDirSrc.spoolDir = /app/home/solr/final
agent.sources.SpoolDirSrc.basenameHeader = true
#agent.sources.SpoolDirSrc.batchSize = 100000

agent.sources.SpoolDirSrc.fileHeader = true
agent.sources.SpoolDirSrc.deserializer = org.apache.flume.sink.solr.morphline.BlobDeserializer$Builder


# Use a file channel, which buffers events on disk
agent.channels.FileChannel.type = file
agent.channels.FileChannel.capacity = 1000
agent.channels.FileChannel.transactionCapacity = 1000

#agent.channels.FileChannel.transactionCapacity = 10000

# Configure Solr Sink

agent.sinks.SolrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.SolrSink.morphlineFile = /etc/flume/conf/morphline.conf
#agent.sinks.SolrSink.batchsize = 100000
#agent.sinks.SolrSink.batchDurationMillis = 5000
agent.sinks.SolrSink.channel = FileChannel
agent.sinks.SolrSink.morphlineId = morphline1
agent.sinks.SolrSink.tika.config = tikaConfig.xml
agent.sinks.SolrSink.rollCount = 0
agent.sinks.SolrSink.rollInterval = 0
agent.sinks.SolrSink.rollsize = 100000000
agent.sinks.SolrSink.idleTimeout = 0
agent.sinks.SolrSink.batchSize = 100000
agent.sinks.SolrSink.txnEventMax = 10000000

agent.sources.SpoolDirSrc.channels = FileChannel
agent.sinks.SolrSink.channel = FileChannel

My collection has 2 shards and a replication factor of 1.

Kindly let me know how I can make ingestion faster.

Regards,
~Sri
