flume-user mailing list archives

From "Quintana, Cesar (C)" <Cesar.Quint...@csaa.com>
Subject RE: Flume duplicating a set of events many (hundreds of) times!
Date Wed, 17 Jun 2015 22:54:57 GMT
Thanks for the response.

Here is the config for the Memory Channel:

agent1.channels.PASPAChannel.type = memory
agent1.channels.PASPAChannel.capacity = 10000
agent1.channels.PASPAChannel.transactionCapactiy = 1000

And here’s the Sink Config:

agent1.sinks.PASPASink.type = com.dynatrace.diagnostics.flume.RollingFileSink
agent1.sinks.PASPASink.sink.rollInterval = 3600
agent1.sinks.PASPASink.sink.rollSize = 50
agent1.sinks.PASPASink.sink.directory = /opt/pa
agent1.sinks.PASPASink.sink.batchSize = 1000
agent1.sinks.PASPASink.sink.serializer = com.dynatrace.diagnostics.btexport.flume.BtPageActionSerializerBuilder

Cesar M. Quintana
Infrastructure Engineer, IT Enterprise Systems Management & Tools

CSAA Insurance Group, a AAA Insurer
5353 W. Bell Road, MS Z01A, Glendale, AZ 85308
(desk) 602-467-7352 (cell) 602-467-7352 (email) Cesar.Quintana@csaa.com
100 years of insurance the AAA way

From: Johny Rufus [mailto:jrufus@cloudera.com]
Sent: Wednesday, June 17, 2015 3:40 PM
To: user@flume.apache.org
Subject: Re: Flume duplicating a set of events many (hundreds of) times!

This looks more like a config issue. Can you verify the memory channel's transaction capacity? By default it is 100, and the error reports that same value.
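For reference, raising it explicitly would look like the line below in the agent properties file (a minimal sketch using your channel name; note the property must be spelled transactionCapacity exactly, since a misspelled key would not be applied and the channel would stay at the default of 100):

agent1.channels.PASPAChannel.transactionCapacity = 1000

The transaction capacity also needs to be at least as large as the sink's batchSize, or each take transaction will overflow.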


On Wed, Jun 17, 2015 at 2:55 PM, Quintana, Cesar (C) <Cesar.Quintana@csaa.com> wrote:

I have a Memory Channel and a File Sink to a local disk configured to take events. I have
1,525 transactions that were duplicated 583 times in a single hour. It may be more, but
my file rolls every hour, so I only looked at that one. I do know the problem keeps happening
until I restart Flume, whether it is the same set of events being duplicated or a different set.

I can’t find any documentation on what “Take list for MemoryTransaction, capacity 100
full” means.
Network issues aren’t playing a role here, since it’s a local disk.
The Channel Capacity was recently increased from 1,000 to 10,000, but that did nothing; seeing
how the real problem is these duplicates, I can see why that had no tangible effect.
Eventually, the Channel hits 10,000 events and stops accepting data from the Source.
Resources are nowhere close to being strained.
It looks like I’m getting around 30 to 50 events per second when this occurs.
Help, please! ☹

Memory Channel Capacity: 10,000
Memory Channel Transaction Capacity: 1,000

Rolling File Sink Batch Size: 1,000

The below error is the first error that is logged when this starts happening.

16 Jun 2015 13:56:24,279 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160)
 - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to process transaction
        at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:218)
        at com.dynatrace.diagnostics.flume.RollingFileSink.process(Unknown Source)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Thread.java:701)
Caused by: org.apache.flume.ChannelException: Take list for MemoryTransaction, capacity 100
full, consider committing more frequently, increasing capacity, or increasing thread count
        at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doTake(MemoryChannel.java:100)
        at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
        at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
        at org.apache.flume.sink.RollingFileSink.process(RollingFileSink.java:191)

Cesar Quintana
