flume-user mailing list archives

From Jan Van Besien <ja...@ngdata.com>
Subject increase load on tier2 flume agents
Date Thu, 07 Nov 2013 15:10:11 GMT

I have a two-tier Flume setup. The tier 1 agents accept incoming 
requests (HTTP source) and put them on (large) file channels. The tier 2 
agents do a lot of processing on these events (with custom interceptors) 
and use a custom sink to store the result in a custom data store. These 
tier 2 agents use a (small) memory channel.

The tier 2 interceptors and data storage are all mostly IO bound.

I am struggling to saturate the tier 2 agents. They are slower than 
they should be, mostly for reasons unrelated to Flume.

However, assume that I would like my tier 2 agents to process more 
events in parallel. What would be the appropriate way to do this?

Do I need multiple avro sinks on the tier 1 agents that all point to the 
same tier 2 avro source? I tried this, and it does seem to increase the 
number of threads on the tier 2 agent that are actually processing events.
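
For concreteness, here is a rough sketch of the kind of tier 1 
configuration I mean. The agent name, component names, hostname and ports 
are placeholders made up for illustration, not my actual configuration. 
Two avro sinks drain the same file channel and both point at the same 
tier 2 avro source. As far as I understand, sinks that are not in a sink 
group each get their own runner thread, which is why this opens more 
parallel connections to tier 2.

```properties
# Hypothetical tier 1 agent "tier1": one HTTP source, one file channel,
# two avro sinks that both send to the same tier 2 avro source.
tier1.sources = http-in
tier1.channels = fc
tier1.sinks = avro1 avro2

tier1.sources.http-in.type = http
tier1.sources.http-in.port = 8080
tier1.sources.http-in.channels = fc

tier1.channels.fc.type = file

# First sink: placeholder host and port of the tier 2 avro source.
tier1.sinks.avro1.type = avro
tier1.sinks.avro1.channel = fc
tier1.sinks.avro1.hostname = tier2.example.com
tier1.sinks.avro1.port = 4545

# Second sink: same channel, same destination, separate sink instance.
tier1.sinks.avro2.type = avro
tier1.sinks.avro2.channel = fc
tier1.sinks.avro2.hostname = tier2.example.com
tier1.sinks.avro2.port = 4545
```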

Is this the way to do it, or not?

