the attachment flume.conf is channel and sink config, dumps.txt is thread dumps.
channel type "dual" is a channel type I developped to utilize the merits of memory channel and filechannel. when the volume is not quite big, I use memory channel, when the size of events reach to a percentage of the memory channel capacity, it switch to the filechannel, when volume decrease switch to memory again.

thanks for looking into this.


On Tue, Dec 17, 2013 at 8:54 PM, Brock Noland <brock@cloudera.com> wrote:

Can you take and share 8-10 thread dumps while the sink is taking events "slowly"?

Can you share your machine and file channel configuration?

On Dec 17, 2013 6:28 AM, "Shangan Chen" <chenshangan521@gmail.com> wrote:
we face the same problem, performance of taking events from channel is a severe bottleneck. When there're less events in channel,  problem does not alleviate. following is a log of the metrics of writing to hdfs, writing to 5 files with a batchsize of 200000, take cost the most of the total time.


17 十二月 2013 18:49:28,056 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.HDFSEventSink.process:489)  - HdfsSink-TIME-STAT sink[sink_hdfs_b] writers[5] eventcount[200000] all[44513] take[38197] append[5647] sync[17] getFilenameTime[371]





On Mon, Nov 25, 2013 at 4:46 PM, Jan Van Besien <janvb@ngdata.com> wrote:
Hi,

Is anybody still looking into this question?

Should I log it in jira such that somebody can look into it later?

thanks,
Jan



On 11/18/2013 11:28 AM, Jan Van Besien wrote:
> Hi,
>
> Sorry it took me a while to answer this. I compiled a small test case
> using only off the shelve flume components that shows what is going on.
>
> The setup is a single agent with http source, null sink and file
> channel. I am using the default configuration as much as possible.
>
> The test goes as follows:
>
> - start the agent without sink
> - run a script that sends http requests in multiple threads to the http
> source (the script simply calls the url http://localhost:8080/?key=value
> over and over a gain, whereby value is a random string of 100 chars).
> - this script does about 100 requests per second on my machine. I leave
> it running for a while, such that the file channel contains about 20000
> events.
> - add the null sink to the configuration (around 11:14:33 in the log).
> - observe the logging of the null sink. You'll see in the log file that
> it takes more than 10 seconds per 1000 events (until about even 5000,
> around 11:15:33)
> - stop the http request generating script (i.e. no more writing in file
> channel)
> - observer the logging of the null sink: events 5000 until 20000 are all
> processed within a few seconds.
>
> In the attachment:
> - flume log
> - thread dumps while the ingest was running and the null sink was enabled
> - config (agent1.conf)
>
> I also tried with more sinks (4), see agent2.conf. The results are the same.
>
> Thanks for looking into this,
> Jan
>
>
> On 11/14/2013 05:08 PM, Brock Noland wrote:
>> On Thu, Nov 14, 2013 at 2:50 AM, Jan Van Besien <janvb@ngdata.com
>> <mailto:janvb@ngdata.com>> wrote:
>>
>>      On 11/13/2013 03:04 PM, Brock Noland wrote:
>>       > The file channel uses a WAL which sits on disk.  Each time an
>>      event is
>>       > committed an fsync is called to ensure that data is durable. Without
>>       > this fsync there is no durability guarantee. More details here:
>>       > https://blogs.apache.org/flume/entry/apache_flume_filechannel
>>
>>      Yes indeed. I was just not expecting the performance impact to be
>>      that big.
>>
>>
>>       > The issue is that when the source is committing one-by-one it's
>>       > consuming the disk doing an fsync for each event.  I would find a
>>      way to
>>       > batch up the requests so they are not written one-by-one or use
>>      multiple
>>       > disks for the file channel.
>>
>>      I am already using multiple disks for the channel (4).
>>
>>
>> Can you share your configuration?
>>
>>      Batching the
>>      requests is indeed what I am doing to prevent the filechannel to be the
>>      bottleneck (using a flume agent with a memory channel in front of the
>>      agent with the file channel), but it inheritely means that I loose
>>      end-to-end durability because events are buffered in memory before being
>>      flushed to disk.
>>
>>
>> I would be curious to know though if you doubled the sinks if that would
>> give more time to readers. Could you take three-four thread dumps of the
>> JVM while it's in this state and share them?
>>
>




--
have a good day! 
chenshang'an




--
have a good day! 
chenshang'an