flume-user mailing list archives

From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject Re: How to handle ChannelFullException
Date Thu, 29 Jan 2015 19:45:37 GMT
How many sinks do you have? Adding more sinks increases parallelism and will clear the channel
faster, provided the downstream system can handle the load.

Thanks, Hari
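
The trade-off discussed in this thread (drop events when the channel is full, or make the producer wait) can be illustrated with a small sketch. This is not Flume code and does not use any Flume API: a bounded java.util.concurrent queue stands in for the channel, and the class and method names (BackpressureSketch, produceDropping, produceBlocking, startSink) are hypothetical, chosen only for this example. The capacity of 100 and the burst of 3000 events are likewise illustrative assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: a bounded queue stands in for a channel.
// None of these names come from Flume.
class BackpressureSketch {

    // Non-blocking producer: offer() returns false when the queue is full,
    // so the event is simply lost. Returns the number of dropped events.
    static int produceDropping(BlockingQueue<String> channel, int events) {
        int dropped = 0;
        for (int i = 0; i < events; i++) {
            if (!channel.offer("event-" + i)) {
                dropped++;
            }
        }
        return dropped;
    }

    // Blocking producer: put() waits until a slot is free, which slows the
    // producer down to the consumer's pace instead of losing events.
    static void produceBlocking(BlockingQueue<String> channel, int events)
            throws InterruptedException {
        for (int i = 0; i < events; i++) {
            channel.put("event-" + i);
        }
    }

    // Starts a daemon thread that plays the role of a sink: it takes events
    // off the queue and counts them until it is interrupted.
    static Thread startSink(BlockingQueue<String> channel, AtomicInteger delivered) {
        Thread sink = new Thread(() -> {
            try {
                while (true) {
                    channel.take();
                    delivered.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sink.setDaemon(true);
        sink.start();
        return sink;
    }

    public static void main(String[] args) throws InterruptedException {
        // Case 1: nothing drains the queue, producer does not block.
        BlockingQueue<String> noSink = new ArrayBlockingQueue<>(100);
        System.out.println("dropped: " + produceDropping(noSink, 3000));

        // Case 2: one sink drains the queue, producer blocks when it is full.
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(100);
        AtomicInteger delivered = new AtomicInteger();
        Thread sink = startSink(channel, delivered);
        produceBlocking(channel, 3000);
        while (delivered.get() < 3000) {
            Thread.sleep(1); // wait for the sink to drain the remainder
        }
        sink.interrupt();
        System.out.println("delivered: " + delivered.get());
    }
}
```

In the first case the queue fills after 100 events and the remaining 2900 are dropped. In the second, every event is delivered because the producer waits for space. Calling startSink more than once on the same queue would be the rough analogue of adding sinks to drain a channel faster.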

On Thu, Jan 29, 2015 at 9:41 AM, Sverre Bakke <sverre.bakke@gmail.com> wrote:

> Hi,
> Thanks for your feedback. I can of course switch to the multiport one
> if the plain one is not maintained.
> Back to the ChannelFullException issue. I can increase the channel
> size, but the basic problem remains: as long as the syslog client is
> faster than the Flume sink, this exception will occur and data
> will be lost. I really believe that blocking, so that the syslog
> client must wait before sending more data, is the way to go for a
> robust solution.
> Let's assume that the syslog client reads batches of events, e.g. from
> a file, and sends these as fast as possible to the Flume multiport TCP
> syslog source. In such cases, the average events-per-second rate would
> be moderate, while in practice there would be huge spikes where the
> client delivers as fast as possible. Instead of asking the client
> to "slow down", Flume would accept the events and then drop them. This
> forces me as an admin to monitor the logs and try to guess which
> events were dropped. If this happens, I can have a reliable and
> persistent channel configured, but events will still be dropped, thus
> undermining the entire solution.
> On Thu, Jan 29, 2015 at 4:56 PM, Jeff Lord <jlord@cloudera.com> wrote:
>> Have you considered increasing the size of the memory channel? I haven't
>> played with the Kafka sink much, but with HDFS we often add sinks, which
>> can help to increase the flow out of the channel.
>> The multiport syslog source is the way to go here as it will give better
>> performance. We should probably go ahead and deprecate the vanilla syslog
>> source.
>> On Thursday, January 29, 2015, Sverre Bakke <sverre.bakke@gmail.com> wrote:
>>> Hi,
>>> I have a syslogtcp source using a default memory channel and Kafka
>>> sink. When producing data as fast as possible (3000 syslog events in a
>>> second), the source seems to accept all the data, but will crash due
>>> to ChannelFullException when adding the event to the channel.
>>> Is there any way to throttle or otherwise delay receiving more syslog
>>> events until the channel is available again, rather than crashing because
>>> the channel is full? I would prefer that Flume accept syslog
>>> events more slowly rather than crashing and dropping events.
>>> 29 Jan 2015 16:26:56,721 ERROR [New I/O  worker #2]
>>> (org.apache.flume.source.SyslogTcpSource$syslogTcpHandler.messageReceived:94)
>>>  - Error writting to channel, event dropped
>>> Also, the syslogtcp seems to keep the syslog headers regardless of the
>>> keepFields setting, is there any common reason for why this might
>>> happen? In contrast, the multiport syslog tcp listener works as
>>> expected with this particular setting.