We recently added functionality to the file channel integrity tool that can be used to remove bad events from the channel, though you would need to write some code to validate events. It will be in the soon-to-be-released 1.6.0.


On Fri, Apr 17, 2015 at 9:05 AM, Tao Li <litao.buptsse@gmail.com> wrote:

Hi all:

My use case is KafkaChannel + HDFSEventSink. 

I found that SinkRunner.PollingRunner calls HDFSEventSink.process() in a while loop. Suppose a message in Kafka contains dirty data: HDFSEventSink.process() consumes the message from Kafka and throws an exception because of the dirty data, so the Kafka offset is not committed. The outer loop then calls HDFSEventSink.process() again. Because the Kafka offset has not changed, HDFSEventSink consumes the same dirty data again, and the loop never stops.

I want to know whether there is a mechanism to cover this case. For example, a maximum number of retries for a given HDFSEventSink.process() call, giving up once that limit is exceeded.
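
To make the idea concrete, here is a minimal sketch of the kind of retry limit I mean. It is plain Java, not Flume code: BoundedRetryDemo, EventHandler, drain and maxAttempts are hypothetical names, and the in-memory queue only stands in for a channel whose offset is not committed when processing fails.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class BoundedRetryDemo {

    /** Hypothetical stand-in for the part of a sink that handles one event. */
    interface EventHandler {
        void handle(String event) throws Exception;
    }

    /**
     * Processes the queue head-first. A failing event is retried until it has
     * failed maxAttempts times in a row, then it is skipped and returned to the caller.
     */
    static List<String> drain(Deque<String> queue, EventHandler handler, int maxAttempts) {
        List<String> skipped = new ArrayList<>();
        int failures = 0;
        while (!queue.isEmpty()) {
            // peek() leaves the event in place, like an offset that is not yet committed
            String event = queue.peek();
            try {
                handler.handle(event);
                queue.poll();   // success: "commit" by removing the event
                failures = 0;
            } catch (Exception e) {
                failures++;
                if (failures >= maxAttempts) {
                    // give up on this event so the loop can make progress
                    skipped.add(queue.poll());
                    failures = 0;
                }
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("ok-1", "dirty", "ok-2"));
        List<String> written = new ArrayList<>();
        List<String> skipped = drain(queue, e -> {
            if (e.startsWith("dirty")) {
                throw new IllegalArgumentException("bad record");
            }
            written.add(e);
        }, 3);
        System.out.println("written=" + written + " skipped=" + skipped);
    }
}
```

With this, the dirty record is attempted three times and then set aside in the skipped list instead of blocking the two good records. In a real sink the skipped events would presumably need to be logged or routed somewhere for inspection rather than silently dropped.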