flume-user mailing list archives

From "Hari Shreedharan" <hshreedha...@cloudera.com>
Subject Re: Exception Handling with Flume
Date Mon, 08 Dec 2014 18:06:43 GMT
You are likely reading from the same channel for both sinks. In that case each event is
delivered to only one of the sinks, not both. You’d need 2 channels connected to the same
source, with each sink reading from its own channel.
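
As a rough sketch of that layout (not the poster's actual configuration), an agent could be wired as below. The agent name `agent`, the component names `spool`, `hdfsChannel`, `solrChannel`, `hdfsSink`, `solrSink`, and all paths are made up for illustration; the custom interceptor from the original setup is left out. The replicating selector is Flume's default, so the selector line is only there to make the fan-out explicit.

```properties
# Hypothetical names and paths throughout; adjust to the real setup.
agent.sources = spool
agent.channels = hdfsChannel solrChannel
agent.sinks = hdfsSink solrSink

# One source writing every event to BOTH channels.
agent.sources.spool.type = spooldir
agent.sources.spool.spoolDir = /path/to/spool
agent.sources.spool.channels = hdfsChannel solrChannel
agent.sources.spool.selector.type = replicating

# One channel per sink (memory channels assumed here for brevity).
agent.channels.hdfsChannel.type = memory
agent.channels.solrChannel.type = memory

# Each sink drains its own channel, so each one sees every event.
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = /path/in/hdfs
agent.sinks.hdfsSink.channel = hdfsChannel

agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.morphlineFile = /path/to/morphline.conf
agent.sinks.solrSink.channel = solrChannel
```

With a single shared channel, whichever sink takes an event first removes it from the channel, which would explain some events appearing only in HDFS and others only in Solr.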

About the Spool Dir not processing data: what format, serializer, etc. are you using?


On Mon, Dec 8, 2014 at 3:37 AM, Souvik Bose <souvik.bose@delgence.com> wrote:

> Hello All,
> I am stuck with a problem with flume version 1.4.0. I am using 
> spooldirectory source with a custom interceptor to process encoded gps 
> files and save it in hdfs and solr (using morphline solr sink). The main 
> information is stored in the file name itself, which comes in on the 
> spool directory and the content is irrelevant. So I am using the custom 
> interceptor to extract and transform the file header and store the 
> extracted data in Json format as the output of the event.
> My problem comes in:
> 1. When a 0 byte file comes in (generally files come in with a 
> "!" symbol in the content), flume stops and throws an exception. We don't 
> need the content of the file in any case, but we still get an exception, as 
> flume cannot handle 0 byte files.
> 2. When there is content with some weird characters like !f!, flume 
> stops with an exception.
> 3. Even when everything is running fine, I am losing some data/ events. 
> On closer inspection I found that some are available in hdfs but not 
> in solr and vice versa. I am not using any processor sinkgroups like 
> failover or load balancing. Is it because of that?
> I want to achieve a solution where I can handle any exceptions and the 
> file/data which causes the exception is discarded and flume processes 
> the next file in the spool directory. The data comes in at high velocity, 
> 100 files every second. So manually deleting the file and restarting 
> flume is what I regularly do to get everything back on track. But 
> I am sure there must be some better way to handle this case. Can you 
> guys please suggest some better alternatives to my approach?
> Thanks & Regards,
> Souvik Bose