flume-user mailing list archives

From Souvik Bose <souvik.b...@delgence.com>
Subject Exception Handling with Flume
Date Mon, 08 Dec 2014 11:34:52 GMT
Hello All,
I am stuck on a problem with Flume version 1.4.0. I am using a
spooling directory source with a custom interceptor to process encoded
GPS files and save them in HDFS and Solr (using the Morphline Solr
sink). The main information is stored in the name of each file that
arrives in the spool directory; the content is irrelevant. So I am
using the custom interceptor to extract and transform the file header
and emit the extracted data in JSON format as the output of the event.
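
For illustration only, the name-to-JSON transformation described above might look like the minimal sketch below. It is not the actual interceptor: the file name layout (`deviceId_latitude_longitude_epochSeconds.gps`), the class `GpsFileNameParser`, and the JSON field names are all hypothetical. It is plain Java with no Flume dependency, and a malformed name yields an empty result instead of an exception.

```java
import java.util.Optional;
import java.util.regex.Pattern;

/** Sketch only: the file name layout is hypothetical, not the real GPS naming scheme. */
final class GpsFileNameParser {

    private static final String SUFFIX = ".gps";
    // Restricting the id to safe characters avoids any need for JSON escaping.
    private static final Pattern DEVICE_ID = Pattern.compile("[A-Za-z0-9-]+");

    private GpsFileNameParser() {}

    /**
     * Converts a name such as "dev42_12.5_77.25_1418038492.gps" to a JSON string.
     * Returns Optional.empty() for anything malformed, so the caller can
     * discard the file instead of failing on it.
     */
    static Optional<String> toJson(String fileName) {
        if (fileName == null || !fileName.endsWith(SUFFIX)) {
            return Optional.empty();
        }
        String base = fileName.substring(0, fileName.length() - SUFFIX.length());
        String[] parts = base.split("_");
        if (parts.length != 4 || !DEVICE_ID.matcher(parts[0]).matches()) {
            return Optional.empty();
        }
        try {
            double lat = Double.parseDouble(parts[1]);
            double lon = Double.parseDouble(parts[2]);
            long ts = Long.parseLong(parts[3]);
            // The range check also rejects NaN, which is not valid JSON.
            boolean inRange = lat >= -90 && lat <= 90 && lon >= -180 && lon <= 180;
            if (!inRange) {
                return Optional.empty();
            }
            return Optional.of("{\"device\":\"" + parts[0] + "\",\"lat\":" + lat
                    + ",\"lon\":" + lon + ",\"ts\":" + ts + "}");
        } catch (NumberFormatException e) {
            // A non-numeric field means the name does not follow the layout.
            return Optional.empty();
        }
    }
}
```

Inside a Flume interceptor, logic like this would presumably live in `intercept(Event)`, where returning `null` drops the event rather than passing it downstream.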
My problems are:

1. When a 0-byte file comes in (files generally arrive with a "!"
symbol in the content), Flume stops and throws an exception. We don't
need the content of the file in any case, but we still hit the
exception because Flume cannot handle 0-byte files.
2. When the content contains unusual characters such as !f!, Flume
stops with an exception.
3. Even when everything is running fine, I am losing some data/events.
On closer inspection I found that some events are available in HDFS
but not in Solr, and vice versa. I am not using any sink processors
(sink groups) such as failover or load balancing. Is that the cause?

I want a solution where any exception is handled, the file/data that
caused it is discarded, and Flume moves on to the next file in the
spool directory. The data comes in at high velocity, 100 files every
second. Manually deleting the offending file and restarting Flume is
what I regularly do to get everything back on track, but I am sure
there must be a better way to handle this case. Can you please suggest
some better alternatives to my approach?
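
The discard-and-continue behaviour asked for here can be sketched in plain Java as below. This is only an illustration of the pattern, not Flume code: the class `SkipOnError` and the method `processAll` are hypothetical names, and the sketch does not cover failures that happen inside Flume before any custom code runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Sketch only: a generic "skip the bad item, keep going" loop. */
final class SkipOnError {

    private SkipOnError() {}

    /**
     * Applies step to every input in order. An input whose step throws a
     * RuntimeException or returns null is added to rejected, and
     * processing continues with the next input.
     */
    static <T, R> List<R> processAll(List<T> inputs,
                                     Function<? super T, ? extends R> step,
                                     List<T> rejected) {
        List<R> results = new ArrayList<>();
        for (T input : inputs) {
            try {
                R result = step.apply(input);
                if (result == null) {
                    rejected.add(input);   // treated as "discard this one"
                } else {
                    results.add(result);
                }
            } catch (RuntimeException e) {
                rejected.add(input);       // one bad input must not stop the rest
            }
        }
        return results;
    }
}
```

Collecting the rejected inputs instead of silently dropping them keeps a record of what was discarded, which helps when reconciling counts between destinations.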

Thanks & Regards,
Souvik Bose
