I could not easily find the exact log entry that caused the issue - all I had were 30M input log files :).
After further debugging, I figured out what the issue was. Here is what happened.

For production, we use the Exec source with 'tail -f'. For my local testing I use a spooling directory (spooldir) source. The issue happened when I was using the spooldir source and a log file had non-UTF-8 characters.
However, the exception that I posted did not come from processing the log file itself! The flow was as follows:
1. Flume is started with the spooldir source
2. a log file with non-UTF-8 chars is moved into the spooldir
3. Flume starts processing, encounters a "bad" character and stops (no errors or anything)
4. I kill Flume manually and restart it - without cleaning out its .flumespool dir
5. Flume starts up and now chokes while processing its own .flumespool dir and the left-over file in there! - this is where the MalformedInputException came from

When I processed the same file via the Exec source with a 'tail -n 10000 ..' command, it was processed successfully - which told me the issue is specific to the spooldir source.

The solution was to add this parameter to the spooldir source:
a1.sources.r1.inputCharset = ISO8859-1
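
To illustrate why changing the charset helps, here is a minimal sketch using only the JDK. It is not Flume's actual code, and the class name, method name, and sample bytes are made up for the example. It assumes the decoder is asked to report malformed input rather than replace it, which is consistent with the MalformedInputException in the stack trace. A byte such as 0xE9 ('é' in ISO-8859-1) is not valid on its own in UTF-8, while ISO-8859-1 maps every byte value to a character and therefore cannot fail this way.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Flume.
class CharsetDemo {

    // Decode bytes strictly: malformed input is reported as an exception
    // instead of being silently replaced.
    static String decodeStrict(byte[] bytes, Charset charset)
            throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) throws Exception {
        // "caf" + 0xE9 + " latte": the 0xE9 byte is 'é' in ISO-8859-1,
        // but it is not a valid standalone byte in UTF-8.
        byte[] line = {'c', 'a', 'f', (byte) 0xE9, ' ', 'l', 'a', 't', 't', 'e'};

        try {
            decodeStrict(line, StandardCharsets.UTF_8);
            System.out.println("UTF-8: decoded without error");
        } catch (MalformedInputException e) {
            System.out.println("UTF-8: failed with " + e);
        }

        // ISO-8859-1 accepts every byte value, so this never throws.
        System.out.println("ISO-8859-1: " + decodeStrict(line, StandardCharsets.ISO_8859_1));
    }
}
```

Note that decoding as ISO-8859-1 never fails, but it also means that logs which really are UTF-8 will have their multi-byte characters read as several separate characters, so this setting fits best when the input is known to be single-byte Latin-1 or mostly ASCII.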


From: Jeff Lord <jlord@cloudera.com>
To: "user@flume.apache.org" <user@flume.apache.org>; Marina <ppine7@yahoo.com>
Sent: Monday, March 9, 2015 11:17 AM
Subject: Re: MalformedInputException processing logs from Varnish server

Do you have a sample of the characters/data which you believe to be causing this?
Can you just confirm you are using apache version of flume or a specific distro?
Also in your message you mention that you are using tail -f which would be the exec source but the stack trace looks like you are actually using the spooldir source.



On Mon, Mar 9, 2015 at 10:26 AM, Marina <ppine7@yahoo.com> wrote:
I have configured Flume to "tail -f" logs from my Varnish server - pretty much standard Apache HTTP logs.
However, sometimes Flume chokes on some special characters and dies - stops processing new log entries.

See below for a stack trace.

It seems like this exact issue was reported as a Flume bug in version 1.4.x:
and it was marked as resolved in version 1.5.0.
The version I am using is Flume 1.5.2 - and I am still seeing this issue...

Could somebody confirm or deny whether what I am seeing is the same issue, which should have been fixed? Or is this something completely different?

06 Mar 2015 18:16:57,820 ERROR [pool-3-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:256)  - FATAL: Spool Directory source r1: { spoolDir: /data1/varnish-logs-active }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:195)
at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:238)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:227)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)