flume-user mailing list archives

From Matt Wise <m...@nextdoor.com>
Subject Re: How to get a bad message out of the channel?
Date Thu, 23 May 2013 16:49:21 GMT
Mike,
  We do that, but somehow we ended up with an event or two in the pipeline that were
bad. It would be really nice if there were some way to choose what to do when a bad event is
found, rather than letting the pipeline fill up quickly. I.e.:
   a) Dump the event to a data file and throw a warning in the log messages?
   b) Throw the event away
   c) Move the event to an alternate channel where it can be handled differently 

  Anything other than "stop pulling data from the channel and let the channel fill."

--Matt

On May 22, 2013, at 12:39 AM, Mike Percy <mpercy@apache.org> wrote:

> Hi Matt,
> Nope, there is currently no way to do that. But you could use the timestamp interceptor
to make sure your events always have those headers.
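
A minimal sketch of that interceptor setup, using illustrative agent/source names (`a1`, `r1`); `timestamp` is the built-in alias for Flume's TimestampInterceptor:

```properties
# Attach the timestamp interceptor so every event gets a "timestamp" header
# before it reaches the channel (a1/r1 are placeholder names)
a1.sources.r1.interceptors = ts
a1.sources.r1.interceptors.ts.type = timestamp
```

With this in place, the HDFS sink's time-based escapes (%Y, %m, etc.) always have a header to resolve against.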
> 
> Mike
> 
> 
> On Mon, May 13, 2013 at 12:13 PM, Matt Wise <matt@nextdoor.com> wrote:
> Great, that's working... thank you. Is there a way to give the HDFS plugin a 'failsafe'
path to write messages to when they are missing that kind of data?
> 
> --Matt
> 
> On May 10, 2013, at 6:30 PM, Mike Percy <mpercy@apache.org> wrote:
> 
> > Hook up a HDFS sink to them that doesn't use %Y, %m, etc in the configured path.
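
A hedged sketch of such a failsafe sink, assuming an agent named `a1` and a sink named `k2` (names and the HDFS URL are illustrative); because the path contains no %Y/%m/%d escapes, events without a timestamp header can still be written:

```properties
# Failsafe HDFS sink with a static path: no time-based escape sequences,
# so no timestamp header is required to resolve the bucket
a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = hdfs://namenode/flume/failsafe
```

Alternatively, if your Flume version supports it, setting `hdfs.useLocalTimeStamp = true` on the sink makes it use the local clock instead of the event's timestamp header for bucketing.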
> >
> > HTH,
> > Mike
> >
> > On May 10, 2013, at 11:00 AM, Matt Wise <matt@nextdoor.com> wrote:
> >
> >> Eek, this was worse than I thought. It turns out messages continued to be added
to the channels, but no transactions could complete to take messages out of the channel. I've
moved the file channels out of the way and restarted the service for now... but how can I
recover the rest of the data in these file channels?
> >>
> >> On May 10, 2013, at 10:29 AM, Matt Wise <matt@nextdoor.com> wrote:
> >>
> >>> We were messing around with a few settings today and ended up getting a
few messages into our channel that are bad (corrupt time field). How can I clear them out?
> >>>
> >>>> 10 May 2013 17:28:26,920 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor]
(org.apache.flume.SinkRunner$PollingRunner.run:160)  - Unable to deliver event. Exception
follows.
> >>>> org.apache.flume.EventDeliveryException: java.lang.RuntimeException:
Flume wasn't able to parse timestamp header in the event to resolve time based bucketing.
Please check that you're correctly populating timestamp header (for example using TimestampInterceptor
source interceptor).
> >>>>   at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
> >>>>   at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> >>>>   at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> >>>>   at java.lang.Thread.run(Thread.java:679)
> >>>> Caused by: java.lang.RuntimeException: Flume wasn't able to parse timestamp
header in the event to resolve time based bucketing. Please check that you're correctly populating
timestamp header (for example using TimestampInterceptor source interceptor).
> >>>>   at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
> >>>>   at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
> >>>>   at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
> >>>>   ... 3 more
> >>>> Caused by: java.lang.NumberFormatException: null
> >>>>   at java.lang.Long.parseLong(Long.java:401)
> >>>>   at java.lang.Long.valueOf(Long.java:535)
> >>>>   at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
> >>>>   ... 5 more
> >>>
> >>> This message just keeps repeating over and over again; new events are coming
through just fine.
> >>
> 
> 

