flume-user mailing list archives

From Mingjie Lai <mjla...@gmail.com>
Subject Re: Events silently dropped in DFO on upstream collector ERROR
Date Thu, 03 Nov 2011 20:45:54 GMT


I think you have two issues here:

1) As described in FLUME-798, you saw an interruption exception at
collectors that have a rollsink + dfs sink.

2) The agent chain doesn't switch to a backup collector if the primary
one gets into an ERROR state.

There is an early patch for FLUME-798, but it needs more work before it
can be pushed to trunk. It's quite an important issue. I may have time to
work on it this week or next to push it in.

I'm not aware of 2). Do you want to file a JIRA?


On 11/01/2011 02:43 AM, Björn Edström wrote:
> Hello List,
>
> I have a setup like this:
>
> agent: source | agentDFOChain("collector1:35853", "collector2:35853")
> collector1: collectorSource(35853) | collectorSink(
> "hdfs://namenode.company.net:54310/flume/", "%Y_%m_%d_%H-")
> collector2: collectorSource(35853) | collectorSink(
> "hdfs://namenode.company.net:54310/flume/", "%Y_%m_%d_%H-")
>
> If both collectors are running, and then collector1 gets into an ERROR
> state (such as because of the DirectDriver issues discussed on this
> list), events are silently dropped. No fail-over takes place to the
> other node in the chain, and no events are written to disk.
>
> Otherwise, as long as no collector is in ERROR, everything works as
> expected. If I cleanly shut down collector1, the other collector will
> start receiving traffic (as expected). If I then shut down collector2,
> the agent will start writing data to /var/lib/flume as it can't send
> the data upstream (as expected).
>
> Is this a known issue?
>
> Best regards
> Björn Edström
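
For anyone reading this thread later, the fail-over behavior the original
poster expects can be sketched roughly as below. This is only an
illustration of the expected semantics, not Flume code: FailoverSketch,
Collector, deliver and the in-memory diskBuffer list are all hypothetical
names made up for this example, and the list merely stands in for the
on-disk buffer (/var/lib/flume in the setup above). The assumption is that
a collector in an ERROR state should be treated the same way as an
unreachable one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the expected fail-over order; not Flume's implementation. */
public class FailoverSketch {

    /** Hypothetical stand-in for a connection to one collector in the chain. */
    interface Collector {
        void send(String event) throws Exception;
    }

    /**
     * Tries each collector in chain order. If every collector fails, the
     * event is kept in diskBuffer instead of being dropped.
     *
     * @return index of the collector that accepted the event, or -1 if buffered
     */
    static int deliver(String event, List<Collector> chain, List<String> diskBuffer) {
        for (int i = 0; i < chain.size(); i++) {
            try {
                chain.get(i).send(event);
                return i;
            } catch (Exception e) {
                // Assumption: a collector in ERROR is handled like one that
                // is down, so we fall through to the next one in the chain.
            }
        }
        diskBuffer.add(event);
        return -1;
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        List<String> disk = new ArrayList<>();

        // collector1 simulates the ERROR state, collector2 is healthy.
        Collector collector1 = e -> { throw new IllegalStateException("collector1 in ERROR"); };
        Collector collector2 = received::add;

        int first = deliver("event-1", Arrays.asList(collector1, collector2), disk);
        int second = deliver("event-2", Arrays.asList(collector1, collector1), disk);

        System.out.println("event-1 accepted by chain index " + first);
        System.out.println("event-2 result " + second + ", buffered: " + disk);
    }
}
```

In the report above, the clean-shutdown case already follows this order
(collector2 takes over, then the disk buffer), while the ERROR case does
not reach either fallback, which is why events are lost.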
