From flume-user-return-369-apmail-incubator-flume-user-archive=incubator.apache.org@incubator.apache.org Mon Oct 17 21:13:04 2011
From: Cameron Gandevia
Date: Mon, 17 Oct 2011 14:11:58 -0700
Subject: Re: Weird interrupted exception in DirectDriver during append
To: flume-user@incubator.apache.org
Reply-To: flume-user@incubator.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=000e0cd61e3c085cb304af8511c0

--000e0cd61e3c085cb304af8511c0
Content-Type: text/plain; charset=UTF-8

I have been experiencing a similar error and have noticed that it only happens when I have a large number of files open to HDFS. I am running some tests with the bucketing removed to see if I can send the same files. I will let you know if I come across anything. I have around 50 nodes writing to a single collector. This error consistently happens within 10 minutes of starting my collector.

My exception:

2011-10-17 17:30:07,173 [Roll-TriggerThread-0] INFO com.cloudera.flume.handlers.hdfs.CustomDfsSink - done writing raw file to hdfs
2011-10-17 17:30:07,189 [logicalNode collector0_log_dir-19] ERROR com.cloudera.flume.core.connector.DirectDriver - Closing down due to exception during append calls
2011-10-17 17:30:07,190 [logicalNode collector0_log_dir-19] INFO com.cloudera.flume.core.connector.DirectDriver - Connector logicalNode collector0_log_dir-19 exited with error: Blocked append interrupted by rotation event
java.lang.InterruptedException: Blocked append interrupted by rotation event
        at com.cloudera.flume.handlers.rolling.RollSink.append(RollSink.java:209)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.core.MaskDecorator.append(MaskDecorator.java:43)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.append(InsistentOpenDecorator.java:169)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:71)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:113)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
        at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:222)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.core.extractors.DateExtractor.append(DateExtractor.java:129)
        at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
        at com.cloudera.flume.core.extractors.RegexExtractor.append(RegexExtractor.java:88)
        at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:133)
2011-10-17 17:30:07,191 [logicalNode collector0_log_dir-19] INFO com.cloudera.flume.collector.CollectorSource - closed
2011-10-17 17:30:08,191 [logicalNode collector0_log_dir-19] INFO com.cloudera.flume.handlers.thrift.ThriftEventSource - Closed server on port 36892...
2011-10-17 17:30:08,191 [logicalNode collector0_log_dir-19] INFO com.cloudera.flume.handlers.thrift.ThriftEventSource - Queue still has 1000 elements ...
2011-10-17 17:30:18,200 [logicalNode collector0_log_dir-19] WARN com.cloudera.flume.handlers.thrift.ThriftEventSource - Close timed out due to no progress. Closing despite having 1000 values still enqueued
2011-10-17 17:30:18,200 [logicalNode collector0_log_dir-19] INFO com.cloudera.flume.handlers.rolling.RollSink - closing RollSink 'escapedCustomDfs("hdfs://van-mang-perf-hadoop-namenode1.net:8020/rawLogs/%{dateyear}-%{datemonth}-%{dateday}/%{datehr}00","raw-%{rolltag}" )'
2011-10-17 17:30:18,200 [logicalNode collector0_log_dir-19] INFO com.cloudera.flume.handlers.rolling.RollSink - double close 'escapedCustomDfs("hdfs://van-mang-perf-hadoop-namenode1.net:8020/rawLogs/%{dateyear}-%{datemonth}-%{dateday}/%{datehr}00","raw-%{rolltag}" )'
2011-10-17 17:30:18,200 [logicalNode collector0_log_dir-19] ERROR com.cloudera.flume.core.connector.DirectDriver - Exiting driver logicalNode collector0_log_dir-19 in error state CollectorSource | RegexExtractor because Blocked append interrupted by rotation event

On Mon, Oct 17, 2011 at 12:47 PM, Stephen Layland wrote:

> Hi, we're not actually using flume nodes, but just the collector at the
> moment.
> We're listening on a syslog port and dumping straight to HDFS for now.
> After some digging, I'm pretty sure it's related to this:
>
> https://issues.apache.org/jira/browse/FLUME-757
>
> -Steve
>
>
> On Mon, Oct 17, 2011 at 11:53 AM, AD wrote:
>
>> Weird, I have been seeing the same thing. Do you have the node and
>> collector on different hosts? Are you using HBase by chance?
>>
>>
>> On Mon, Oct 17, 2011 at 1:48 PM, Stephen Layland <stephen.layland@gmail.com> wrote:
>>
>>> Hi, after letting flume idle for the weekend listening in on a small
>>> stream of live data, we noticed several of our flume collector nodes
>>> failing with InterruptedExceptions being thrown. Logs have errors that
>>> look something like below. Any idea of what's going on here and how to
>>> fix it?
>>>
>>>> 2011-10-17 08:02:02,107 INFO com.cloudera.flume.handlers.debug.StubbornAppendSink: append Interrupted event 'flume-node.lindenlab.com' [INFO Mon Oct 17 08:01:59 UTC 2011] { syslogfacility : 16 } { syslogseverity : 6 } Oct 17 01:01:59 SOME MESSAGE' with error: Blocked append interrupted by rotation event
>>>> 2011-10-17 08:02:02,107 INFO com.cloudera.flume.handlers.rolling.RollSink: closing RollSink 'escapedCustomDfs("hdfs://master-node:54310/logs/raw/%Y/%m/%d/%H00","test%{rolltag}" )'
>>>> 2011-10-17 08:02:02,109 ERROR com.cloudera.flume.core.connector.DirectDriver: Closing down due to exception during append calls
>>>> java.lang.InterruptedException
>>>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1223)
>>>>         at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:976)
>>>>         at com.cloudera.flume.handlers.rolling.RollSink.close(RollSink.java:296)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.close(InsistentOpenDecorator.java:175)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:78)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:113)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
>>>>         at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:222)
>>>>         at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:110)
>>>> 2011-10-17 08:02:02,109 INFO com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode node6-22 exited with error: null
>>>> java.lang.InterruptedException
>>>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1223)
>>>>         at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:976)
>>>>         at com.cloudera.flume.handlers.rolling.RollSink.close(RollSink.java:296)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.close(InsistentOpenDecorator.java:175)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:78)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:113)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
>>>>         at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:222)
>>>>         at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:110)
>>>> 2011-10-17 08:02:02,109 INFO com.cloudera.flume.core.connector.DirectDriver: Connector logicalNode node6-22 exited with error: null
>>>> java.lang.InterruptedException
>>>>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1223)
>>>>         at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:976)
>>>>         at com.cloudera.flume.handlers.rolling.RollSink.close(RollSink.java:296)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.handlers.debug.InsistentOpenDecorator.close(InsistentOpenDecorator.java:175)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.close(EventSinkDecorator.java:67)
>>>>         at com.cloudera.flume.handlers.debug.StubbornAppendSink.append(StubbornAppendSink.java:78)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.debug.InsistentAppendDecorator.append(InsistentAppendDecorator.java:110)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.endtoend.AckChecksumChecker.append(AckChecksumChecker.java:113)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.batch.UnbatchingDecorator.append(UnbatchingDecorator.java:62)
>>>>         at com.cloudera.flume.core.EventSinkDecorator.append(EventSinkDecorator.java:60)
>>>>         at com.cloudera.flume.handlers.batch.GunzipDecorator.append(GunzipDecorator.java:81)
>>>>         at com.cloudera.flume.collector.CollectorSink.append(CollectorSink.java:222)
>>>>         at com.cloudera.flume.core.connector.DirectDriver$PumperThread.run(DirectDriver.java:110)
>>>
>>>
>>> Many thanks,
>>>
>>> -Steve
>>>
>>
>>
>

--
Thanks

Cameron Gandevia

--000e0cd61e3c085cb304af8511c0--