flume-user mailing list archives

From Rahul Ravindran <rahu...@yahoo.com>
Subject Re: IOException with HDFS-Sink:flushOrSync
Date Tue, 14 May 2013 02:23:32 GMT
We are using CDH 4.1.2 (Hadoop 2.0.0). It looks like CDH 4.2.1 uses the same Hadoop
version. Any suggestions for mitigations?

Sent from my phone. Excuse the terseness.

On May 13, 2013, at 7:12 PM, Hari Shreedharan <hshreedharan@cloudera.com> wrote:

> What version of Hadoop are you using? Looks like you are getting hit by https://issues.apache.org/jira/browse/HADOOP-6762.
> 
> 
> Hari
> 
> -- 
> Hari Shreedharan
> 
> On Monday, May 13, 2013 at 6:50 PM, Matt Wise wrote:
> 
>> So we've just had this happen twice to two different flume machines... we're using
>> the HDFS sink as well, but ours is writing to an S3N:// URL. Both times our sink stopped working
>> and the filechannel clogged up immediately causing serious problems. A restart of Flume worked
>> -- but the filechannel was so backed up at that point that it took a good long while to get
>> Flume started up again properly.
>> 
>> Anyone else seeing this behavior?
>> 
>> (oh, and we're running flume 1.3.0)
>> On May 7, 2013, at 8:42 AM, Rahul Ravindran <rahulrv@yahoo.com> wrote:
>> 
>>> Hi,
>>>    We have noticed this a few times now: we get an IOException
>>> from HDFS, and the sink stops draining the channel until the Flume process is restarted. Below
>>> are the logs. namenode-v01-00b is the active NameNode (namenode-v01-00a is standby). We are
>>> using Quorum Journal Manager for our NameNode HA, but no NameNode failover
>>> was initiated. If this is an expected error, should Flume handle it and gracefully retry (thereby
>>> not requiring a restart)?
>>> Thanks,
>>> ~Rahul.
>>> 
>>> 7 May 2013 06:35:02,494 WARN  [hdfs-hdfs-sink4-call-runner-2] (org.apache.flume.sink.hdfs.BucketWriter.append:378)
>>>  - Caught IOException writing to HDFSWriter (IOException flush:java.io.IOException: Failed
>>> on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host
>>> is: "flumefs-v01-10a.a.com/10.40.85.170"; destination host is: "namenode-v01-00a.a.com":8020;
>>> ). Closing file (hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-4//event.1367891734983.tmp)
>>> and rethrowing exception.
>>> 07 May 2013 06:35:02,494 WARN  [hdfs-hdfs-sink4-call-runner-2] (org.apache.flume.sink.hdfs.BucketWriter.append:384)
>>>  - Caught IOException while closing file (hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-4//event.1367891734983.tmp).
>>> Exception follows.
>>> java.io.IOException: IOException flush:java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170";
>>> destination host is: "namenode-v01-00a.a.com":8020;
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1617)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1499)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1484)
>>>   at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:116)
>>>   at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:95)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:345)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.access$500(BucketWriter.java:53)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:310)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:396)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
>>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>   at java.lang.Thread.run(Thread.java:662)
>>> 07 May 2013 06:35:02,495 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor]
>>> (org.apache.flume.sink.hdfs.HDFSEventSink.process:456)  - HDFS IO error
>>> java.io.IOException: IOException flush:java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170";
>>> destination host is: "namenode-v01-00a.a.com":8020;
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1617)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1499)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1484)
>>>   at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:116)
>>>   at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:95)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:345)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.access$500(BucketWriter.java:53)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:310)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.append(BucketWriter.java:396)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:729)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$2.call(HDFSEventSink.java:727)
>>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>   at java.lang.Thread.run(Thread.java:662)
>>> 07 May 2013 06:35:05,350 WARN  [hdfs-hdfs-sink1-call-runner-5] (org.apache.flume.sink.hdfs.BucketWriter.append:378)
>>>  - Caught IOException writing to HDFSWriter (IOException flush:java.io.IOException: Failed
>>> on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host
>>> is: "flumefs-v01-10a.a.com/10.40.85.170"; destination host is: "namenode-v01-00b.a.com":8020;
>>> ). Closing file (hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-1//event.1367891734999.tmp)
>>> and rethrowing exception.
>>> 07 May 2013 06:35:05,351 WARN  [hdfs-hdfs-sink1-call-runner-5] (org.apache.flume.sink.hdfs.BucketWriter.append:384)
>>>  - Caught IOException while closing file (hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-1//event.1367891734999.tmp).
>>> Exception follows.
>>> java.io.IOException: IOException flush:java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170";
>>> destination host is: "namenode-v01-00b.a.com":8020;
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1617)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1499)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1484)
>>>   at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:116)
>>>   at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:95)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:345)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.access$500(BucketWriter.java:53)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:310)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$3.call(HDFSEventSink.java:743)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$3.call(HDFSEventSink.java:741)
>>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>   at java.lang.Thread.run(Thread.java:662)
>>> 07 May 2013 06:35:05,352 WARN  [SinkRunner-PollingRunner-DefaultSinkProcessor]
>>> (org.apache.flume.sink.hdfs.HDFSEventSink.process:456)  - HDFS IO error
>>> java.io.IOException: IOException flush:java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170";
>>> destination host is: "namenode-v01-00b.a.com":8020;
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1617)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1499)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1484)
>>>   at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:116)
>>>   at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:95)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:345)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.access$500(BucketWriter.java:53)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:310)
>>>   at org.apache.flume.sink.hdfs.BucketWriter$4.run(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:143)
>>>   at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:308)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$3.call(HDFSEventSink.java:743)
>>>   at org.apache.flume.sink.hdfs.HDFSEventSink$3.call(HDFSEventSink.java:741)
>>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>   at java.lang.Thread.run(Thread.java:662)
>>> 07 May 2013 06:35:07,497 WARN  [hdfs-hdfs-sink4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.append:378)
>>>  - Caught IOException writing to HDFSWriter (IOException flush:java.io.IOException: Failed
>>> on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host
>>> is: "flumefs-v01-10a.a.com/10.40.85.170"; destination host is: "namenode-v01-00a.a.com":8020;
>>> ). Closing file (hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-4//event.1367891734983.tmp)
>>> and rethrowing exception.
>>> 07 May 2013 06:35:07,497 WARN  [hdfs-hdfs-sink4-call-runner-8] (org.apache.flume.sink.hdfs.BucketWriter.append:384)
>>>  - Caught IOException while closing file (hdfs://nameservice1/user/br/data_platform/eventstream/event/flumefs-v01-10a-4//event.1367891734983.tmp).
>>> Exception follows.
>>> java.io.IOException: IOException flush:java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is: "flumefs-v01-10a.a.com/10.40.85.170";
>>> destination host is: "namenode-v01-00a.a.com":8020;
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1617)
>>>   at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1499)
> 
