flume-user mailing list archives

From Snehal Nagmote <nagmote.sne...@gmail.com>
Subject Re: Flume HDFS Sink Issue (IO Exception - hdfs.DFSClient$DFSOutputStream.sync)
Date Tue, 26 Nov 2013 19:57:37 GMT
Sorry, forgot to mention: we are using Hadoop 1.2.0.


On 26 November 2013 11:07, Snehal Nagmote <nagmote.snehal@gmail.com> wrote:

> Hello All,
>
> We are using the HDFS sink with Flume, and it runs into an HDFS
> IOException very often.
>
> I am using Apache Flume 1.4.0 (HDP). We have a two-tier topology, and the
> collector is not on a datanode. The collector fails often and throws
> java.io.IOException: DFSOutputStream is closed
>
> java.io.IOException: DFSOutputStream is closed
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:4097)
>     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:4084)
>     at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
>     at org.apache.flume.sink.hdfs.HDFSDataStream.sync(HDFSDataStream.java:117)
>     at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:356)
>     at org.apache.flume.sink.hdfs.BucketWriter$5.call(BucketWriter.java:353)
>     at org.apache.flume.sink.hdfs.BucketWriter$8$1.run(BucketWriter.java:536)
>     at org.apache.flume.sink.hdfs.BucketWriter.runPrivileged(BucketWriter.java:160)
>     at org.apache.flume.sink.hdfs.BucketWriter.access$1000(BucketWriter.java:56)
>     at org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:533)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> This is how the configuration looks:
>
>
> agent.sinks.hdfs-sink.type = hdfs
> agent.sinks.hdfs-sink.hdfs.filePrefix = %Y%m%d%H-events-1
> agent.sinks.hdfs-sink.hdfs.path = hdfs://bi-hdnn01.sjc.kixeye.com:8020/flume/logs/%Y%m%d/%H/
> agent.sinks.hdfs-sink.hdfs.fileSuffix = .done
> agent.sinks.hdfs-sink.hdfs.fileType = DataStream
> agent.sinks.hdfs-sink.hdfs.writeFormat = Text
> agent.sinks.hdfs-sink.hdfs.rollInterval = 0
> agent.sinks.hdfs-sink.hdfs.rollSize = 0
> agent.sinks.hdfs-sink.hdfs.rollCount = 0
> agent.sinks.hdfs-sink.hdfs.batchSize = 10000
> agent.sinks.hdfs-sink.hdfs.threadsPoolSize = 10000
> agent.sinks.hdfs-sink.hdfs.rollTimerPoolSize = 10
> agent.sinks.hdfs-sink.hdfs.callTimeout = 500000
>
>
> Earlier I was using rollInterval = 30. I changed it to 0 because of the
> above exception, and then I started seeing a new exception:
>
> Failed to renew lease for [DFSClient_NONMAPREDUCE_1307546979_31] for 30
> seconds. Will retry shortly ...
>
> java.io.IOException: Call to bi-hdnn01.sjc.kixeye.com/10.54.208.14:8020 failed
> on local exception: java.io.IOException:
>
> Caused by: java.io.IOException: Connection reset by peer
>
>
> Because of these exceptions, our production downstream processing gets a
> lot slower and needs frequent restarts, and the upstream process fills the
> channels. Does anyone know what could be the cause and how we can avoid it?
>
> Any thoughts would be really helpful; it has been extremely difficult to
> debug this.
>
>
> Thanks,
> Snehal
>

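[Editor's note: with rollInterval, rollSize, and rollCount all set to 0, the HDFS sink never rolls files, so each BucketWriter keeps a single DFSOutputStream open indefinitely; any transient datanode failure then surfaces as the "DFSOutputStream is closed" sync error above. A commonly used alternative is a time-based roll combined with idleTimeout so that stale writers are closed. The sketch below uses documented Flume 1.4 HDFS sink properties, but the specific values are illustrative assumptions, not a verified fix for this cluster.]

```
# Illustrative roll policy (values are assumptions, not a verified fix):
# roll files every 5 minutes instead of never, and close bucket writers
# that have been idle for 2 minutes so stale streams are not reused.
agent.sinks.hdfs-sink.hdfs.rollInterval = 300
agent.sinks.hdfs-sink.hdfs.rollSize = 0
agent.sinks.hdfs-sink.hdfs.rollCount = 0
agent.sinks.hdfs-sink.hdfs.idleTimeout = 120
```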