flume-user mailing list archives

From no jihun <jees...@gmail.com>
Subject Re: File left as OPEN_FOR_WRITE state.
Date Wed, 20 Jul 2016 08:01:58 GMT
@Chris If you meant hdfs.callTimeout:
I am now running a test with that.

I can increase the value, but when a timeout occurs during close, will the
close never be retried (as in the logs above)?
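
Concretely, raising it would look like this in the sink configuration (the
value is only an example; the 10000 ms in the log matches the documented
default for hdfs.callTimeout):

hadoop2.sinks.hdfs2.hdfs.callTimeout = 60000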

2016-07-20 16:50 GMT+09:00 Chris Horrocks <chris@hor.rocks>:

> Have you tried increasing the HDFS sink timeouts?
>
> --
> Chris Horrocks
>
>
> On Wed, Jul 20, 2016 at 8:03 am, no jihun <jeesim2@gmail.com> wrote:
>
> Hi.
>
> I found some files on HDFS left in OPEN_FOR_WRITE state.
>
> *This is Flume's log for the file:*
>
>
> 01  18 7 2016 16:12:02,765 INFO  [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.sink.hdfs.BucketWriter.open:234)
> 02 - Creating 1468825922758.avro.tmp
> 03  18 7 2016 16:22:39,812 INFO  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$5.call:429)
> 04 - Closing idle bucketWriter 1468825922758.avro.tmp at 1468826559812
> 05  18 7 2016 16:22:39,812 INFO  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.close:363)
> 06 - Closing 1468825922758.avro.tmp
> 07  18 7 2016 16:22:49,813 WARN  [hdfs-hdfs2-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter.close:370)
> 08 - failed to close() HDFSWriter for file (1468825922758.avro.tmp). Exception follows.
> 09 java.io.IOException: Callable timed out after 10000 ms on file: 1468825922758.avro.tmp
> 10  18 7 2016 16:22:49,816 INFO  [hdfs-hdfs2-call-runner-7] (org.apache.flume.sink.hdfs.BucketWriter$8.call:629)
> 11 - Renaming 1468825922758.avro.tmp to 1468825922758.avro
>
>
> - it seems the close was never retried
> - Flume just renamed the file while it was still open.
>
>
> *Two days later I found that file with this command:*
>
> hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" | grep "2016/07/18" |
>   sed 's/\/data\/flume\// \/data\/flume\//g' | grep -v ".avro.tmp" |
>   sed -n 's/.*\(\/data\/flume\/.*avro\).*/\1/p'
>
>
>
> *So, I ran recoverLease on it:*
>
> hdfs debug recoverLease -path 1468825922758.avro -retries 3
> recoverLease returned false.
> Retrying in 5000 ms...
> Retry #1
> recoverLease SUCCEEDED on 1468825922758.avro
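>
> *A sketch that combines the two steps above into one pass, assuming the
> sed filter prints one bare .avro path per line:*
>
> hdfs fsck /data/flume -openforwrite | grep "OPENFORWRITE" | grep -v ".avro.tmp" |
>   sed -n 's/.*\(\/data\/flume\/.*avro\).*/\1/p' |
>   while read f; do hdfs debug recoverLease -path "$f" -retries 3; done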
>
>
>
> *My HDFS sink configuration:*
>
> hadoop2.sinks.hdfs2.type = hdfs
> hadoop2.sinks.hdfs2.channel = fileCh1
> hadoop2.sinks.hdfs2.hdfs.fileType = DataStream
> hadoop2.sinks.hdfs2.serializer = ....
> hadoop2.sinks.hdfs2.serializer.compressionCodec = snappy
> hadoop2.sinks.hdfs2.hdfs.filePrefix = %{type}_%Y-%m-%d_%{host}
> hadoop2.sinks.hdfs2.hdfs.fileSuffix = .avro
> hadoop2.sinks.hdfs2.hdfs.rollInterval = 3700
> #hadoop2.sinks.hdfs2.hdfs.rollSize = 67000000
> hadoop2.sinks.hdfs2.hdfs.rollSize = 800000000
> hadoop2.sinks.hdfs2.hdfs.rollCount = 0
> hadoop2.sinks.hdfs2.hdfs.batchSize = 10000
> hadoop2.sinks.hdfs2.hdfs.idleTimeout = 300
>
>
> hdfs.closeTries and hdfs.retryInterval are both not set.
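>
> *As I understand it, the defaults are hdfs.closeTries = 0 (keep retrying
> the close/rename) and hdfs.retryInterval = 180 seconds; setting them
> explicitly would look like this (values only as an illustration):*
>
> hadoop2.sinks.hdfs2.hdfs.closeTries = 0
> hadoop2.sinks.hdfs2.hdfs.retryInterval = 180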
>
>
> *My questions*
> Why was '1468825922758.avro' left in OPEN_FOR_WRITE state even though it
> was renamed to .avro successfully?
> Is this expected behavior? If so, what should I do to eliminate these
> anomalous OPENFORWRITE files?
>
> Regards,
> Jihun.
>
>


-- 
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter          : @nozisim
Facebook       : nozisim
Website         : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps   : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>
