flume-user mailing list archives

From Gonzalo Herreros <gherre...@gmail.com>
Subject Re: Wrong disk space in HDFS
Date Wed, 07 Oct 2015 15:13:33 GMT
OS disk space is usually freed some time after you delete files in HDFS
(unless HDFS needs it right away). Check the available space on the HDFS
console to see whether the space has been freed.
HDFS allocates blocks, not space, so it doesn't matter if you kill the
process that requested the blocks.
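
To put rough numbers on the block point, here is a minimal sketch. It assumes the common defaults of a 128 MiB block size and a replication factor of 3 (neither is stated in this thread), and it assumes the worst case where every file still open for writing holds one full block on each replica. The 1000 is hdfs.maxOpenFiles from your config; the function name is made up for illustration.

```python
# Rough worst-case estimate of the space held by blocks still open for writing.
# Assumptions (not from this thread): 128 MiB block size, replication factor 3,
# and one full block held per open file on every replica.

MIB = 1024 ** 2
GIB = 1024 ** 3


def worst_case_open_block_bytes(open_files, block_size=128 * MIB, replication=3):
    """Bytes that open files could hold if each keeps one full block per replica."""
    return open_files * block_size * replication


if __name__ == "__main__":
    # 1000 is hdfs.maxOpenFiles from the sink configuration quoted below.
    held = worst_case_open_block_bytes(1000)
    print(f"worst case: {held / GIB:.0f} GiB held by open blocks")
```

With hdfs.idleTimeout=300 the files should close after five minutes without writes, so the real figure is probably far lower; this only shows the order of magnitude that open blocks can reach.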

Regards
Gonzalo
On Oct 7, 2015 3:07 PM, "Carlos Rojas Matas" <cmatas@despegar.com> wrote:

> Hi guys,
>
> We're facing a problem with the HDFS sink. We're rolling files on a daily
> basis into HDFS, and after a while we start receiving lack-of-space warnings.
> When we restart the HDFS cluster, the available space is reported correctly
> again. Even if we kill the agent, the remaining space is still wrong. It's
> as if Flume were reserving space and somehow not releasing it afterwards.
> Even when we run the OS commands df or du, the space seems to be reported wrong.
>
> The configuration we're using is as follows:
>
> ##sink
> p13nAgent.sinks.hdfsSink.type=hdfs
> p13nAgent.sinks.hdfsSink.hdfs.minBlockReplicas=1
> p13nAgent.sinks.hdfsSink.channel=mainChannel
> p13nAgent.sinks.hdfsSink.hdfs.fileType=DataStream
> p13nAgent.sinks.hdfsSink.hdfs.filePrefix=$HOST_PREFIX
> p13nAgent.sinks.hdfsSink.hdfs.fileSuffix=.avro
> p13nAgent.sinks.hdfsSink.hdfs.path=$HDFS_PATH/p13n-storage/$ENV/%{topic}/%Y/%m/%d
>
> # The roll size must be 297M because the snappy compression rate is approx. 43%
> p13nAgent.sinks.hdfsSink.hdfs.rollSize	=	312134251
> p13nAgent.sinks.hdfsSink.hdfs.rollCount	=	0
> p13nAgent.sinks.hdfsSink.hdfs.rollInterval	=	0
> p13nAgent.sinks.hdfsSink.hdfs.idleTimeout	=	300
> p13nAgent.sinks.hdfsSink.hdfs.maxOpenFiles	=	1000
> #p13nAgent.sinks.hdfsSink.hdfs.round	=	true
> #p13nAgent.sinks.hdfsSink.hdfs.roundUnit	=	hour
> #p13nAgent.sinks.hdfsSink.hdfs.roundValue	=	24
> #p13nAgent.sinks.hdfsSink.hdfs.writeFormat=Text
> p13nAgent.sinks.hdfsSink.hdfs.batchSize=5000
> p13nAgent.sinks.hdfsSink.serializer=com.despegar.p13n.flume.avro.serializer.FlumeAvroEventSerializer$Builder
> p13nAgent.sinks.hdfsSink.serializer.compressionCodec=snappy
>
>
> Any clue will be welcomed.
>
> Thanks in advance,
>
> -carlos.
>
>
