flume-user mailing list archives

From Hari Shreedharan <hshreedha...@cloudera.com>
Subject Re: Flafka: how to differentiate the unfinished .tmp file and the abandoned one
Date Thu, 09 Jul 2015 01:10:31 GMT
If you stop the agent with a normal kill (not a kill -9), the .tmp files
will be renamed (we wait a while for the rename to complete), so this should
not happen. But if you do a kill -9, there is not a whole lot we can do on
the Flume side. If you notice a file that has not been written to for a
while after a restart, just rename it via the hdfs command.
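
The manual cleanup described above could be scripted roughly as follows. This is only a sketch under stated assumptions, not something Flume provides: the helper names (stale_tmp_renames, mv_command), the example paths, and the idle threshold are all hypothetical, and the caller has to supply a trustworthy "last written" timestamp for each file. The only real command assumed is `hdfs dfs -mv`, which renames a path in HDFS.

```python
from typing import Iterable, List, Tuple

# In-use suffix discussed in this thread; adjust if the sink is configured differently.
IN_USE_SUFFIX = ".tmp"


def stale_tmp_renames(
    entries: Iterable[Tuple[str, float]],
    now: float,
    idle_seconds: float,
) -> List[Tuple[str, str]]:
    """Return (source, destination) pairs for .tmp files idle for at least idle_seconds.

    entries holds (path, last_write_epoch_seconds) pairs. How the last-write
    time is obtained is left to the caller (hypothetical input).
    """
    renames = []
    for path, last_write in entries:
        if not path.endswith(IN_USE_SUFFIX):
            continue  # already finalized, nothing to do
        if now - last_write < idle_seconds:
            continue  # written recently, may still be in use
        # Destination is the same path with the in-use suffix stripped.
        renames.append((path, path[: -len(IN_USE_SUFFIX)]))
    return renames


def mv_command(src: str, dst: str) -> List[str]:
    """Build the argument list for renaming a file with the hdfs CLI."""
    return ["hdfs", "dfs", "-mv", src, dst]
```

The resulting argument list can be passed to subprocess.run once the candidates have been checked by hand. Choose the idle threshold conservatively, well above any roll interval configured on the sink, so that a file the restarted agent is still writing is never renamed.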

On Wednesday, July 8, 2015, Jun MA <mj.saber1990@gmail.com> wrote:

> Hello Community,
> I’m using Flafka (Kafka channel and HDFS sink). I have run into an awkward
> problem: I don’t know how to determine whether a .tmp file is still being
> written or has been abandoned. While the sink is writing events to a file,
> the file has a .tmp suffix, but if the agent goes down (control + d) while
> writing to that file, the file is not renamed and keeps the .tmp suffix.
> When the agent restarts, it does nothing with that .tmp file. But the
> events in that .tmp file are not redundant, because on the Kafka channel
> side the offset has already been committed.
> So my question is: is there a way to differentiate an in-progress .tmp
> file from an abandoned one?
> Thanks,
> Jun


