flume-user mailing list archives

From Chetan Sarva <csa...@evidon.com>
Subject Re: Issues with .tmp files in HDFS
Date Thu, 06 Oct 2011 20:02:51 GMT
I'm not sure why, but we don't get any .tmp files when writing to HDFS.
Which version of Flume are you using? Either way, you could do something
like this:

hadoop fs -ls /collect/ | awk 'NF >= 8 {print $8}' | grep -v '\.tmp$' | \
  xargs -I {} hadoop fs -mv {} /hive/

It might be a bit slower, since it runs one mv per file, but it should work.
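
If you would rather have this in the cron script itself, here is a rough sketch of the same idea. The function names are made up for illustration, and it assumes the path is the 8th column of the `hadoop fs -ls` output (as in the one-liner above) and that file names contain no spaces.

```shell
#!/bin/sh
# Sketch only: move finished files out of the Flume landing directory,
# leaving in-progress .tmp files alone. The directories mirror the ones
# used in the one-liner; adjust them for your setup.
SRC_DIR="/collect/"
DEST_DIR="/hive/"

# Read `hadoop fs -ls` output on stdin and print the path column for
# every entry that does not end in .tmp. The "Found N items" header has
# fewer than 8 fields, so it is skipped as well.
list_completed() {
    awk 'NF >= 8 && $8 !~ /\.tmp$/ { print $8 }'
}

# Move each finished file with its own `hadoop fs -mv` call.
move_completed() {
    hadoop fs -ls "$SRC_DIR" | list_completed | while read -r path; do
        hadoop fs -mv "$path" "$DEST_DIR"
    done
}
```

Calling `move_completed` at the end of the script would do the actual move. Keeping the filter in its own function means you can check it against a saved listing before pointing it at the live directory.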

On Wed, Oct 5, 2011 at 1:00 PM, Jonathan <jonny2112@gmail.com> wrote:

> Hey experts,
> I have flume writing to a directory in hdfs. I then fire off a cron job to
> move that data into hive every five minutes. The problem that I am having is
> that the .tmp files are also moved and start causing errors on the collector
> that is writing the files to hdfs. Is there any way to get rid of the .tmp
> files or to have them in a different directory than the other files? Any
> other suggestions on how I can work around this issue?
> Jonathan
