flume-user mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: Issues with .tmp files in HDFS
Date Thu, 06 Oct 2011 21:12:46 GMT
Jonathan:

You can also use a path filter.

http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/fs/FileSystem.html#listStatus(org.apache.hadoop.fs.Path, org.apache.hadoop.fs.PathFilter)

In fact, you should; otherwise you could end up moving files you aren't
aware of (like the .tmp files).
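
A minimal sketch of the name test such a filter might apply, assuming the
in-progress files are exactly the ones whose names end in ".tmp". The class
and method names here (TmpFilterSketch, isCompleted, completedOnly) are
hypothetical and exist only for illustration. The Hadoop wiring is left in a
comment so that the snippet runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the class and method names are illustrative only.
final class TmpFilterSketch {

    // Assumption: a file still being written carries a ".tmp" suffix.
    // Returns true when the name does not end in ".tmp".
    static boolean isCompleted(String fileName) {
        return !fileName.endsWith(".tmp");
    }

    // Keeps only the names that pass isCompleted, preserving their order.
    static List<String> completedOnly(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (isCompleted(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    // With Hadoop available, the same test would sit inside a PathFilter
    // passed to FileSystem.listStatus(Path, PathFilter), roughly:
    //
    //   FileStatus[] ready = fs.listStatus(dir, new PathFilter() {
    //       public boolean accept(Path p) { return isCompleted(p.getName()); }
    //   });
    //
    // Here fs and dir stand for your FileSystem and the Flume output directory.

    public static void main(String[] args) {
        // Made-up file names, used only to exercise the filter.
        List<String> names = Arrays.asList("events-0001", "events-0002.tmp");
        System.out.println(completedOnly(names));
    }
}
```

The cron job would then move only the entries that listStatus returns, so
files the collector is still writing stay where they are.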

On Wed, Oct 5, 2011 at 10:00 AM, Jonathan <jonny2112@gmail.com> wrote:

> Hey experts,
>
> I have flume writing to a directory in hdfs. I then fire off a cron job to
> move that data into hive every five minutes. The problem that I am having is
> that the .tmp files are also moved and start causing errors on the collector
> that is writing the files to hdfs. Is there any way to get rid of the .tmp
> files or to have them in a different directory than the other files? Any
> other suggestions on how I can work around this issue?
>
> Jonathan
>



-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com
