flume-user mailing list archives

From Mike Percy <mpe...@apache.org>
Subject Re: .tmp in hdfs sink
Date Tue, 20 Nov 2012 19:16:41 GMT
FLUME-1660 is now committed and will be in 1.3.0. If you are using 1.2.0,
I suggest running with hdfs.rollInterval set so that the files will roll
normally.
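
For reference, a minimal sketch of what that could look like in the agent
configuration. The agent and sink names (agent1, hdfsSink), the path, and the
values are illustrative assumptions, not taken from this thread. The
hdfs.idleTimeout line reflects my understanding of the option FLUME-1660 adds
in 1.3.0; check the user guide for your version before relying on it.

```properties
# Hypothetical names: agent1, hdfsSink. Path and values are examples only.
agent1.sinks.hdfsSink.type = hdfs
# Hourly buckets in the YY/MM/DD/HH layout described in this thread.
agent1.sinks.hdfsSink.hdfs.path = /flume/events/%y/%m/%d/%H
# Roll the open file every 600 seconds, regardless of size or event count.
agent1.sinks.hdfsSink.hdfs.rollInterval = 600
# Assumed to require 1.3.0 or later (FLUME-1660): close a file that has
# received no events for 60 seconds; 0 disables idle closing.
agent1.sinks.hdfsSink.hdfs.idleTimeout = 60
```

With a time-based roll, the previous hour's file should be closed and renamed
within one roll interval of its last write, rather than waiting on size or
event-count limits.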


On Thu, Nov 15, 2012 at 11:23 PM, Juhani Connolly <
juhani_connolly@cyberagent.co.jp> wrote:

>  I am actually working on a patch for exactly this; see FLUME-1660.
> The patch is on Review Board right now. I fixed a corner-case issue that
> came up in unit testing, but the implementation is not really to my
> satisfaction. If you are interested, please have a look and add your opinion.
> https://issues.apache.org/jira/browse/FLUME-1660
> https://reviews.apache.org/r/7659/
> On 11/16/2012 01:16 PM, Mohit Anchlia wrote:
> Another question I had was about rollover. What's the best way to roll over
> files in a reasonable timeframe? For instance, our path is YY/MM/DD/HH, so
> every hour there is a new file, and the previous hour's file just sits there
> as .tmp. Sometimes it takes as long as an hour before the .tmp file is closed
> and renamed to .snappy. In this situation, is there a way to tell Flume to
> roll over files sooner, based on some idle time limit?
> On Thu, Nov 15, 2012 at 8:14 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>> Thanks Mike, that makes sense. Is there any way I can help?
>> On Thu, Nov 15, 2012 at 11:54 AM, Mike Percy <mpercy@apache.org> wrote:
>>> Hi Mohit, this is a complicated issue. I've filed
>>> https://issues.apache.org/jira/browse/FLUME-1714 to track it.
>>>  In short, it would require a non-trivial amount of work to implement
>>> this, and it would need to be done carefully. I agree that it would be
>>> better if Flume handled this case more gracefully than it does today.
>>> Today, Flume assumes that you have some job that would go and clean up the
>>> .tmp files as needed, and that you understand that they could be partially
>>> written if a crash occurred.
>>>  Regards,
>>> Mike
>>> On Sun, Nov 11, 2012 at 8:32 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>>>> What we are seeing is that if Flume gets killed, whether because of a
>>>> server failure or for other reasons, it leaves the .tmp file behind.
>>>> Sometimes, for whatever reason, the .tmp file is not readable. Is there a
>>>> way to roll over the .tmp file more gracefully?
