flume-user mailing list archives

From Justin Workman <justinjwork...@gmail.com>
Subject hdfs.idleTime
Date Thu, 12 Jan 2017 19:20:14 GMT
Sorry for cross-posting to user and dev. I have recently set up a Flume
configuration where we use the regex_extractor interceptor to parse
the actual event date from each record flowing through the Flume source,
and then use that date to build the HDFS sink bucket path. However, it
appears that the hdfs.idleTimeout value is not honored in this
configuration. It does work when using the timestamp interceptor to build
the output path.
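
For illustration, an interceptor setup of the kind described here might look
roughly like the following. This is a minimal sketch rather than the exact
configuration in use: the agent, source, and interceptor names (a1, r1, i1),
the regex, and the date pattern are all assumptions, and it assumes the
extracted date is stored in the "timestamp" header so the HDFS sink's time
escapes can pick it up.

```properties
# Hypothetical names: agent a1, source r1, interceptor i1.
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
# Assumed record layout: each event body starts with "yyyy-MM-dd HH:mm:ss".
a1.sources.r1.interceptors.i1.regex = ^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})
a1.sources.r1.interceptors.i1.serializers = s1
# Convert the captured date to epoch milliseconds and store it in the
# "timestamp" header, which the HDFS sink uses to resolve %Y, %m, %d, etc.
a1.sources.r1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
a1.sources.r1.interceptors.i1.serializers.s1.name = timestamp
a1.sources.r1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm:ss
```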

I have set the hdfs.idleTimeout value for the HDFS sink, but the files are
never closed or renamed until I restart or shut down Flume. Our Flume agent is
configured to roll based on size or output path, and the files
close, rename, and roll fine based on size. However, the last file in each
output path is always left with the .tmp extension until we restart Flume. I
would expect the file to be closed and renamed once no records have been
written to it for the idleTimeout period.
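
As a point of reference, the sink side of such a setup could be sketched as
below. The agent and sink names (a1, k1), the path, and the numeric values are
assumptions chosen for illustration, not the actual settings. The sketch shows
size-based rolling together with hdfs.idleTimeout, which is given in seconds
and is disabled when set to 0 (the default).

```properties
# Hypothetical names and values: agent a1, sink k1.
a1.sinks.k1.type = hdfs
# Bucket path built from the event's "timestamp" header (assumed layout).
a1.sinks.k1.hdfs.path = /flume/events/%Y-%m-%d
# Roll on size only: disable time-based and count-based rolling.
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollCount = 0
# Close a bucket's open file after 300 seconds without writes.
a1.sinks.k1.hdfs.idleTimeout = 300
```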

Could I be missing something, or is this a known bug with the regex_extractor
interceptor?

Thanks
Justin
