flume-user mailing list archives

From Zijad Purkovic <zijadpurko...@gmail.com>
Subject Re: Collector Sink writing to HDFS
Date Thu, 19 Jan 2012 21:54:46 GMT
If you're using the default flume-site.xml, the collector will open a new
file for writing to HDFS every 30 seconds. So if your file takes longer than
that to read, send to the collector, acknowledge and write to HDFS, you're
going to end up with more than one file on HDFS.
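
If you want fewer, larger files, you can raise the roll interval. Here is a rough sketch of the override in flume-site.xml. It assumes the Flume 0.9.x property name flume.collector.roll.millis (value in milliseconds, default 30000); the 300000 value is only an illustration, so check the user guide for your version before relying on it:

```xml
<!-- flume-site.xml on the collector node -->
<!-- Assumption: property name as in Flume 0.9.x; 300000 ms (5 min) is an example value -->
<configuration>
  <property>
    <name>flume.collector.roll.millis</name>
    <value>300000</value>
  </property>
</configuration>
```

As far as I know, collectorSink also takes an optional third argument with the roll interval in milliseconds, e.g. collectorSink("/tmp/bb/%H00/", "%{host}-", 300000), which would apply to that one sink only. Please verify this against your version as well.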

On Thu, Jan 19, 2012 at 10:08 PM, Gaurav Khanna <khannapost@yahoo.com> wrote:
> Hi,
> A newbie question - perhaps it has already been answered, so I apologize in
> advance in that case.
> collectorSink("/tmp/bb/%H00/", "%{host}-")
> Used the above collector sink to read one file (around 78 M) and it wrote 4
> files into the /tmp/bb folder. I was wondering why that was done by the
> collector sink and what is the rationale behind that?
> Thanks
> Gaurav
> Gaurav Khanna

Zijad Purković
