flume-user mailing list archives

From Jayant Shekhar <jay...@cloudera.com>
Subject Re: how flume identifies a file transfer is complete or not
Date Fri, 25 Jul 2014 21:53:39 GMT
Hi Anand,

+1 to Natty.

Also, if you are looking to move files into HDFS, check out Spooling
Directory Source.
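
A minimal sketch of what that could look like, reusing the agent and source names from Anand's config. The channel and sink names, the memory channel, and the spool directory /home/haas/spool are assumptions made for illustration; the HDFS path is the one from the `hadoop fs -ls` command in the original question.

```properties
# Hypothetical component names; only myAgent and mySource come from the thread.
myAgent.sources = mySource
myAgent.channels = myChannel
myAgent.sinks = mySink

# Spooling Directory Source: watches a directory for new files.
# /home/haas/spool is an assumed location, not from the original message.
myAgent.sources.mySource.type = spooldir
myAgent.sources.mySource.spoolDir = /home/haas/spool
myAgent.sources.mySource.channels = myChannel

# In-memory channel, chosen here only to keep the sketch short.
myAgent.channels.myChannel.type = memory

# HDFS sink writing to the directory Anand was checking by hand.
myAgent.sinks.mySink.type = hdfs
myAgent.sinks.mySink.hdfs.path = /user/flumedata/
myAgent.sinks.mySink.channel = myChannel
```

With the default settings, the source renames each file with a .COMPLETED suffix once it has been fully read into the channel, so listing the spool directory shows which files Flume has finished reading. Files should not be modified after they are placed in the spool directory.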



On Fri, Jul 25, 2014 at 10:52 AM, Jonathan Natkins <natty@streamsets.com>
wrote:

> Hi Anand,
> What you're doing is a slightly odd way to use Flume. With the exec
> source, Flume will execute that command, and consume the output as events.
> Often the exec source is used to tail -F a file, which allows you to pipe
> more data to the file and ingest additional events. By using cat, Flume
> will cat the file, but then the source will become useless, because the
> command will have finished, and there's no way that I'm aware of to get an
> agent to start a new command. By using tail -F, the command persists, and
> if you do `ps aux | grep flume`, you would see a running tail -F command.
> As for figuring out when the transfer is complete, I don't think there's a
> really good way other than checking the file itself, or looking to see if
> the cat command is still running.
> Does that help?
> Thanks,
> Natty
> On Thu, Jul 24, 2014 at 2:00 AM, Anandkumar Lakshmanan <anand@orzota.com>
> wrote:
>> Hi,
>> I am new to Flume.
>> I am using the exec source to cat a file into HDFS.
>> When I run it manually, I can see that the file has been transferred
>> completely, but Flume is still in the running state.
>> How do I find out when the transfer is complete?
>> Example:
>> My flume.conf
>> myAgent.sources.mySource.type = exec
>> myAgent.sources.mySource.command = cat /home/haas/file2.txt
>> The only way I can check whether the transfer is complete is to run the
>> following command manually and compare the file sizes:
>> hadoop fs -ls /user/flumedata/
>> Is there a way to know when the transfer has completed?
>> Thanks.
>> Anand
