Hello Jeff, 

   Thanks for the reply.  My use case is not really special.  We have multiple products, and each product emits traditional log messages on different servers.  I would like to stream those into HDFS.  The logs are generally in Apache or log4j format.
   So, I have many sources from which I want to stream the logs into HDFS.  I can have a channel/collector machine where I install Flume.  I guess my question is: do I need to install Flume on the servers where the log messages live, and do I need to install Flume on the HDFS NameNode too?
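   To make it concrete, here is the kind of two-tier setup I am picturing (the hostnames, paths, and agent names below are just my guesses, not from any real config):

    # Agent on each log server: tail the log file and forward over Avro
    agent1.sources = taillog
    agent1.channels = mem
    agent1.sinks = forward

    agent1.sources.taillog.type = exec
    agent1.sources.taillog.command = tail -F /var/log/myapp/app.log
    agent1.sources.taillog.channels = mem

    agent1.channels.mem.type = memory

    agent1.sinks.forward.type = avro
    agent1.sinks.forward.hostname = collector.example.com
    agent1.sinks.forward.port = 4141
    agent1.sinks.forward.channel = mem

    # Agent on the collector machine: receive Avro events, write to HDFS
    collector.sources = avroin
    collector.channels = mem
    collector.sinks = tohdfs

    collector.sources.avroin.type = avro
    collector.sources.avroin.bind = 0.0.0.0
    collector.sources.avroin.port = 4141
    collector.sources.avroin.channels = mem

    collector.channels.mem.type = memory

    collector.sinks.tohdfs.type = hdfs
    collector.sinks.tohdfs.hdfs.path = hdfs://namenode.example.com:8020/flume/logs
    collector.sinks.tohdfs.channel = mem

   In other words, is it enough for the collector's HDFS sink to point at the NameNode over the network, or does Flume have to be installed on the NameNode itself?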

Thanks,
- Seshu  

On Wed, Feb 6, 2013 at 7:47 PM, Jeff Lord <jlord@cloudera.com> wrote:
Seshu,

It really is going to depend on your use case.
That said, it sounds like you may need to run an agent on each of the source machines.
Which source do you plan to use? It may also be the case that you can use the Flume RPC client to write data directly from your application to the Flume collector machine:

http://flume.apache.org/FlumeDeveloperGuide.html#rpc-client-interface
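
Roughly, the client side would look something like the sketch below (the hostname and port are placeholders for wherever your collector's Avro source is listening):

    import java.nio.charset.Charset;

    import org.apache.flume.Event;
    import org.apache.flume.api.RpcClient;
    import org.apache.flume.api.RpcClientFactory;
    import org.apache.flume.event.EventBuilder;

    public class FlumeRpcExample {
        public static void main(String[] args) throws Exception {
            // Connect to the Avro source on the collector agent
            // (hostname/port are placeholders for your deployment).
            RpcClient client =
                RpcClientFactory.getDefaultInstance("collector.example.com", 4141);
            try {
                // Wrap one log line in a Flume event and ship it;
                // append() throws EventDeliveryException on failure.
                Event event = EventBuilder.withBody("sample log line",
                        Charset.forName("UTF-8"));
                client.append(event);
            } finally {
                client.close();
            }
        }
    }

That way you skip the per-server agents entirely, at the cost of tying your application to the Flume client API.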

-Jeff


On Wed, Feb 6, 2013 at 4:49 PM, Seshu V <sesh12@gmail.com> wrote:
Hi All,

    I used Flume 0.9.3 a while back, and it worked fine at that time.  Now I am looking at Flume NG and started reading the documentation today.  In Flume 0.9.3, I installed Flume agents on the servers where the data sources were, and I had a separate collector machine.  My sink was HDFS.  I see that Flume NG uses channels.
    My question is this: I have multiple source servers and my sink is HDFS, and I also have another machine for the channel (the collector in the old days).  Do I need to install Flume NG on all the source machines as well as the channel machine?  Or can I install Flume NG only on the channel server and (somehow) specify in the configuration to pull data from the source machines, with the sink set to HDFS?
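    From the documentation, each NG agent wires a source to a sink through a channel, roughly like this minimal example (the names and component types here are just illustrative):

    agent.sources = src
    agent.channels = ch
    agent.sinks = snk

    # Listen for text lines on a TCP port
    agent.sources.src.type = netcat
    agent.sources.src.bind = localhost
    agent.sources.src.port = 44444
    agent.sources.src.channels = ch

    # Buffer events in memory between source and sink
    agent.channels.ch.type = memory

    # Log events for testing
    agent.sinks.snk.type = logger
    agent.sinks.snk.channel = ch

    What I cannot tell yet is where each such agent is supposed to run.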
     Thanks in advance for your replies.

Thanks,
- Seshu