You can definitely do this, in one of two ways. The first is to define two logical nodes and map both onto a single physical node. Syntactically, that looks like this.

ln1 : tail(...) | someSink(...);
ln2 : tail(...) | someOtherSink(...);

map physicalNode ln1;
map physicalNode ln2;

Note that ln1 and ln2 can have different (or the same) sources and/or sinks.

The other, more efficient way of getting the same result is to use a fan-out sink, which sends a copy of each event to every sink in the list.

ln1 : tail(...) | [ sink1(...), sink2(...) ] ;

This sends the same data to both sink1 and sink2 while tailing the file only once. Check out the user guide for exact syntax examples, but you get the idea.
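
For the flow described in the question, a rough sketch might look like the following. It assumes the collectorSink("fsdir", "fsfileprefix", rollmillis) form from the 0.9.x user guide. The log path, namenode URI, output directories and file prefixes are made-up placeholders, and whatever sink the existing hourly flow already uses would stay as the first entry. Treat it as a starting point and verify the exact arguments against the guide.

ln1 : tail("/var/log/app.log") | [ collectorSink("hdfs://namenode/flume/app/", "hourly-", 3600000), collectorSink("file:///var/flume/monitor/", "minute-", 60000) ] ;

The roll intervals are in milliseconds: 3600000 is one hour and 60000 is one minute.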

On Wed, Sep 14, 2011 at 6:03 AM, Abhishek Pathak <abhishek.pathak.iitk@gmail.com> wrote:

Is it possible for two logical nodes on the same physical node to support different configurations?
I am specifically interested in creating a parallel flow in addition to the existing flow. The existing flow tails the log file and rolls a new file every hour in HDFS. I would like the new flow to tail the same file, but roll a new file every minute in the local filesystem. I intend to use this flow for real-time monitoring and error prediction in logs.
If there is a better way to achieve the same result, I am very interested.


Eric Sammer
twitter: esammer
data: www.cloudera.com