From Jeff Hansen <dsche...@gmail.com>
Subject Fwd: Flumes DSL syntax
Date Thu, 08 Sep 2011 16:11:23 GMT
Is there any clear, concise, and up-to-date documentation on the full
DSL syntax?  It seems a lot of things are deprecated but still work,
and as a result you see a lot of warning messages saying you should be
using the new syntax (without mentioning what it is...).
I'm hitting a lot of problems with the syntax -- mainly for two reasons.
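1. It's not consistent from one command to the next -- exec config
takes separate arguments (exec config token (optional token) token
token), whereas exec multiconfig takes a single quoted string (exec
multiconfig 'token:token|token').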
2. The syntax is largely described in Extended Backus-Naur Form (EBNF)
-- using metacharacters that are also part of the DSL grammar (without
clearly distinguishing whether they're intended as part of the DSL
grammar or simply as part of the EBNF description of it).
3. (OK, I know I said "mainly for two reasons", but this third one
contributes as well...) Quotes.  The DSL grammar requires quotes in a
number of places; however, many of the commands also require that some
string of the DSL be wrapped in quotes -- and sometimes you have to
wrap the whole command in quotes on top of that.  Imagine the following:

    $ flume shell -c host -e "exec multiconfig 'node : source | decorator("
-- doh! If only I had a triple quote character...

    $ flume shell -c host -e "exec config node source 'decorator("
-- doh! that won't work either, guess I'll have to put this in a script
file and use -s...
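
A possible way around the nested quotes, as a minimal sketch in plain
POSIX shell: either write the command to a script file with a quoted
here-document (nothing inside it needs escaping), or keep it on one line
by using '\'' to embed a literal single quote in a single-quoted string.
The node, source, and decorator("arg") text is a placeholder in the
spirit of the examples above, not a tested Flume configuration, and the
flume invocations are left commented out because they assume only the
-c, -e, and -s flags mentioned above.

```shell
#!/bin/sh
# Sketch only: the DSL text below is a placeholder, not a tested config.

# Option 1: a quoted here-document ('EOF') is passed through verbatim, so
# the DSL's single and double quotes need no escaping at all.
script=$(mktemp)
cat > "$script" <<'EOF'
exec multiconfig 'node : source | decorator("arg")'
EOF

# Option 2: stay on one line. Close the single-quoted string, add an
# escaped quote (\'), and reopen it; double quotes inside stay literal.
cmd='exec multiconfig '\''node : source | decorator("arg")'\'

# Both forms should carry exactly the same text.
if [ "$(cat "$script")" = "$cmd" ]; then
    printf 'same command text: %s\n' "$cmd"
fi

# Assumed usage, based only on the flags mentioned above:
#   flume shell -c host -s "$script"
#   flume shell -c host -e "$cmd"
```

The here-document form is probably the easier one to maintain, since the
DSL text appears exactly as it would be typed inside the Flume shell.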

Here's a nice example of reason number 2 straight from the
CollectorSink class.  First I got this warning message:

    Deprecated syntax: Expected a format spec but instead had a (String) avrojson

I understand it's safe to ignore this warning, but I wanted to know
what the appropriate new syntax was (because I couldn't figure it out
from the documentation) so I tried various alternatives.  Here's an
error message that I thought was somewhat helpful:

    usage: collectorSink[(dfsdir,path[,rollmillis], format)]

Except that 'collectorSink[("hdfs://namenode/flume","helloworld",1,avrojson)]'
throws a syntax error (something about a lexer error at token '(' --
so not very helpful).  I realized my error though -- the brackets in
the usage message were not meant as literal syntax, but rather to
indicate that the entire argument clause is optional.  By the way, the
inner brackets suggest rollmillis is optional too, but I've had no luck
leaving it off -- it always says I need that numeric value.

Please don't get me wrong -- I'm liking Flume; I just wish it
weren't such a matter of trial and error to figure out how to use it.
Is there any good go-to guide on basic DSL syntax?  Is the ANTLR
grammar file really the closest thing there is?  Even that isn't
really complete, because not everything is parsed with ANTLR.


