flume-user mailing list archives

From Mingjie Lai <mjla...@gmail.com>
Subject Re: Use metadata field for output bucketing
Date Tue, 08 Nov 2011 18:20:09 GMT

As I said, you need to use escapedFormatDfs. For your case, it should be:

collector(3600000){ 
escapedFormatDfs("s3n://mybucket/flume/%{source}/%Y-%m-%d", "log-%{host}-")}

According to the user guide, escapedFormatDfs expands escape sequences 
such as %{source} in the path.

... The hdfspath can use escape sequences documented to bucket data as 
documented in the Output Bucketing section...
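
To make the bucketing idea concrete, here is a rough Python sketch of how a 
path template with escape sequences could expand per event. It is only an 
illustration, not Flume's actual code: the function name expand_escapes, the 
attribute values "web" and "node1", and the treatment of host as an ordinary 
attribute are all assumptions made for the example.

```python
import re
from datetime import datetime, timezone


def expand_escapes(template, attrs, timestamp):
    """Illustrative only (not Flume's implementation).

    Replaces %{name} with the matching event attribute and
    %Y / %m / %d with the year, month and day of the timestamp.
    """
    def replace(match):
        if match.group(1) is not None:
            # %{name}: look the name up in the event attributes
            return str(attrs[match.group(1)])
        # %Y, %m or %d: format that single date field
        return timestamp.strftime("%" + match.group(2))

    return re.sub(r"%\{(\w+)\}|%([Ymd])", replace, template)


# Hypothetical event: attribute values are made up for the example.
event_attrs = {"source": "web", "host": "node1"}
event_time = datetime(2011, 11, 8, 18, 20, 9, tzinfo=timezone.utc)

path = expand_escapes("s3n://mybucket/flume/%{source}/%Y-%m-%d",
                      event_attrs, event_time)
prefix = expand_escapes("log-%{host}-", event_attrs, event_time)

print(path)    # s3n://mybucket/flume/web/2011-11-08
print(prefix)  # log-node1-
```

With this kind of expansion, events carrying different "source" values would 
land in different directories under the same day.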

I haven't tried S3 as a sink myself, but it should work. Can you give it a try?


On 11/08/2011 12:44 AM, Shuang wrote:
> Thanks, Mingjie. In my case, I already have the "source" field in event
> metadata, does that mean I can do the following directly?
> collectorSink("s3n://mybucket/flume/%{source}/%Y-%m-%d/\",
> \"log-%{host}-\", 3600000)
>
> basically, use %{source} to refer to that metadata field in the S3 path?
>
> Shuang
>
> On Fri, Nov 4, 2011 at 5:02 PM, Mingjie Lai <mjlai09@gmail.com
> <mailto:mjlai09@gmail.com>> wrote:
>
>
>     You can try escapedFormatDfs. Here is an example:
>
>     $ bin/flume node_nowatch -n f1 -c 'f1: text("/tmp/aa.txt") |
>     value("src", 123)
>     collector(2000){escapedFormatDfs("file:///tmp", "aaaa-%{src}")};'
>
>     $ ls /tmp
>     aaaa-123
>     ...
>
>     Should also work for s3.
>
>     Mingjie
>
>
>     On 11/04/2011 03:17 PM, Shuang wrote:
>
>         Hi, guys,
>            After reading the Flume User Guide, I thought this is
>         possible, but
>         would like to confirm with you guys. Currently, I have collectors
>         configured as:
>         collectorSink("s3n://mybucket/flume/%Y-%m-%d/\",\"log-%{host}-\",
>         3600000),
>         and I have a field called "source" in the Flume event's metadata
>         table, and
>         would like to use it in the collectorSink path, something like this:
>         collectorSink("s3n://mybucket/flume/%{metadata['source']}/%Y-%m-%d/\",\"log-%{host}-\",
>         3600000),
>
>         I wonder what's the right syntax to refer to a field in the metadata
>         table. I searched the user guide and couldn't find any example.
>         Also I'd
>         like to point out, this is kind of similar to how Scribe's message
>         category is used.
>
>         Shuang
>
>
