[sending to flume-user@incubator.apache.org, bcc flume-dev@cloudera.org]


How are you sending data from tail to mongo?  My guess is that you have an agent set up in E2E mode and a collector whose mongo sink isn't wrapped in a collector sink.

agent:
source: tail
sink:   agentSink(xxxx)

collector:
source: collectorSource
sink:   mongo

If this is the case, you need to wrap your mongo sink with a collector sink so that acks get sent back to tell the agent to stop resending data.

collector's sink should be: collector(30000) { mongoSink() } 
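
To illustrate why missing acks would give exactly the count you report, here is a rough sketch in Python. It is a toy model, not Flume code: it assumes the agent resends every event it has not seen acked each time it sends, and the names simulate, pending and stored are made up for this example.

```python
def simulate(minutes, collector_acks):
    """Return how many records end up stored after `minutes` ticks.

    Toy model (an assumption, not Flume's actual implementation):
    one new event per minute, and on every send the agent also
    resends all events that have not been acked yet.
    """
    pending = []  # events the agent has sent but not seen acked
    stored = []   # what ends up in the mongo collection
    for minute in range(1, minutes + 1):
        pending.append(f"record-{minute}")  # one new log line per minute
        stored.extend(pending)              # send new event plus all unacked ones
        if collector_acks:
            pending.clear()                 # ack received: nothing left to resend
    return len(stored)


print(simulate(5, collector_acks=False))  # 15
print(simulate(5, collector_acks=True))   # 5
```

Without acks the agent in this model delivers 1 + 2 + 3 + 4 + 5 = 15 records; once the collector acks each delivery, only the 5 distinct records are stored.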

On Mon, Sep 5, 2011 at 11:02 PM, metamoi <metamoi@gmail.com> wrote:
I use the following command:
tail("/var/log/flume/test.log", startFromEnd="true")

On Sep 6, 2:58 pm, metamoi <meta...@gmail.com> wrote:
> I set up an agent that sends one new record per minute.
> After five minutes, the agent had sent five records to a collector, which
> stores the data in mongodb.
> I expected five records in the collection (the equivalent of a table in
> mysql) in mongodb.
> But there are 15 records in it.
> After the first minute there is only one record.
> Then, after two minutes, the agent sends another new record, but two
> records arrive, including the first one again.
> So there are three records in the mongodb collection.
> In the same way, after five minutes, five records arrive, including the
> previous four.
> In total, 1 + 2 + 3 + 4 + 5 = 15 records are stored in the db.
> Is this a bug in flume?
> Has anyone else run into this kind of problem?
> Thanks in advance.

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com