From: Kamal Bahadur
Date: Mon, 17 Oct 2011 21:58:53 -0700
Subject: Re: Cassandra Sink using Hector
To: flume-user@incubator.apache.org

Hi Dani,

Thanks for the reply. I am using E2E reliability mode. If I spawn a new thread for each append call, I am not sure the acks will be handled properly. I might lose an event if the child thread ends in an exception. Do you have any suggestions for my use case?

With the current setup, I am able to write only 500 events per second. The expected event rate is over 2000 per second. I tried increasing the number of collectors and it seems to help. Is this my only option?

Thanks,
Kamal

On Mon, Oct 17, 2011 at 4:42 PM, Dani Rayan wrote:

> Hey Kamal,
>
> You are correct. The append method would not spawn new threads by itself.
> However, you can still override it.
>
> On Mon, Oct 17, 2011 at 1:58 PM, Kamal Bahadur wrote:
>
>> Hi,
>>
>> I have written a sink for writing data into Cassandra using the Hector API.
>> It looks like Hector does a great job of connection pooling and load
>> balancing. As soon as I start the collector, I can see 16 connections being
>> established between the collector and Cassandra. I am not sure if Flume is
>> taking advantage of those connections in the pool. I am assuming that the
>> collector's append method is not multi-threaded and therefore only one
>> connection is being used at any point in time. Can someone confirm this?
>>
>> Thanks,
>> Kamal
>
> --
> -Dani Abel Rayan

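The worry raised above, that an event could be lost if a child thread dies with an exception, can be illustrated with a minimal sketch. It assumes the sink hands a batch of events to a fixed thread pool and waits for every write before returning, so a failure in any worker reaches the caller instead of disappearing. `EventWriter` and `ParallelBatchWriter` are hypothetical names introduced only for this illustration; they are not Flume or Hector APIs, and the sketch does not show how Flume's E2E ack path would be wired in.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for the Hector-backed write of a single event. */
interface EventWriter {
    void write(String event) throws IOException;
}

/** Writes a batch on a fixed pool and reports any worker failure to the caller. */
class ParallelBatchWriter {
    private final ExecutorService pool;
    private final EventWriter writer;

    ParallelBatchWriter(int threads, EventWriter writer) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.writer = writer;
    }

    /** Blocks until every event has been attempted; throws if any write failed. */
    void writeBatch(List<String> events) throws IOException {
        List<Callable<Void>> tasks = new ArrayList<>();
        for (String event : events) {
            tasks.add(() -> {
                writer.write(event);
                return null;
            });
        }
        try {
            // invokeAll returns only after all submitted tasks have completed.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                // get() rethrows a worker's exception wrapped in ExecutionException.
                result.get();
            }
        } catch (ExecutionException e) {
            throw new IOException("batch write failed; do not ack this batch", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for batch", e);
        }
    }

    /** Stops accepting new work; already submitted tasks still finish. */
    void close() {
        pool.shutdown();
    }
}
```

Because `writeBatch` only returns normally once all writes have succeeded, the calling code can treat a thrown `IOException` as "this batch was not fully written" and let the upstream retry logic handle it. Whether this fits Flume's E2E mode depends on how acks are generated for the batch, which is not covered here.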