From user-return-492-apmail-phoenix-user-archive=phoenix.apache.org@phoenix.incubator.apache.org Sun Apr 6 21:08:46 2014 Return-Path: X-Original-To: apmail-phoenix-user-archive@minotaur.apache.org Delivered-To: apmail-phoenix-user-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 0B56E11519 for ; Sun, 6 Apr 2014 21:08:46 +0000 (UTC) Received: (qmail 63969 invoked by uid 500); 6 Apr 2014 21:08:45 -0000 Delivered-To: apmail-phoenix-user-archive@phoenix.apache.org Received: (qmail 63901 invoked by uid 500); 6 Apr 2014 21:08:45 -0000 Mailing-List: contact user-help@phoenix.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@phoenix.incubator.apache.org Delivered-To: mailing list user@phoenix.incubator.apache.org Received: (qmail 63893 invoked by uid 99); 6 Apr 2014 21:08:45 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 06 Apr 2014 21:08:45 +0000 X-ASF-Spam-Status: No, hits=-0.1 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_MED,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of amits@infolinks.com designates 207.126.144.133 as permitted sender) Received: from [207.126.144.133] (HELO mail-ig0-f171.google.com) (207.126.144.133) by apache.org (qpsmtpd/0.29) with SMTP; Sun, 06 Apr 2014 21:08:39 +0000 Received: from mail-ig0-f171.google.com ([209.85.213.171]) (using TLSv1) by eu1sys200aob112.postini.com ([207.126.147.11]) with SMTP ID DSNKU0HCQStF+jKXJAC6GzsuAywfO/EMOieJ@postini.com; Sun, 06 Apr 2014 21:08:18 UTC Received: by mail-ig0-f171.google.com with SMTP id c1so2902127igq.16 for ; Sun, 06 Apr 2014 14:08:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=Uo49XyptpNlnokvfyFojsXDcRAouH3uGbLP3kDO7BdI=; b=fC7RdY2yEnvR298juWE9cufbhmuAtaOdi2r6RYSuFD5eoF15A96oe2nq1vmR+Ao737 Qd3ZE4xOmmz88OgE7fP6X/8wsUq7sS20T3wGU3t92HGTF5P3LwLyhEOYTmQkU+JkBkQ3 +S1t7X4bOcEUQSSJ3tkgCT6gE5iBx4NVhEAiKJHIVMc4pf/wv9axHZl9GHhuV+osJIIw fd3SCXp2e9putiHmVU7aMdFdiD5Zb9esmxk+vCz1hUVs0yYMJqS6fKVtUEsI9k7QSJ0V +v3585xzEdr2ayIJOg3UdaV69Jzr0Wiaapsg433Ug9iCjoqNZoanKD3opS+wdqEwwdrj 8Elg== X-Gm-Message-State: ALoCoQlCJ+NttoOoq6vf6GkumgEajt5fEvqmVZP9D9cgoc/ZhYBFjjcsESM45+wjEkBYMvBhATwrrEB8Y+AKHsvUFOM6J3nSMQwss1pbJ7KVd6T/snN2v+e5IZ5kqAv32odnQm/fLr5Al5soBpq5VmNTRFLA8iTcEg== X-Received: by 10.50.119.229 with SMTP id kx5mr16306433igb.17.1396818496657; Sun, 06 Apr 2014 14:08:16 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.50.119.229 with SMTP id kx5mr16306424igb.17.1396818496528; Sun, 06 Apr 2014 14:08:16 -0700 (PDT) Received: by 10.64.225.41 with HTTP; Sun, 6 Apr 2014 14:08:16 -0700 (PDT) Received: by 10.64.225.41 with HTTP; Sun, 6 Apr 2014 14:08:16 -0700 (PDT) In-Reply-To: References: Date: Mon, 7 Apr 2014 00:08:16 +0300 Message-ID: Subject: Re: Best way to execute batch of queries From: Amit Sela To: user@phoenix.incubator.apache.org Content-Type: multipart/alternative; boundary=001a113494204d40ba04f666267e X-Virus-Checked: Checked by ClamAV on apache.org --001a113494204d40ba04f666267e Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable HBase 0.94.12 (+Hadoop 1.0.4) On Apr 6, 2014 11:35 PM, "James Taylor" wrote: > What version of HBase are you on? > > > On Sun, Apr 6, 2014 at 1:22 PM, Amit Sela wrote: > >> The CREATE TABLE statement is a bit long but it looks something like thi= s: >> *CREATE TABLE SCHEMA.TABLE (rowkey varchar not null, F1.C1 >> unsigned_long,F1.C2 unsigned_long,...,F1.Cn unsigned_long,F2.C1 >> unsigned_long,F2.C2 unsigned_long,...,F2.Cm unsigned_long CONSTRAINT pk >> PRIMARY KEY (rowkey)) COMPRESSION=3D'SNAPPY',VERSIONS=3D1,TTL=3D15811200= * >> Where n=3D24m and n+m ~ 10000. (daily buckets in one family and hourly i= n >> the other). >> The explain select is: >> *explain select >> h0_b5,h1_b5,h2_b5,h3_b5,h4_b5,h5_b5,h6_b5,h7_b5,h8_b5,h9_b5,h10_b5,h11_b= 5,h12_b5,h13_b5,h14_b5,h15_b5,h16_b5,h17_b5,h18_b5,h19_b5,h20_b5,h21_b5,h22= _b5,h23_b5,h0_b10,h1_b10,h2_b10,h3_b10,h4_b10,h5_b10,h6_b10,h7_b10,h8_b10,h= 9_b10,h10_b10,h11_b10,h12_b10,h13_b10,h14_b10,h15_b10,h16_b10,h17_b10,h18_b= 10,h19_b10,h20_b10,h21_b10,h22_b10,h23_b10,h0_b15,h1_b15,h2_b15,h3_b15,h4_b= 15,h5_b15,h6_b15,h7_b15,h8_b15,h9_b15,h10_b15,h11_b15,h12_b15,h13_b15,h14_b= 15,h15_b15,h16_b15,h17_b15,h18_b15,h19_b15,h20_b15,h21_b15,h22_b15,h23_b15,= h0_b20,h1_b20,h2_b20,h3_b20,h4_b20,h5_b20,h6_b20,h7_b20,h8_b20,h9_b20,h10_b= 20,h11_b20,h12_b20,h13_b20,h14_b20,h15_b20,h16_b20,h17_b20,h18_b20,h19_b20,= h20_b20,h21_b20,h22_b20,h23_b20,h0_b25,h1_b25,h2_b25,h3_b25,h4_b25,h5_b25,h= 6_b25,h7_b25,h8_b25,h9_b25,h10_b25,h11_b25,h12_b25,h13_b25,h14_b25,h15_b25,= h16_b25,h17_b25,h18_b25,h19_b25,h20_b25,h21_b25,h22_b25,h23_b25,h0_b30,h1_b= 30,h2_b30,h3_b30,h4_b30,h5_b30,h6_b30,h7_b30,h8_b30,h9_b30,h10_b30,h11_b30,= h12_b30,h13_b30,h14_b30,h15_b30,h16_b30,h17_b30,h18_b30,h19_b30,h20_b30,h21= _b30,h22_b30,h23_b30,h0_b35,h1_b35,h2_b35,h3_b35,h4_b35,h5_b35,h6_b35,h7_b3= 5,h8_b35,h9_b35,h10_b35,h11_b35,h12_b35,h13_b35,h14_b35,h15_b35,h16_b35,h17= _b35,h18_b35,h19_b35,h20_b35,h21_b35,h22_b35,h23_b35,h0_b40,h1_b40,h2_b40,h= 3_b40,h4_b40,h5_b40,h6_b40,h7_b40,h8_b40,h9_b40,h10_b40,h11_b40,h12_b40,h13= _b40,h14_b40,h15_b40,h16_b40,h17_b40,h18_b40,h19_b40,h20_b40,h21_b40,h22_b4= 0,h23_b40,h0_b45,h1_b45,h2_b45,h3_b45,h4_b45,h5_b45,h6_b45,h7_b45,h8_b45,h9= _b45,h10_b45,h11_b45,h12_b45,h13_b45,h14_b45,h15_b45,h16_b45,h17_b45,h18_b4= 5,h19_b45,h20_b45,h21_b45,h22_b45,h23_b45,h0_b50,h1_b50,h2_b50,h3_b50,h4_b5= 0,h5_b50,h6_b50,h7_b50,h8_b50,h9_b50,h10_b50,h11_b50,h12_b50,h13_b50,h14_b5= 0,h15_b50,h16_b50,h17_b50,h18_b50,h19_b50,h20_b50,h21_b50,h22_b50,h23_b50 >> from adserver.inventory_snappy where rowkey in >> ('20140101_US____','20140102_US____','20140103_US____','20140104_US____'= ,'20140105_US____','20140106_US____','20140107_US____','20140108_US____','2= 0140109_US____','20140110_US____','20140111_US____','20140112_US____','2014= 0113_US____','20140114_US____','20140115_US____','20140116_US____','2014011= 7_US____','20140118_US____','20140119_US____','20140120_US____','20140121_U= S____','20140122_US____','20140123_US____','20140124_US____','20140125_US__= __','20140126_US____','20140127_US____','20140128_US____','20140129_US____'= ,'20140130_US____','20140131_US____')* >> >> The explain result is: >> *CLIENT PARALLEL 23-WAY SKIP SCAN ON 31 KEYS OVER >> ADSERVER.INVENTORY_SNAPPY ['20140101_US____'] - ['20140131_US____']* >> >> Thanks, >> Amit. >> >> >> On Sun, Apr 6, 2014 at 7:16 PM, James Taylor wro= te: >> >>> Hi Amit, >>> Would it be possible to post your CREATE TABLE statement, the EXPLAIN o= n >>> the query, and the query itself so we can better diagnose any issues? >>> >>> From your description above, you're executing the right query - it's >>> always better to query for multiple rows versus executing a single quer= y >>> for each row (this is typically the case for any query engine). >>> >>> Batching is implemented in 2.2.3 and above, but not for performance >>> reasons. Some products for which we have integration rely on a JDBC dri= ver >>> to support batching, so that's why it was added. Batching does not buy = you >>> anything in Phoenix, because it's an embedded JDBC driver. >>> >>> Thanks, >>> James >>> >>> >>> On Sun, Apr 6, 2014 at 9:08 AM, Amit Sela wrote: >>> >>>> Hi all, >>>> I'm running with Phoenix 2.2.2 which (AFAIK) does not support batch >>>> queries. >>>> I want to query some (not all) of the columns in the table (all from >>>> the same family), for multiple rowkeys. >>>> Normally I would execute a batch queries of: >>>> *select c1,c2... from table where rowkey=3Drow1* >>>> *select c1,c2... from table where rowkey=3Drow2* >>>> *...* >>>> Since batch is not supported, I do the following: >>>> *select c1,c2... from table where rowkey in (row1,row2...)* >>>> >>>> This ends up being slower than executing a batch of gets from the HBas= e >>>> API... >>>> >>>> Is there a better way for me to use Phoenix in this case ? >>>> >>>> Thanks, >>>> Amit. >>>> >>>> >>>> >>> >> > --001a113494204d40ba04f666267e Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

HBase 0.94.12 (+Hadoop 1.0.4)

On Apr 6, 2014 11:35 PM, "James Taylor"= ; <jamestaylor@apache.org&= gt; wrote:
What version of HBase are you on?


On Sun, Apr 6, 2014 at 1:22 PM, = Amit Sela <amits@infolinks.com> wrote:
The CREATE TABLE statement = is a bit long but it looks something like this:
CREATE TABLE SCHEMA.= TABLE (rowkey varchar not null, F1.C1 unsigned_long,F1.C2 unsigned_long,...= ,F1.Cn unsigned_long,F2.C1 unsigned_long,F2.C2 unsigned_long,...,F2.Cm unsi= gned_long=A0CONSTRAINT pk PRIMARY KEY (rowkey)) COMPRESSION=3D'SNAPPY&#= 39;,VERSIONS=3D1,TTL=3D15811200
Where n=3D24m and n+m ~ 10000. (daily buckets in one family and hourly= in the other).
The explain select is:
explain selec= t h0_b5,h1_b5,h2_b5,h3_b5,h4_b5,h5_b5,h6_b5,h7_b5,h8_b5,h9_b5,h10_b5,h11_b5= ,h12_b5,h13_b5,h14_b5,h15_b5,h16_b5,h17_b5,h18_b5,h19_b5,h20_b5,h21_b5,h22_= b5,h23_b5,h0_b10,h1_b10,h2_b10,h3_b10,h4_b10,h5_b10,h6_b10,h7_b10,h8_b10,h9= _b10,h10_b10,h11_b10,h12_b10,h13_b10,h14_b10,h15_b10,h16_b10,h17_b10,h18_b1= 0,h19_b10,h20_b10,h21_b10,h22_b10,h23_b10,h0_b15,h1_b15,h2_b15,h3_b15,h4_b1= 5,h5_b15,h6_b15,h7_b15,h8_b15,h9_b15,h10_b15,h11_b15,h12_b15,h13_b15,h14_b1= 5,h15_b15,h16_b15,h17_b15,h18_b15,h19_b15,h20_b15,h21_b15,h22_b15,h23_b15,h= 0_b20,h1_b20,h2_b20,h3_b20,h4_b20,h5_b20,h6_b20,h7_b20,h8_b20,h9_b20,h10_b2= 0,h11_b20,h12_b20,h13_b20,h14_b20,h15_b20,h16_b20,h17_b20,h18_b20,h19_b20,h= 20_b20,h21_b20,h22_b20,h23_b20,h0_b25,h1_b25,h2_b25,h3_b25,h4_b25,h5_b25,h6= _b25,h7_b25,h8_b25,h9_b25,h10_b25,h11_b25,h12_b25,h13_b25,h14_b25,h15_b25,h= 16_b25,h17_b25,h18_b25,h19_b25,h20_b25,h21_b25,h22_b25,h23_b25,h0_b30,h1_b3= 0,h2_b30,h3_b30,h4_b30,h5_b30,h6_b30,h7_b30,h8_b30,h9_b30,h10_b30,h11_b30,h= 12_b30,h13_b30,h14_b30,h15_b30,h16_b30,h17_b30,h18_b30,h19_b30,h20_b30,h21_= b30,h22_b30,h23_b30,h0_b35,h1_b35,h2_b35,h3_b35,h4_b35,h5_b35,h6_b35,h7_b35= ,h8_b35,h9_b35,h10_b35,h11_b35,h12_b35,h13_b35,h14_b35,h15_b35,h16_b35,h17_= b35,h18_b35,h19_b35,h20_b35,h21_b35,h22_b35,h23_b35,h0_b40,h1_b40,h2_b40,h3= _b40,h4_b40,h5_b40,h6_b40,h7_b40,h8_b40,h9_b40,h10_b40,h11_b40,h12_b40,h13_= b40,h14_b40,h15_b40,h16_b40,h17_b40,h18_b40,h19_b40,h20_b40,h21_b40,h22_b40= ,h23_b40,h0_b45,h1_b45,h2_b45,h3_b45,h4_b45,h5_b45,h6_b45,h7_b45,h8_b45,h9_= b45,h10_b45,h11_b45,h12_b45,h13_b45,h14_b45,h15_b45,h16_b45,h17_b45,h18_b45= ,h19_b45,h20_b45,h21_b45,h22_b45,h23_b45,h0_b50,h1_b50,h2_b50,h3_b50,h4_b50= ,h5_b50,h6_b50,h7_b50,h8_b50,h9_b50,h10_b50,h11_b50,h12_b50,h13_b50,h14_b50= ,h15_b50,h16_b50,h17_b50,h18_b50,h19_b50,h20_b50,h21_b50,h22_b50,h23_b50 fr= om adserver.inventory_snappy where rowkey in ('20140101_US____',= 9;20140102_US____','20140103_US____','20140104_US____',= '20140105_US____','20140106_US____','20140107_US____= 9;,'20140108_US____','20140109_US____','20140110_US____= ','20140111_US____','20140112_US____','20140113_US_= ___','20140114_US____','20140115_US____','20140116_= US____','20140117_US____','20140118_US____','201401= 19_US____','20140120_US____','20140121_US____','201= 40122_US____','20140123_US____','20140124_US____','= 20140125_US____','20140126_US____','20140127_US____',&#= 39;20140128_US____','20140129_US____','20140130_US____'= ,'20140131_US____')

The explain result is:
CLIENT= PARALLEL 23-WAY SKIP SCAN ON 31 KEYS OVER ADSERVER.INVENTORY_SNAPPY ['= 20140101_US____'] - ['20140131_US____']

Thanks,
Amit.


On Sun, Apr 6, 2014 at 7:16 PM, Ja= mes Taylor <jamestaylor@apache.org> wrote:
Hi Amit,
Would it be possible to post= your CREATE TABLE statement, the EXPLAIN on the query, and the query itsel= f so we can better diagnose any issues?

From your description above, you're executing the r= ight query - it's always better to query for multiple rows versus execu= ting a single query for each row (this is typically the case for any query = engine).

Batching is implemented in 2.2.3 and above, but not for= performance reasons. Some products for which we have integration rely on a= JDBC driver to support batching, so that's why it was added. Batching = does not buy you anything in Phoenix, because it's an embedded JDBC dri= ver.

Thanks,
James


On Sun, Apr 6, 2014 at = 9:08 AM, Amit Sela <amits@infolinks.com> wrote:
Hi all,=A0
I'm running with Phoen= ix 2.2.2 which (AFAIK) does not support batch queries.
I want to query some (not all) of the columns in the table (all from the sa= me family), for multiple rowkeys.
Normally I would execute a batch queries of:
se= lect c1,c2... from table where rowkey=3Drow1
select c1,c2.= .. from table where rowkey=3Drow2
...
Since batch is not supported, I do the following:
select = c1,c2... from table where rowkey in (row1,row2...)
This ends up being slower than executing a batch of gets fr= om the HBase API...

Is there a better way for me to use Phoenix in this cas= e ?

Thanks,
Amit.





--001a113494204d40ba04f666267e--