phoenix-user mailing list archives

From Abe Weinograd <...@flonet.com>
Subject Re: JDBC result iteration is slow
Date Mon, 10 Mar 2014 16:49:21 GMT
Hi James,

Thanks.  Here is the info you requested.  I assumed it was a client-side
problem because a COUNT(1) over the whole table takes < 2 sec once the
rows are in the cache (the first run is usually a bit longer).  My table
has about 3.7M rows in it.  A SELECT * (listing most columns in the table)
is what takes longer, with the CPU spiking on the client while the result
set is being iterated over.

Thanks for your help.
Abe

- HBase version: 0.94.15 (CDH 4.6)
- Phoenix version: 2.2.3 (using tarball from
- size of cluster: 1 Master 4 RS (each 15GB of RAM, 4 Cores)
- setting for JVM max heap size: 4GiB
- create table statement: attached
- query: attached
- explain plan:
CLIENT PARALLEL 48-WAY FULL SCAN OVER MY_TABLE
    SERVER FILTER BY PageFilter 100000
CLIENT 100000 ROW LIMIT

- number of rows in table: 3.7 million (just testing with this; it will
be much larger over time)
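
To narrow down where the time goes, the execute step and the iteration step can be timed separately from plain JDBC code. The following is a minimal sketch, not a confirmed fix: the class name IterationTiming and its helper methods are made up for illustration, the JDBC URL and the query are passed in as arguments (the actual query is only in the attachment), and it assumes the Phoenix client jar is on the classpath. Passing hbase.client.scanner.caching through the connection properties mirrors what was tried earlier in the thread; whether this Phoenix version honors it there is not confirmed here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class IterationTiming {

    // Build connection properties carrying the scanner caching value
    // (the same property that was set in the JDBC connection options).
    static Properties scannerProps(int caching) {
        Properties props = new Properties();
        props.setProperty("hbase.client.scanner.caching", Integer.toString(caching));
        return props;
    }

    // Read every column of every row and return the number of rows seen.
    static long drain(ResultSet rs) throws SQLException {
        int cols = rs.getMetaData().getColumnCount();
        long rows = 0;
        while (rs.next()) {
            for (int i = 1; i <= cols; i++) {
                rs.getObject(i);
            }
            rows++;
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: IterationTiming <jdbc-url> <sql>");
            return;
        }
        // args[0] is a Phoenix URL of the form jdbc:phoenix:<zookeeper quorum>
        try (Connection conn = DriverManager.getConnection(args[0], scannerProps(10000));
             Statement stmt = conn.createStatement()) {
            long t0 = System.nanoTime();
            try (ResultSet rs = stmt.executeQuery(args[1])) {
                long t1 = System.nanoTime();
                long rows = drain(rs);
                long t2 = System.nanoTime();
                System.out.printf("execute: %d ms, iterate %d rows: %d ms%n",
                        (t1 - t0) / 1_000_000, rows, (t2 - t1) / 1_000_000);
            }
        }
    }
}
```

If the iterate figure dominates even for a 1,000-row LIMIT, that points at the client side rather than the scan itself. Note that a driver may defer the actual scan until the first next() call, in which case some server time would show up in the iterate figure as well.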


On Mon, Mar 10, 2014 at 11:44 AM, James Taylor <jamestaylor@apache.org> wrote:

> Hi Abe,
> There's likely something wrong with your installation, as this is not
> expected behavior. Please let us know the following:
> - HBase version
> - Phoenix version
> - size of cluster
> - setting for JVM max heap size
> - create table statement
> - query
> - explain plan
> - number of rows in table
> Thanks,
> James
>
>
> On Monday, March 10, 2014, Abe Weinograd <abe@flonet.com> wrote:
>
>> I spent a little more time with this and am still unable to tune the
>> client properly.  I am testing using sqlline, Squirrel and just using the
>> JDBC driver in code.  I tried setting the hbase scanner caching in the JDBC
>> connection, in addition to putting it in the hbase-site.xml in the same dir
>> as the jar for sqlline.  I think my client is bottlenecked, partly because
>> the CPU spikes and it takes ~30 secs to retrieve 1,000 rows.
>>
>> I expect to retrieve a lot more than this in our use cases.  Is this a
>> tuning issue on my end or is this expected behavior?
>>
>> Thanks,
>> Abe
>>
>>
>> On Fri, Mar 7, 2014 at 10:19 AM, Abe Weinograd <abe@flonet.com> wrote:
>>
>>> Trying to pull around 100k rows through the JDBC driver.  I
>>> set hbase.client.scanner.caching to 10000 in the JDBC connection options.
>>>  Additionally, it's very slow with even 1,000 rows (about 30 seconds to
>>> iterate over them).
>>>
>>> I assume this is a client side issue, but not sure what else I can tweak.
>>>
>>> Thanks,
>>> Abe
>>>
>>
>>
