phoenix-user mailing list archives

From Jonathan Leech <jonat...@gmail.com>
Subject Re: Phoenix Slow Problem
Date Tue, 01 Nov 2016 03:34:38 GMT
Make sure the number of regions is at least the number of physical disks on the cluster; if not, split or salt the table. Do the math based on row size, disk throughput, number of regions, etc. against your target performance, and add servers or disks if necessary. Also look at HBase cache settings, JVM heap sizes, GC settings, etc. Depending on the data, compression can improve performance. Snappy typically compresses less than gzip, but at lower CPU cost; gzip can reach pretty high ratios, but its writes cost more than its reads, so major compactions can get backed up. For typical data, either will likely increase read throughput. Depending on how often rows are updated, removed, or added, change the default HBase major compaction interval, or force a major compaction after large updates. Also, unless counting rows is your actual use case, don't worry about how long it takes to count them; base expectations on your expected use cases. With the overhead of Phoenix query parsing, threading in the client, etc., you probably won't do much better than about a second for aggregates over 1+ million rows.
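As a rough sketch of the salting and compression suggestions above (the table and column names here are made up, and the bucket count is only an example; pick one close to your number of region servers or disks):

```sql
-- Hypothetical table, for illustration only.
-- SALT_BUCKETS pre-splits the table into 8 regions so full scans and
-- aggregates can parallelize across servers/disks instead of running 1-way.
-- COMPRESSION sets the codec on the underlying HBase column family.
CREATE TABLE MARKETDATA (
    SYMBOL VARCHAR NOT NULL,
    TS     TIMESTAMP NOT NULL,
    PRICE  DECIMAL,
    CONSTRAINT PK PRIMARY KEY (SYMBOL, TS)
) SALT_BUCKETS = 8, COMPRESSION = 'SNAPPY';
```

To sanity-check the math against the numbers in the thread: the explain plan reports roughly 629 MB scanned PARALLEL 1-WAY, and at an assumed typical ~100 MB/s sequential disk rate that is on the order of 6 seconds, in the same ballpark as the 8 seconds observed; spread over 8 regions on separate disks, the same scan could plausibly land under a second.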

> On Oct 31, 2016, at 5:19 PM, Fawaz Enaya <m.fawaz.enaya@gmail.com> wrote:
> 
> Thanks for your answer, but why does it give 1-way parallel and cannot be more?
> 
>> On Sunday, 30 October 2016, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
>> If you create a secondary index in Phoenix on single or selected columns of the table, that index (which will be added to HBase) will be used to return data. For example, below MARKETDATAHBASE_IDX1 is an index on table MARKETDATAHBASE and is used by the query:
>> 
>> 
>>  0: jdbc:phoenix:rhes564:2181> EXPLAIN select count(1) from MARKETDATAHBASE;
>> +--------------------------------------------------------------------+
>> |                                PLAN                                |
>> +--------------------------------------------------------------------+
>> | CLIENT 1-CHUNK PARALLEL 1-WAY FULL SCAN OVER MARKETDATAHBASE_IDX1  |
>> |     SERVER FILTER BY FIRST KEY ONLY                                |
>> |     SERVER AGGREGATE INTO SINGLE ROW                               |
>> +--------------------------------------------------------------------+
>> 
>> HTH
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>  
>> http://talebzadehmich.wordpress.com
>> 
>> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>>  
>> 
>>> On 30 October 2016 at 11:42, Fawaz Enaya <m.fawaz.enaya@gmail.com> wrote:
>>> Hi All in this great project,
>>> 
>>> 
>>> I have an HBase cluster of four nodes and use Phoenix to access HBase, but I do not know why it is so slow: executing SELECT count(*) on a table containing 5 million records takes 8 seconds.
>>> Below is the explain plan for my select statement:
>>> 
>>> | CLIENT 6-CHUNK 9531695 ROWS 629145639 BYTES PARALLEL 1-WAY FULL SCAN OVER TABLE |
>>> |     SERVER FILTER BY FIRST KEY ONLY                                             |
>>> |     SERVER AGGREGATE INTO SINGLE ROW                                            |
>>> 
>>> Can anyone help?
>>> 
>>> Many Thanks
>>> --
>>> Thanks & regards,
>>> 
>> 
> 
> 
> -- 
> Thanks & regards,
> 
> 
