phoenix-user mailing list archives

From Jonathan Leech <jonat...@gmail.com>
Subject Re: Phoenix has slow response times compared to HBase
Date Sat, 03 Sep 2016 05:57:22 GMT
The direct HBase test probably created 500 simultaneous clients, whereas Phoenix may have made fewer
simultaneous calls, with a little waiting, and so hit a sweeter spot for load on your configuration.
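
A minimal sketch of that effect, assuming nothing about HBase or Phoenix internals: 500 simulated
requests are submitted to a fixed-size thread pool, and the code records how many were in flight at
once. The class name, method name, pool sizes and sleep time are all hypothetical; the sleep is only
a stand-in for a server round trip.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: no HBase or Phoenix calls are made here.
class BoundedConcurrencyDemo {

    // Runs `tasks` simulated requests on a pool of `poolSize` threads and
    // returns the highest number of requests that were in flight at once.
    static int peakInFlight(int tasks, int poolSize, long workMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(workMillis); // stand-in for one server round trip
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // Pool sizes are arbitrary: one smaller than the task count, one equal to it.
        System.out.println("peak in flight, pool of 128: " + peakInFlight(500, 128, 5));
        System.out.println("peak in flight, pool of 500: " + peakInFlight(500, 500, 5));
    }
}
```

With the smaller pool, the remaining requests wait in the executor's queue, so the simulated server
never sees more than 128 calls at a time.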

> On Sep 2, 2016, at 7:06 PM, Mujtaba Chohan <mchohan@salesforce.com> wrote:
> 
> Single user average: Phoenix 8ms, HBase 5ms
> 50 users average: Phoenix 35ms, HBase 40ms
> 500 users average: Phoenix 300-400ms, HBase 350-450ms
> 
> Few notes:
> 
> * We have yet to identify why Phoenix was showing a slight advantage with a high number of concurrent users from a single client.
> 
> * For the case with 500 concurrent users from a single client, the region server handler count and the Phoenix thread pool size were bumped to 500 to accommodate this level of concurrency.
> 
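A minimal sketch of the client-side half of that change, assuming the "Phoenix thread pool size"
mentioned above maps to the phoenix.query.threadPoolSize client property. The class and method
names are hypothetical. The region server handler count is a server-side setting
(hbase.regionserver.handler.count in hbase-site.xml) and is not covered here.

```java
import java.util.Properties;

// Hypothetical helper: builds client-side properties for a Phoenix JDBC connection.
class PhoenixClientProps {

    // Assumption: the thread pool size from the thread corresponds to this property.
    static final String THREAD_POOL_SIZE = "phoenix.query.threadPoolSize";

    static Properties forConcurrency(int threads) {
        Properties props = new Properties();
        props.setProperty(THREAD_POOL_SIZE, Integer.toString(threads));
        return props;
    }

    public static void main(String[] args) {
        Properties props = forConcurrency(500);
        System.out.println(THREAD_POOL_SIZE + "=" + props.getProperty(THREAD_POOL_SIZE));
    }
}
```

The returned Properties could then be passed as the second argument to
DriverManager.getConnection together with a Phoenix JDBC URL.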
>> On Friday, September 2, 2016, James Taylor <jamestaylor@apache.org> wrote:
>> Thanks, Mujtaba. What's the average query time for HBase and Phoenix for the 1/50/500 simultaneous user scenarios?
>> 
>> Edu - make sure to set the UPDATE_CACHE_FREQUENCY property on the table (as Mujtaba showed in his ALTER TABLE statement - you can do this in the CREATE TABLE statement as well).
>> 
>> Thanks,
>> James
>> 
>>> On Fri, Sep 2, 2016 at 5:40 PM, Mujtaba Chohan <mujtaba@apache.org> wrote:
>>> Here is the graph that I get simulating 1, 50 and 500 concurrent users from a single client. Query time for Phoenix is highly comparable with direct HBase gets.
>>> 
>>> See the chart below with query time (ms) for random point gets over a large table that will not fit in the HBase block cache. Queries/gets were executed 1000 times for each user.
>>> 
>>> <image.png>
>>> Source code to execute gets/Phoenix queries simulating multiple users is at:
>>>  directhbasemt.java
>>>  directphoenixmt.java
>>> 
>>> Table DDL
>>> create table testuuid (k varchar not null primary key, a varchar, b varchar, c varchar, d varchar, e varchar, f varchar);
>>> 
>>> alter table testuuid set "UPDATE_CACHE_FREQUENCY"=150000; // this restricts how often the server is checked for metadata updates, to improve performance
>>> 
>>> Table was filled with 68M rows.
>>> Phoenix 4.8/HBase 0.98.17 running on single machine.
>>> 
>>> //mujtaba
>>> 
>>> 
>>>> On Thu, Sep 1, 2016 at 3:34 AM, Narros, Eduardo (ELS-LON) <e.narros@elsevier.com> wrote:
>>>> Hi Mujtaba,
>>>> 
>>>> 
>>>> See the answers inline below:
>>>> 
>>>> 
>>>> * How are you running Phoenix queries? We are using Apache JMeter and the JDBC sampler.
>>>> * Were the concurrent Phoenix queries using the same JVM? Yes.
>>>> * Was the JVM restarted after changing number of concurrent users? Yes.
>>>> * Is the response time plotted when query is executed for the first time or second or average of both? Average. We see response times varying significantly even via sqlline, i.e. the same query run 11 times sequentially takes anything between 17ms and around 489ms with no other load on the server.
>>>> * Is the UUID filtered on randomly distributed? Yes. 
>>>> * Does UUID match a single row? Yes.
>>>> * It seems that even non-concurrent Phoenix query which filters on UUID takes 500ms in your environment. Can you try the same query in Sqlline a few times and see how much time it takes for each run? We ran the same query 11 times via sqlline and these were the response times:
>>>> 1 row selected (0.489 seconds)
>>>> 1 row selected (0.279 seconds)
>>>> 1 row selected (0.227 seconds)
>>>> 1 row selected (0.22 seconds)
>>>> 1 row selected (0.17 seconds)
>>>> 1 row selected (0.152 seconds)
>>>> 1 row selected (0.129 seconds)
>>>> 1 row selected (0.17 seconds)
>>>> 1 row selected (0.153 seconds)
>>>> 1 row selected (0.259 seconds)
>>>> 1 row selected (0.102 seconds)
>>>> 
>>>> * What is the explain plan for your Phoenix query? CLIENT 1-CHUNK PARALLEL 1-WAY ROUND ROBIN POINT LOOKUP ON 1 KEY OVER schema.DOCUMENTS
>>>> * If it's slow in Sqlline as well then try truncating your SYSTEM.STATS table and reconnect Sqlline and execute the query again. I think the issue is that the response times vary a lot: with 600 concurrent users the same query can take anything between 2ms and 10s.
>>>> * Can you share your table schema and how you ran Phoenix queries and your HBase equivalent code? It is a simple table with 15 columns; the primary key is the uuid, which is of type VARCHAR(36). The HBase equivalent code is:
>>>>  HTableInterface hTable = pool.getTable("schema.DOCUMENTS");
>>>> 
>>>> Get get = new Get(toBytes(saltPrefix + uuid));
>>>> 
>>>> Result result = hTable.get(get);
>>>> 
>>>> * Any phoenix tuning defaults that you changed? No.
>>>> 
>>>> Kind Regards,
>>>> 
>>>> 
>>>> Edu
>>>> 
>>>> 
>>>> 
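To put a number on the spread of the eleven sqlline timings quoted above, here is a small sketch
that summarises them. The values are copied from the thread and converted to milliseconds; the
class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper summarising the sqlline response times quoted in the thread.
class SqllineTimings {

    // The eleven reported response times, converted from seconds to milliseconds.
    static final int[] MILLIS = {489, 279, 227, 220, 170, 152, 129, 170, 153, 259, 102};

    static int min(int[] v) {
        return Arrays.stream(v).min().getAsInt();
    }

    static int max(int[] v) {
        return Arrays.stream(v).max().getAsInt();
    }

    static double mean(int[] v) {
        return Arrays.stream(v).average().getAsDouble();
    }

    // Middle element of the sorted copy; for an even count this is the upper middle.
    static int median(int[] v) {
        int[] sorted = v.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        System.out.printf("min=%dms median=%dms mean=%.1fms max=%dms%n",
                min(MILLIS), median(MILLIS), mean(MILLIS), max(MILLIS));
    }
}
```

The first run is the slowest at 489ms, the median is 170ms, and the fastest run is 102ms, which
is consistent with a warm-up effect on the first queries plus ordinary run-to-run variation.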
>>>>> On Wed, Aug 31, 2016 at 10:40 AM, Mujtaba Chohan <mujtaba@apache.org> wrote:
>>>>> Something seems inherently wrong in these test results.
>>>>> 
>>>>> * How are you running Phoenix queries? Were the concurrent Phoenix queries using the same JVM? Was the JVM restarted after changing number of concurrent users?
>>>>> * Is the response time plotted when query is executed for the first time or second or average of both?
>>>>> * Is the UUID filtered on randomly distributed? Does UUID match a single row?
>>>>> * It seems that even non-concurrent Phoenix query which filters on UUID takes 500ms in your environment. Can you try the same query in Sqlline a few times and see how much time it takes for each run?
>>>>> * If it's slow in Sqlline as well then try truncating your SYSTEM.STATS table and reconnect Sqlline and execute the query again.
>>>>> * Can you share your table schema and how you ran Phoenix queries and your HBase equivalent code?
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Wed, Aug 31, 2016 at 5:42 AM, Narros, Eduardo (ELS-LON) <e.narros@elsevier.com> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> 
>>>>>> We are exploring starting to use Phoenix and have done some load tests to see whether Phoenix would scale. We have noted that, compared to HBase, Phoenix response times have a much higher average as the number of concurrent users increases. We are trying to understand whether this is expected or whether there is something we are missing.
>>>>>> 
>>>>>> 
>>>>>> This is the test we have performed:
>>>>>> 
>>>>>> Create table (20 columns) and load it with 400 million records indexed via a column called 'uuid'.
>>>>>> Perform the following queries using 10, 20, 100, 200, 400 and 600 users per second; each user performs each query twice:
>>>>>> Phoenix: select * from schema.DOCUMENTS where uuid = ?
>>>>>> Phoenix: select /*+ SERIAL SMALL */ * from schema.DOCUMENTS where uuid = ?
>>>>>> HBase equivalent to: select * from schema.DOCUMENTS where uuid = ?
>>>>>> The results are attached and they show that Phoenix response times are at least an order of magnitude above those of HBase.
>>>>>> The tests were run from the Master node of a CDH 5.7.2 cluster with Phoenix 4.7.0.
>>>>>> 
>>>>>> Are these test results expected?
>>>>>> 
>>>>>> Kind Regards,
>>>>>> 
>>>>>> Edu
>>>>>> 
>>>>>> Elsevier Limited. Registered Office: The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, United Kingdom, Registration No. 1982084, Registered in England and Wales.
>>>> 
>>>> 
