phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Slow metadata update queries during upsert
Date Mon, 28 Mar 2016 14:58:30 GMT
Hi Ankur,
Try setting UPDATE_CACHE_FREQUENCY on your table (4.7.0 or above) to
prevent the client from checking with the server on every statement as to
whether your table metadata is up to date. See here[1] for more
information. You can issue a command like the following, which will hold
on to your metadata on the client for 15 minutes before checking back with
the server for metadata or statistics updates on your table:

ALTER TABLE my_table SET UPDATE_CACHE_FREQUENCY=900000
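
For a completely static schema, the property also accepts keyword values;
a sketch, assuming the ALWAYS/NEVER keywords described in the Phoenix
grammar (my_table is illustrative):

```sql
-- NEVER: never re-check the server for metadata changes on this table
-- (only safe if the schema will not change at runtime).
ALTER TABLE my_table SET UPDATE_CACHE_FREQUENCY = NEVER;

-- ALWAYS (the default): check with the server on every statement.
ALTER TABLE my_table SET UPDATE_CACHE_FREQUENCY = ALWAYS;
```

The same property can also be supplied in a CREATE TABLE statement so a
table picks it up from the start.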

Thanks,
James

[1] https://phoenix.apache.org/#Altering

On Mon, Mar 28, 2016 at 12:33 AM, Ankur Jain <ajain@quadanalytix.com> wrote:

> Hi
>
> We are using Phoenix as our transactional data store (though we are not
> yet using its new transaction feature). Earlier we had our own custom
> query layer built on top of HBase that we are trying to replace.
>
> During tests we found that inserts are very slow compared to regular
> HBase puts. There is always 7-8 ms of additional time associated with
> each upsert query. This time is spent mostly in the validate phase, where
> the cache is updated with the latest table metadata. Is there a way to
> avoid refreshing this cache every time?
>
> Out of the ~15 ms a typical upsert query takes in our case, 11 ms go to
> just updating the metadata cache for that table. The remaining 3 ms are
> spent in the actual HBase batch call and 1 ms in all other Phoenix
> processing.
>
> We have two use cases:
> 1. Our table metadata is always static, and we know we are not going to
> add any new columns, at least at runtime.
>     We would like to avoid the cost of this metadata update so that our
> inserts are faster. Is this possible with the existing code base?
>
> 2. We add columns to our tables on the fly.
>     Adding new columns on the fly is generally a rare event. Is there a
> control where we can explicitly invalidate the cache when a column is
> added, in case we are caching metadata indefinitely?
>
> Is the metadata cache at the connection level or at the global level? We
> ask because we are always creating new connections.
>
> I have also observed that CsvToKeyValueMapper is fast because it avoids
> the connection.commit() step and does all the validations upfront,
> thereby avoiding the update-cache step during commit.
>
> Just to add another analysis where Phoenix inserts are much slower than
> native HBase puts: https://issues.apache.org/jira/browse/YARN-2928. The
> attached TimelineServiceStoragePerformanceTestSummaryYARN-2928.pdf states
> this clearly. I believe this might be related.
>
> Thanks,
> Ankur Jain
>
