phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: querying time for Apache Phoenix
Date Tue, 26 Jul 2016 15:39:46 GMT
Hi Irina,

I'd recommend trying the following:
- set the UPDATE_CACHE_FREQUENCY=60000 property when you create your table
and index to prevent extra RPCs at query time.
- if you're querying for a single row, use the serial and small hints like
this: /*+ SERIAL SMALL */
- though not strictly necessary, try using the index hint like this:
/*+ INDEX(my_table my_index) */
- use PreparedStatement to prevent extra parsing
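
As a rough sketch of how the four suggestions above might fit together over
JDBC (the same calls work from Scala): the column types, the index name
documents_profile_idx, and the class and method names are assumptions made
for illustration. Only the table name, the uuid and profile_id columns, the
property value, and the hints come from this thread.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PhoenixLookupSketch {
    // Assumed schema: VARCHAR column types are a guess, not taken from the thread.
    public static final String CREATE_TABLE =
        "CREATE TABLE IF NOT EXISTS documents ("
        + "uuid VARCHAR PRIMARY KEY, profile_id VARCHAR) "
        + "UPDATE_CACHE_FREQUENCY=60000";

    // Hypothetical index name; the same property is set on the index.
    public static final String CREATE_INDEX =
        "CREATE INDEX IF NOT EXISTS documents_profile_idx "
        + "ON documents (profile_id) UPDATE_CACHE_FREQUENCY=60000";

    // Single-row lookup by primary key with the serial and small hints.
    public static final String POINT_LOOKUP =
        "SELECT /*+ SERIAL SMALL */ profile_id FROM documents WHERE uuid = ?";

    // Lookup on the indexed column with an explicit index hint.
    public static final String INDEX_LOOKUP =
        "SELECT /*+ INDEX(documents documents_profile_idx) */ uuid "
        + "FROM documents WHERE profile_id = ?";

    // Takes an already prepared statement so that callers prepare POINT_LOOKUP
    // once and reuse it, instead of re-parsing the SQL on every query.
    public static String lookupProfileId(PreparedStatement ps, String uuid)
            throws SQLException {
        ps.setString(1, uuid);
        try (ResultSet rs = ps.executeQuery()) {
            return rs.next() ? rs.getString(1) : null;
        }
    }

    public static void main(String[] args) {
        // Prints the statements only; no cluster connection is attempted here.
        System.out.println(CREATE_TABLE);
        System.out.println(CREATE_INDEX);
        System.out.println(POINT_LOOKUP);
        System.out.println(INDEX_LOOKUP);
    }
}
```

To use it, you would obtain a java.sql.Connection from the Phoenix JDBC
driver, call conn.prepareStatement(PhoenixLookupSketch.POINT_LOOKUP) once,
and pass that statement to lookupProfileId for each uuid you look up.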

Have you also tried other types of queries, such as aggregations, topN, range
scans, and sorts?

In the next release, we'll work on having better default values for these
as well as driving them in a cost-based manner.

Thanks,
James



On Tue, Jul 26, 2016 at 8:07 AM, Placinta, Irina (ELS) <
i.placinta@elsevier.com> wrote:

> Hi,
>
>
> We are interested in the query performance of Phoenix on small to large
> datasets. We have Apache Phoenix installed on an EMR cluster with 5 instances.
>
>
> The query times we get are slow compared to the equivalent query in
> HBase, for example:
>
>
> Table Documents with primary key UUID and index on profile_id
>
>
>
> Query (400k rows dataset)                           | Apache Phoenix | HBase
> select * from documents where uuid = 10-a           | 0.25 sec       | 0.02 sec
> select profile_id from documents where uuid = 10-a  | 0.20 sec       | 0.02 sec
>
> HBase seems about 10x faster than Phoenix. Is there some tuning we can do to
> achieve better results?
> We are querying the DB programmatically (Scala) and also using the
> sqlline client.
>
> Thank you!
>
