phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Amit Shah <amits...@gmail.com>
Subject Speeding Up Group By Queries
Date Fri, 25 Mar 2016 13:56:23 GMT
Hi,

I am trying to evaluate apache hbase (version 1.0.0) and phoenix (version
4.6) deployed through cloudera for our OLAP workfload. I have a table that
has 10 mil rows. I try to execute the below roll up query and it takes
around 2 mins to return 1,850 rows.

SELECT SUM(UNIT_CNT_SOLD), SUM(TOTAL_SALES) FROM TRANSACTIONS GROUP BY
T_COUNTRY;

I tried applying the "joining with indices" example given on the website
<https://phoenix.apache.org/joins.html> on the TRANSACTIONS table by
creating an index on the grouped by column as below but that doesn't help.

CREATE INDEX TRANSACTIONS_COUNTRY_INDEX ON TRANSACTIONS (T_COUNTRY) INCLUDE
(UNIT_CNT_SOLD, TOTAL_SALES);

This index is not getting used when the query is executed. The query plan
is as below

+------------------------------------------+
|                   PLAN                   |
+------------------------------------------+
| CLIENT 31-CHUNK PARALLEL 31-WAY FULL SCAN OVER TRANSACTIONS |
|     SERVER AGGREGATE INTO DISTINCT ROWS BY [T_COUNTRY] |
| CLIENT MERGE SORT                        |
+------------------------------------------+

Theoretically can secondary indexes help improve the performance of group
by queries?

Any suggestions on what are different options in phoenix I could try out to
speed up GROUP BY queries?

Thanks,
Amit.

Mime
View raw message